[Info-vax] Intel previews new Itanium "Poulson" processor
John Reagan
johnrreagan at earthlink.net
Thu Feb 24 18:38:39 EST 2011
"John Wallace" <johnwallace4 at yahoo.co.uk> wrote in message
news:82372da4-1292-47c1-957a-955669c2cb9b at u6g2000vbh.googlegroups.com...
On Feb 24, 8:34 pm, JF Mezei <jfmezei.spam... at vaxination.ca> wrote:
>>
>> Also, just so I understand correctly. If they have 12 instead of 6
>> execution units, is it correct to state that a program compiled for
>> Tukwila, will still use only 6 units on Poulson, leaving the remaining 6
>> idle?
>"a program compiled for Tukwila, will still use only 6 units on
>Poulson, leaving the remaining 6 idle?"
>That was the general EPIC principle. The compiler must see the
>parallelizable stuff in a given block of code and construct bundles of
>instructions accordingly. If the execution environment widens, a
>recompile with the matching new compiler will be needed to make use of
>the VVLIW capabilities (previously IA64 was a Very Long Instruction
>Word, now it's a VVLIW). Some reports are quoting Intel as saying that
>users should not have to recompile to take advantage of the 12-
>instruction issue, which is a rather strange thing to say about
>different generations of EPIC machines and compilers.
Why do you folks keep repeating the same untruth over and over... sigh...
When GEM (or any compiler) looks for parallel sequences in the program, it
doesn't just stop at 6 instructions. It will find the longest sequence that
its algorithms can determine (I'm sure those could be improved with
additional work; nothing is perfect). It doesn't toss in a stop bit every
six instructions just for fun. The compiler has already seen your program.
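
To make the point about stop bits concrete, here is a minimal sketch of how a compiler might split a straight-line sequence into instruction groups. It is not how GEM works internally; the (dest, sources) tuples, the register names and split_into_groups are made up for illustration, and the only rule modelled is "no read-after-write or write-after-write on a register inside one group". Note that the number 6 appears nowhere in it.

```python
def split_into_groups(instrs):
    """Split a list of (dest, sources) tuples into instruction groups.

    A stop is placed only when an instruction reads or rewrites a register
    already written in the current group. Issue width plays no role.
    """
    groups = []
    current = []
    written = set()
    for dest, sources in instrs:
        # RAW or WAW hazard against the current group: end the group here.
        if dest in written or any(s in written for s in sources):
            groups.append(current)
            current = []
            written = set()
        current.append((dest, sources))
        written.add(dest)
    if current:
        groups.append(current)
    return groups


# Eight independent adds, then one instruction that consumes two results.
program = [(f"r{i}", ("r20", "r21")) for i in range(1, 9)]
program.append(("r9", ("r1", "r2")))

group_sizes = [len(g) for g in split_into_groups(program)]
print(group_sizes)  # [8, 1]
```

The first group is eight instructions long because that is where the first dependency shows up, not because of anything about the chip the code was compiled for.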
Now as many have pointed out, many programs don't get long runs of parallel
instructions. Memory operations, etc. almost always introduce some
dependency that requires the compiler to insert a stop bit. That's why the
compilers do all sorts of loop unrolling, etc. to increase the chance of
parallel sequences. Itanium's advanced loads and speculative loads help as
well. (I'll admit that GEM does a poor job of using those instructions; it
would need additional work to improve in this area. That isn't a Poulson
thing but would help on all chips.)
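
As a rough illustration of why unrolling helps, the sketch below counts how many instructions sit at each dependency level of a loop body for c[i] = a[i] + b[i]. Everything in it is hypothetical: the register names, the loop_body and level_widths helpers, and above all the assumption that the compiler has proved the stores never alias the later loads, which is exactly the kind of memory dependency that often gets in the way.

```python
from collections import Counter


def level_widths(body):
    """Return how many instructions sit at each dependency depth."""
    depth_of = {}        # register -> depth of the instruction that wrote it
    widths = Counter()
    for dest, sources in body:
        depth = 1 + max((depth_of.get(s, 0) for s in sources), default=0)
        depth_of[dest] = depth
        widths[depth] += 1
    return [widths[d] for d in sorted(widths)]


def loop_body(unroll):
    """Body of c[i] = a[i] + b[i], unrolled 'unroll' times."""
    body = []
    for k in range(unroll):
        body.append((f"x{k}", ()))                   # load a[i+k]
        body.append((f"y{k}", ()))                   # load b[i+k]
        body.append((f"z{k}", (f"x{k}", f"y{k}")))   # add
        body.append((f"m{k}", (f"z{k}",)))           # store c[i+k], modelled as a write to m{k}
    return body


rolled = level_widths(loop_body(1))
unrolled = level_widths(loop_body(4))
print(rolled)    # [2, 1, 1]
print(unrolled)  # [8, 4, 4]
```

With one iteration per trip there are never more than two independent instructions to issue together; unrolled four times there are eight, then four, then four.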
And it isn't 6 "UNITS" or 12 "UNITS". It is how many instructions are
tossed into the chip every cycle. Tukwila grabs 6 slots (2 bundles) and
starts chewing every cycle. Stop bits will make it chew slower. No stop
bits allows it to chew faster. With Poulson, it will now grab 12 slots
(4 bundles) every cycle. And of course, branches, calls, returns, etc.
make the chip slow down and even spit out a few things.
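
A back-of-the-envelope model of the above, for what it's worth. It assumes the chip takes up to N bundles of 3 slots per cycle and never issues across a stop bit in the same cycle. It ignores bundle templates, the mix of functional units, latencies, cache misses and branches, so the numbers are illustrative only. The group sizes and the cycles function are invented for the example.

```python
import math

SLOTS_PER_BUNDLE = 3


def cycles(group_sizes, bundles_per_cycle):
    """Issue cycles for a run of instruction groups in this toy model."""
    width = SLOTS_PER_BUNDLE * bundles_per_cycle
    # Each group needs at least one cycle; a stop bit ends the cycle early.
    return sum(math.ceil(size / width) for size in group_sizes)


# The same compiled code (same stop bits) on two issue widths.
group_sizes = [12, 3, 12, 2, 1]
six_wide = cycles(group_sizes, 2)      # 6 slots per cycle
twelve_wide = cycles(group_sizes, 4)   # 12 slots per cycle
print(six_wide, twelve_wide)  # 7 5
```

The unchanged binary gets faster wherever the compiler already found groups longer than 6, and stays the same wherever stop bits come early.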
John