[Info-vax] Intel previews new Itanium "Poulson" processor

John Reagan johnrreagan at earthlink.net
Thu Feb 24 18:38:39 EST 2011


"John Wallace" <johnwallace4 at yahoo.co.uk> wrote in message 
news:82372da4-1292-47c1-957a-955669c2cb9b at u6g2000vbh.googlegroups.com...
On Feb 24, 8:34 pm, JF Mezei <jfmezei.spam... at vaxination.ca> wrote:
>>
>> Also, just so I understand correctly. If they have 12 instead of 6
>> execution units, is it correct to state that a program compiled for
>> Tukwila, will still use only 6 units on Poulson, leaving the remaining 6
>> idle?

>"a program compiled for Tukwila, will still use only 6 units on
>Poulson, leaving the remaining 6 idle?"

>That was the general EPIC principle. The compiler must see the
>parallelizable stuff in a given block of code and construct bundles of
>instructions accordingly. If the execution environment widens, a
>recompile with the matching new compiler will be needed to make use of
>the VVLIW capabilities (previously IA64 was a Very Long Instruction
>Word, now it's a VVLIW). Some reports are quoting Intel as saying that
>users should not have to recompile to take advantage of the 12-
>instruction issue, which is a rather strange thing to say about
>different generations of EPIC machines and compilers.


Why do you folks keep repeating the same untruth over and over...  sigh...

When GEM (or any compiler) looks for parallel sequences in the program,
it doesn't just stop at 6 instructions.  It will find the longest
sequence that its algorithms can determine (I'm sure those could be
improved with additional work; nothing is perfect).  It doesn't toss in
a stop bit every six instructions just for fun.  The compiler has
already seen your program.
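
To make that concrete, here is a toy sketch (mine, not GEM's actual
algorithm) of a greedy pass that only ends an instruction group when a
register dependency forces it.  The instruction format and the name
insert_stops are made up for illustration, and real Itanium rules
(bundle templates, predicates, memory dependencies, the exceptions to
the RAW/WAW rules) are ignored.

```python
def insert_stops(instrs):
    """Split a straight-line instruction list into instruction groups.

    instrs: list of (dest, sources) tuples, where dest is a register
    name (or None) and sources is a tuple of register names.
    A stop is placed only when an instruction reads or writes a
    register already written in the current group. The issue width
    of any particular chip is never consulted.
    """
    groups, current, written = [], [], set()
    for dest, srcs in instrs:
        conflict = (dest is not None and dest in written) or any(
            s in written for s in srcs
        )
        if conflict:
            # Dependency on something in this group: end the group here.
            groups.append(current)
            current, written = [], set()
        current.append((dest, srcs))
        if dest is not None:
            written.add(dest)
    if current:
        groups.append(current)
    return groups


# Eight independent operations followed by one that depends on two of them.
instrs = [(f"r{i}", ("r0",)) for i in range(1, 9)] + [("r9", ("r1", "r2"))]
group_sizes = [len(g) for g in insert_stops(instrs)]
print(group_sizes)
```

The first group comes out eight instructions long, not six, because
nothing in this sketch knows or cares how wide the target chip is.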

Now as many have pointed out, many programs don't get long runs of
parallel instructions.  Memory operations, etc. almost always introduce
some dependency that requires the compiler to insert a stop bit.  That's
why the compilers do all sorts of loop unrolling, etc. to increase the
chance of parallel sequences.  Itanium's advanced loads and speculative
loads help as well.  (I'll admit that GEM does a poor job of using those
instructions; it would need additional work to improve in this area.
That isn't a Poulson thing but would help on all chips.)

And it isn't 6 "UNITS" or 12 "UNITS".  It is how many instructions are
tossed into the chip every cycle.  Tukwila grabs 6 slots (2 bundles) and
starts chewing every cycle.  Stop bits will make it chew slower.  No stop
bits lets it chew faster.  With Poulson, it will now grab 12 slots
(4 bundles) every cycle.  And of course, branches, calls, returns, etc.
make the chip slow down and even spit out a few things.
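
A back-of-the-envelope model of the above, as a sketch only: assume the
chip takes up to `width` slots per cycle and that a stop bit ends issue
for that cycle.  The function name issue_cycles and the group sizes are
invented for illustration; real dispersal also depends on bundle
templates, which functional units are free, latencies, and branches,
none of which is modeled here.

```python
import math


def issue_cycles(group_sizes, width):
    """Cycles to issue a stream, given the number of instructions
    between consecutive stop bits and the slots taken per cycle.

    Toy assumption: a group of n instructions needs ceil(n / width)
    cycles, and a new group never shares a cycle with the previous one.
    """
    return sum(math.ceil(n / width) for n in group_sizes)


# The same "binary" (same stop bits) on a 6-wide and a 12-wide chip.
long_groups = [8, 1, 12, 3]
short_groups = [2, 1, 3, 2]

long_on_6 = issue_cycles(long_groups, 6)
long_on_12 = issue_cycles(long_groups, 12)
short_on_6 = issue_cycles(short_groups, 6)
short_on_12 = issue_cycles(short_groups, 12)
print(long_on_6, long_on_12, short_on_6, short_on_12)
```

In this model the code with long groups gets through in fewer cycles on
the wider chip without a recompile, while the code that is chopped up by
stop bits takes the same number of cycles on both.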

John





