[Info-vax] x86-64 data alignment / faulting

Arne Vajhøj arne at vajhoej.dk
Sun Feb 27 20:56:15 EST 2022


On 2/26/2022 11:50 PM, Bob Gezelter wrote:
> On Saturday, February 26, 2022 at 4:36:58 PM UTC-5, Arne Vajhøj wrote:
>> On 2/25/2022 11:37 PM, Bob Gezelter wrote:
>>> It is easy for a benchmark to measure the incorrect phenomenon.
>> There are lies, damn lies and benchmarks.
>>
>> :-)
>>
>> I tested on a 2 MB array.
>>
>> And I admit that the results can be due to many things.
>>
>> But the numbers sure show a big difference!
>>
>> Fortran/VMS/Itanium:
>>
>> OFFSET 0 : 590 ms
>> OFFSET 1 : 197510 ms
>> OFFSET 2 : 197510 ms
>> OFFSET 3 : 197520 ms
>> OFFSET 4 : 197510 ms
>> OFFSET 5 : 197510 ms
>> OFFSET 6 : 197510 ms
>> OFFSET 7 : 197510 ms
>> OFFSET 8 : 590 ms
>> OFFSET 9 : 197510 ms
>> OFFSET 10 : 197520 ms
>> OFFSET 11 : 197520 ms
>> OFFSET 12 : 197520 ms
>> OFFSET 13 : 197520 ms
>> OFFSET 14 : 197520 ms
>> OFFSET 15 : 197520 ms
>> OFFSET 16 : 580 ms
>>
>> GFortran/Windows/x86-64 (100x more reps):
>>
>> OFFSET 0 : 7473 ms
>> OFFSET 1 : 7285 ms
>> OFFSET 2 : 7301 ms
>> OFFSET 3 : 7301 ms
>> OFFSET 4 : 7269 ms
>> OFFSET 5 : 7208 ms
>> OFFSET 6 : 7191 ms
>> OFFSET 7 : 7192 ms
>> OFFSET 8 : 7519 ms
>> OFFSET 9 : 7285 ms
>> OFFSET 10 : 7270 ms
>> OFFSET 11 : 7285 ms
>> OFFSET 12 : 7270 ms
>> OFFSET 13 : 7207 ms
>> OFFSET 14 : 7176 ms
>> OFFSET 15 : 7176 ms
>> OFFSET 16 : 7473 ms

> One needs to analyze beyond raw performance. In this case, I start
> with the cache organization and related structure. If you "break" the
> cache, by forcing every reference to be a cache miss, one will
> essentially see the 2x performance loss.
> 
> If the cache is able to gain anything, it will skew the numbers.

I modified the program to test with different data sizes, verified
that the code was indeed working on unaligned addresses, and
tried both sequential and random access to the array.
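
For anyone who wants to try something similar, here is a rough C
sketch of this kind of test. It is not the Fortran program that
produced the numbers below. The names load_u64, sum_at_offset and
run_offsets are made up for illustration, it only covers the
sequential case, and it assumes 8-byte elements read through memcpy
so that the unaligned loads stay well-defined C.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Read one 8-byte value from an arbitrary (possibly unaligned) address.
   memcpy avoids undefined behaviour; on x86-64 compilers typically
   reduce it to a single load instruction. */
static inline uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Sum n consecutive 8-byte elements starting at base + offset,
   repeating the whole pass rep times. */
static uint64_t sum_at_offset(const unsigned char *base, size_t offset,
                              size_t n, int rep)
{
    const unsigned char *p = base + offset;
    uint64_t sum = 0;
    for (int r = 0; r < rep; r++)
        for (size_t i = 0; i < n; i++)
            sum += load_u64(p + 8 * i);
    return sum;
}

/* Time offsets 0..16 over n elements and print one line per offset.
   Returns 0 on success, -1 if the buffer cannot be allocated. */
static int run_offsets(size_t n, int rep)
{
    /* n elements, plus 16 bytes for the largest offset,
       plus 8 bytes of slack for aligning the base pointer. */
    unsigned char *raw = malloc(n * 8 + 16 + 8);
    if (raw == NULL)
        return -1;

    /* Round up to an 8-byte boundary so offset 0 really is aligned. */
    unsigned char *base = raw + ((8 - (uintptr_t)raw % 8) % 8);
    assert((uintptr_t)base % 8 == 0);
    memset(base, 1, n * 8 + 16);

    for (size_t off = 0; off <= 16; off++) {
        clock_t t0 = clock();
        uint64_t s = sum_at_offset(base, off, n, rep);
        clock_t t1 = clock();
        /* Printing the checksum keeps the compiler from discarding the loop. */
        printf("OFFSET %2zu (ADDRESS MOD 8 = %u): %6.0f ms (checksum %llu)\n",
               off, (unsigned)((uintptr_t)(base + off) % 8),
               1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC,
               (unsigned long long)s);
    }
    free(raw);
    return 0;
}
```

Calling run_offsets(250000, 1000) from main would correspond to the
2000000 byte sequential case. The timings will depend on compiler,
optimization level and CPU, so treat any numbers from it as
indicative only.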

I simply can't get a big difference between aligned and unaligned access.

Data at the bottom.

I am not saying that there are no cases where unaligned data access
makes a significant difference.

But I have not been able to come up with a case.

Arne

  DATA SIZE =    2500 (   20000 BYTES), REP = 100000, SEQUENTIAL ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    717 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    718 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    718 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    717 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    718 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    733 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    718 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    717 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    718 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    717 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    718 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    718 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    717 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    733 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    718 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    733 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    718 ms
  DATA SIZE =   25000 (  200000 BYTES), REP =  10000, SEQUENTIAL ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    717 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    702 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    702 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    718 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    702 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    702 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    718 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    702 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    717 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    702 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    718 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    686 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    718 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    702 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    702 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    702 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    718 ms
  DATA SIZE =  250000 ( 2000000 BYTES), REP =   1000, SEQUENTIAL ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    718 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    702 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    717 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    718 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    702 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    717 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    718 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    718 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    717 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    702 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    718 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    718 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    702 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    717 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    718 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    717 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    718 ms
  DATA SIZE = 2500000 (20000000 BYTES), REP =    100, SEQUENTIAL ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    718 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    702 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    717 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    718 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    717 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    718 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    718 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    717 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    718 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    717 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    718 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    702 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    718 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    733 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    717 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    718 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    702 ms
  DATA SIZE =    4096 (   32768 BYTES), REP =  16666, RANDOM ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    733 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    733 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    734 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    733 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    733 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    733 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    733 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    734 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    717 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    733 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    734 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    748 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    734 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    733 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    733 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    733 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    718 ms
  DATA SIZE =   32768 (  262144 BYTES), REP =   1666, RANDOM ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    561 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    593 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    609 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    577 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    593 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    577 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    593 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    593 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    577 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    577 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    608 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    593 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    593 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    577 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    593 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    577 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    577 ms
  DATA SIZE =  262144 ( 2097152 BYTES), REP =    166, RANDOM ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    609 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    624 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    639 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    624 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    640 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    624 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    640 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    624 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    608 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    624 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    624 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    639 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    624 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    624 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    640 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    624 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    608 ms
  DATA SIZE = 2097152 (16777216 BYTES), REP =     16, RANDOM ACCESS
  OFFSET  0 (ADDRESS MOD 8 = 0):    733 ms
  OFFSET  1 (ADDRESS MOD 8 = 1):    749 ms
  OFFSET  2 (ADDRESS MOD 8 = 2):    749 ms
  OFFSET  3 (ADDRESS MOD 8 = 3):    749 ms
  OFFSET  4 (ADDRESS MOD 8 = 4):    749 ms
  OFFSET  5 (ADDRESS MOD 8 = 5):    748 ms
  OFFSET  6 (ADDRESS MOD 8 = 6):    749 ms
  OFFSET  7 (ADDRESS MOD 8 = 7):    749 ms
  OFFSET  8 (ADDRESS MOD 8 = 0):    733 ms
  OFFSET  9 (ADDRESS MOD 8 = 1):    749 ms
  OFFSET 10 (ADDRESS MOD 8 = 2):    748 ms
  OFFSET 11 (ADDRESS MOD 8 = 3):    749 ms
  OFFSET 12 (ADDRESS MOD 8 = 4):    749 ms
  OFFSET 13 (ADDRESS MOD 8 = 5):    749 ms
  OFFSET 14 (ADDRESS MOD 8 = 6):    749 ms
  OFFSET 15 (ADDRESS MOD 8 = 7):    764 ms
  OFFSET 16 (ADDRESS MOD 8 = 0):    718 ms



