[Info-vax] x86-64 data alignment / faulting
Arne Vajhøj
arne at vajhoej.dk
Thu Mar 3 10:08:59 EST 2022
On 3/3/2022 9:02 AM, Joukj wrote:
> Arne Vajhøj wrote:
>> Code below in case anybody wonders WTF I am doing.
>>       PROGRAM TEST_ALIGN
>>       INTEGER*4 N,REP,OFFSET
>>       PARAMETER (N=2500000,REP=100,OFFSET=17)
>>       BYTE DUMMY(OFFSET)
>>       REAL*8 X(N+(OFFSET-1)/8+1)
>>       EQUIVALENCE (DUMMY,X)
>>       INTEGER*4 I,J,SCALE(4)
>>       DATA SCALE/1000,100,10,1/
>>       DO 200 J=1,REP
>>         DO 100 I=1,N+1
>>           X(I)=I
>> 100     CONTINUE
>> 200   CONTINUE
>>       DO 400 J=1,4
>>         WRITE(6,700) N/SCALE(J),8*N/SCALE(J),REP*SCALE(J),
>>      +    'SEQUENTIAL ACCESS'
>>         DO 300 I=1,OFFSET
>>           CALL TEST(I,DUMMY(I),N/SCALE(J),REP*SCALE(J),.FALSE.)
>> 300     CONTINUE
>> 400   CONTINUE
>>       DO 600 J=1,4
>>         WRITE(6,700) 2**(9+3*J),8*2**(9+3*J),REP*SCALE(J)/6,
>>      +    'RANDOM ACCESS'
>>         DO 500 I=1,OFFSET
>>           CALL TEST(I,DUMMY(I),2**(9+3*J),REP*SCALE(J)/6,.TRUE.)
>> 500     CONTINUE
>> 600   CONTINUE
>> 700   FORMAT(1X,'DATA SIZE = ',I7,' (',I8,' BYTES), REP = ',I6,', ',A)
>>       END
>>
>> and:
>>
>>       SUBROUTINE TEST(IX,X,N,REP,RANACC)
>>       INTEGER*4 IX,N,REP
>>       REAL*8 X(*)
>>       LOGICAL*4 RANACC
>>       INTEGER*4 I,J,T1,T2,DUMMY,RANIX
>>       CALL SYSTEM_CLOCK(T1,DUMMY,DUMMY)
>>       DO 200 J=1,REP
>>         IF(RANACC) RANIX=0
>>         DO 100 I=1,N
>>           IF(RANACC) THEN
>>             RANIX=MOD(401*RANIX+1,N)
>>             X(RANIX+1)=I
>>           ELSE
>>             X(I)=I
>>           ENDIF
>> 100     CONTINUE
>> 200   CONTINUE
>>       CALL SYSTEM_CLOCK(T2,DUMMY,DUMMY)
>>       WRITE(6,300) IX-1,MOD(LOC(X),8),T2-T1
>>       RETURN
>> 300   FORMAT(1X,'OFFSET ',I2,' (ADDRESS MOD 8 = ',I1,'): ',I6,' ms')
>>       END
>>
>
> As others commented: the compiler may have optimized the alignment away
> by calling subroutine TEST.
The main program and the subroutine are compiled separately.
In the main program, TEST is called with a byte array. The compiler
does not know that the subroutine treats it as a floating point
array.
In the subroutine, the compiler believes it receives a floating point
array. It does not know that the caller actually passes a byte
array.
Let us say that the subroutine is called with the address
1028 (4 byte aligned, but not 8 byte aligned) and array size 3.
I consider it valid compiler behavior to:
- update 3 FP's at 1028, 1036 and 1044
- throw an error for unaligned access
I consider it a compiler bug to:
- update 3 FP's at 1024, 1032 and 1040
- update 3 FP's at 1032, 1040 and 1048
- update 2 FP's at 1032 and 1040
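For anyone who wants to check the numbers, here is a small sketch of the
arithmetic. It does not touch memory at all; it only prints the element
addresses for the address as passed and for a base silently rounded down
or up to a multiple of 8. The program name ADDRS and the variables BASE,
ESIZE, DOWN and UP are made up for this illustration and are not part of
the test program.

```fortran
      PROGRAM ADDRS
C     Hypothetical illustration only: BASE, N and ESIZE mirror the
C     example above (address 1028, 3 elements, 8 byte REAL*8).
      INTEGER*4 BASE,N,ESIZE
      PARAMETER (BASE=1028,N=3,ESIZE=8)
      INTEGER*4 I,DOWN,UP
C     Integer division truncates, so this rounds BASE down to a
C     multiple of ESIZE.
      DOWN=(BASE/ESIZE)*ESIZE
C     Round BASE up to a multiple of ESIZE.
      UP=((BASE+ESIZE-1)/ESIZE)*ESIZE
      WRITE(6,100) 'AS PASSED   ',(BASE+(I-1)*ESIZE,I=1,N)
      WRITE(6,100) 'ROUNDED DOWN',(DOWN+(I-1)*ESIZE,I=1,N)
      WRITE(6,100) 'ROUNDED UP  ',(UP+(I-1)*ESIZE,I=1,N)
  100 FORMAT(1X,A,':',3(1X,I5))
      END
```

The first line of output lists 1028, 1036 and 1044, which is the only set
of addresses I consider valid to update. The other two lines list 1024,
1032, 1040 and 1032, 1040, 1048, which are the rounded cases I would
call a bug.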
Arne