[Info-vax] x86-64 data alignment / faulting

Arne Vajhøj arne at vajhoej.dk
Thu Mar 3 10:08:59 EST 2022


On 3/3/2022 9:02 AM, Joukj wrote:
> Arne Vajhøj wrote:
>> Code below in case anybody wonders WTF I am doing.

>>       PROGRAM TEST_ALIGN
>>       INTEGER*4 N,REP,OFFSET
>>       PARAMETER (N=2500000,REP=100,OFFSET=17)
>>       BYTE DUMMY(OFFSET)
>>       REAL*8 X(N+(OFFSET-1)/8+1)
>>       EQUIVALENCE (DUMMY,X)
>>       INTEGER*4 I,J,SCALE(4)
>>       DATA SCALE/1000,100,10,1/
>>       DO 200 J=1,REP
>>         DO 100 I=1,N+1
>>           X(I)=I
>> 100     CONTINUE
>> 200   CONTINUE
>>       DO 400 J=1,4
>>         WRITE(6,700) N/SCALE(J),8*N/SCALE(J),REP*SCALE(J),
>>      +               'SEQUENTIAL ACCESS'
>>         DO 300 I=1,OFFSET
>>             CALL TEST(I,DUMMY(I),N/SCALE(J),REP*SCALE(J),.FALSE.)
>> 300     CONTINUE
>> 400   CONTINUE
>>       DO 600 J=1,4
>>         WRITE(6,700) 2**(9+3*J),8*2**(9+3*J),REP*SCALE(J)/6,
>>      +               'RANDOM ACCESS'
>>         DO 500 I=1,OFFSET
>>             CALL TEST(I,DUMMY(I),2**(9+3*J),REP*SCALE(J)/6,.TRUE.)
>> 500     CONTINUE
>> 600   CONTINUE
>> 700   FORMAT(1X,'DATA SIZE = ',I7,' (',I8,' BYTES), REP = ',I6,', ',A)
>>       END
>>
>> and:
>>
>>       SUBROUTINE TEST(IX,X,N,REP,RANACC)
>>       INTEGER*4 IX,N,REP
>>       REAL*8 X(*)
>>       LOGICAL*4 RANACC
>>       INTEGER*4 I,T1,T2,DUMMY,RANIX
>>       CALL SYSTEM_CLOCK(T1,DUMMY,DUMMY)
>>       DO 200 J=1,REP
>>         IF(RANACC) RANIX=0
>>         DO 100 I=1,N
>>           IF(RANACC) THEN
>>             RANIX=MOD(401*RANIX+1,N)
>>             X(RANIX+1)=I
>>           ELSE
>>             X(I)=I
>>           ENDIF
>> 100     CONTINUE
>> 200   CONTINUE
>>       CALL SYSTEM_CLOCK(T2,DUMMY,DUMMY)
>>       WRITE(6,300) IX-1,MOD(LOC(X),8),T2-T1
>>       RETURN
>> 300   FORMAT(1X,'OFFSET ',I2,' (ADDRESS MOD 8 = ',I1,'): ',I6,' ms')
>>       END
>>
> 
> As others commented: The compiler may have optimized the alignment away
> by calling subroutine test.

The main program and the subroutine are compiled separately.

In the main program TEST is called with a byte array. The compiler
does not know that the subroutine treats it as a floating point array.

In the subroutine the compiler believes it receives a floating point
array. It does not know that the actual argument is a byte array.

Let us say that the subroutine is called with the address
1028 (which is not a multiple of 8) and array size 3.

I consider it valid compiler behavior to:
- update 3 FP's at 1028, 1036 and 1044
- throw an error for unaligned access

I consider it a compiler bug to:
- update 3 FP's at 1024, 1032 and 1040
- update 3 FP's at 1032, 1040 and 1048
- update 2 FP's at 1032 and 1040
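
The address arithmetic in this example can be checked with a few lines
of Fortran. This is only a sketch: the program name ADDRS and the
variables BASE, DOWN and UP are made up for illustration, and it only
computes integers, it does not touch memory at those addresses. It
lists the element addresses for the base as passed (the first valid
behavior) and for a base rounded down or up to an 8-byte boundary (the
first two cases I would call a bug).

```fortran
      PROGRAM ADDRS
      INTEGER*4 BASE,N,I,DOWN,UP
C     Hypothetical values taken from the example: address 1028, size 3
      PARAMETER (BASE=1028,N=3)
C     Nearest 8-byte boundaries below and above the unaligned BASE
      DOWN=BASE-MOD(BASE,8)
      UP=DOWN+8
C     Each REAL*8 element is 8 bytes after the previous one
      WRITE(6,100) 'AS PASSED   ',(BASE+8*(I-1),I=1,N)
      WRITE(6,100) 'ROUNDED DOWN',(DOWN+8*(I-1),I=1,N)
      WRITE(6,100) 'ROUNDED UP  ',(UP+8*(I-1),I=1,N)
  100 FORMAT(1X,A,': ',3I6)
      END
```

The first row is 1028, 1036 and 1044. The other two rows start at 1024
and 1032, which are the silently realigned addresses listed above.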

Arne




