[Info-vax] Dave Cutler, Prism, DEC, Microsoft, etc.

Bob Koehler koehler at eisner.nospam.encompasserve.org
Tue Dec 1 09:24:00 EST 2009


In article <00cd4d03$0$6691$c3e8da3 at news.astraweb.com>, JF Mezei <jfmezei.spamnot at vaxination.ca> writes:
> 
> Excuse me, but are there really circumstances where the programmer
> ("user" from Mr Reagan point of view) is lying to the compiler ?
> 
> if I have a 100 byte record, and bytes 7 though 10 inclusively contain
> an integer, and I do:
> 
> myint = (int *) &(buffer + 6) ; /* hopefully I have the syntax right*/

   Almost.  I think you want (assuming buffer is an array of char):

   myint = *((int *) (buffer + 6));

> In what way am I lying ?  Does the compiler realise that this will be an
> unaligned access or will it be accusing me of lying ?

   If you allow the compiler to lay out your variables, it will pad
   things to prevent that misalignment.  If you tell the compiler that
   it's not allowed to pad, then it can know to generate the necessary
   overhead instructions to access the data without misalignment.

   But if you force the compiler to misalign without knowing it by
   manipulating pointers, then the compiler doesn't know to deal with
   misaligned data.

   Examples:

main ()
{
      int myint;

      #pragma member_alignment // (this is the default)
      struct {
         char a[6];
         int b;
      } buffer;

      myint = buffer.b;
}

   In the above code (main), the C compiler will put padding between a
   and b so that b is aligned.  b actually starts at offset 8 (the ninth
   byte), not offset 6 (the seventh).
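
   The padding can be checked with offsetof.  This is only a sketch:
   struct padded_buffer and offset_of_b are names I made up for
   illustration, and the exact offset depends on the ABI (8 is what
   you would see wherever int needs 4-byte alignment, as on Alpha).

```c
#include <assert.h>
#include <stddef.h>

/* Same members as the struct in main, with default member alignment.
   The tag name is hypothetical. */
struct padded_buffer {
    char a[6];
    int  b;
};

/* Returns the offset of b from the start of the struct. */
size_t offset_of_b(void)
{
    size_t off = offsetof(struct padded_buffer, b);

    /* Whatever the ABI, the compiler places b on an int-aligned offset. */
    assert(off % _Alignof(int) == 0);
    return off;
}
```

   Where int is 4 bytes with 4-byte alignment, offset_of_b() should
   return 8: six bytes of a, two bytes of padding, then b.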

able()
{
      int myint;

      #pragma nomember_alignment
      struct {
         char a[6];
         int b;
      } buffer;

      myint = buffer.b;
}

   In the above code (able), the C compiler knows that there's no padding
   between a and b.  The compiler can fetch the quadwords containing b,
   then mask off and shift the bits to get the int into the low end of
   a register.

baker(int *buffer)
{
   int myint = *buffer;
}
charlie()
{
      char buffer[10];
      baker((int *) &buffer[6]);
}

   Now in baker the compiler has been told it has an int pointer, but that
   pointer is forced to be misaligned in charlie.  The compiler can't see 
   that, so it doesn't generate the overhead code to avoid the misalignment.
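
   If you can't control the layout, one portable way to stop lying to
   the compiler is to copy the bytes out with memcpy instead of casting
   the pointer.  A minimal sketch follows; baker_safe and charlie_safe
   are hypothetical variants of the routines above, and I'm assuming
   int is 4 bytes so that it fits in bytes 6 through 9 of the buffer.

```c
#include <string.h>

/* Hypothetical variant of baker: it takes a char pointer, so it makes
   no alignment claim about the bytes it is handed. */
int baker_safe(const char *p)
{
    int myint;

    /* Byte-wise copy; the compiler cannot assume p is int-aligned. */
    memcpy(&myint, p, sizeof myint);
    return myint;
}

/* Hypothetical variant of charlie that stores a known value at
   offset 6 and reads it back through baker_safe. */
int charlie_safe(void)
{
    char buffer[10] = {0};
    int value = 0x12345678;

    memcpy(&buffer[6], &value, sizeof value);  /* int at offset 6 */
    return baker_safe(&buffer[6]);
}
```

   Here the compiler is free to pick whatever load sequence is safe
   for the target, much as it does in able.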

As examples, here is part of the generated code from EISNER.  Note that
in MAIN at offset D0, and in BAKER at offset 124, there are LDL
instructions for loading myint.  LDL is the Alpha instruction for
loading an aligned longword (32 bits).  Compare these with ABLE, starting
at offset F4, where two LDQ_U instructions fetch the pieces of the
buffer surrounding the misaligned address.  LDQ_U ignores the low bits
of the address, so these instructions fetch the aligned quadwords
(64 bits) containing the first and last bytes of b without generating
misalignment faults.  Four more instructions (EXTLL, EXTLH, BIS, SEXTL)
then get the correct data into myint.  This all happens in able because
the compiler knows it's accessing a misaligned address, which it didn't
know when it compiled baker.
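
For anyone who doesn't read Alpha assembly, here is a rough C model of
what the six instructions in ABLE do.  It is a sketch based on my
reading of the instruction descriptions, not compiler output: memory is
modelled as a little-endian byte array, and ldq_u, extll, extlh and
load_unaligned_long are names invented for the illustration.

```c
#include <stdint.h>

/* Models LDQ_U: load the aligned little-endian quadword that contains
   byte address addr, ignoring the low three address bits. */
static uint64_t ldq_u(const unsigned char *mem, uint64_t addr)
{
    uint64_t base = addr & ~(uint64_t)7;
    uint64_t q = 0;

    for (int i = 7; i >= 0; i--)
        q = (q << 8) | mem[base + i];
    return q;
}

/* Models EXTLL: shift right by the byte offset, keep the low 32 bits. */
static uint64_t extll(uint64_t q, uint64_t addr)
{
    return (q >> (8 * (addr & 7))) & 0xFFFFFFFFu;
}

/* Models EXTLH: shift left by (64 - 8 * offset) mod 64, keep the
   low 32 bits. */
static uint64_t extlh(uint64_t q, uint64_t addr)
{
    unsigned shift = (64 - 8 * (unsigned)(addr & 7)) & 63;

    return (q << shift) & 0xFFFFFFFFu;
}

/* Models the whole sequence: two LDQ_U, EXTLL, EXTLH, BIS, SEXTL. */
int32_t load_unaligned_long(const unsigned char *mem, uint64_t addr)
{
    uint64_t lo = extll(ldq_u(mem, addr), addr);      /* low-order bytes  */
    uint64_t hi = extlh(ldq_u(mem, addr + 3), addr);  /* high-order bytes */

    /* BIS merges the two halves; the cast stands in for SEXTL. */
    return (int32_t)(uint32_t)(lo | hi);
}
```

With the bytes 78 56 34 12 stored at addresses 6 through 9 of a 16-byte
array, load_unaligned_long(mem, 6) pulls 5678 out of the first quadword
and 1234 out of the second, then ORs them together.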


	     00C4	MAIN::                                        
23DEFFE0     00C4		LDA	SP, -32(SP)                   
                                                                     
47FD0412     00C8		MOV	FP, R18                       
47FB041D     00CC		MOV	R27, FP                       
A03E0010     00D0		LDL	myint, 16(SP)		      
47E03400     00D4		MOV	1, R0			      
47F2041D     00D8		MOV	R18, FP                       
23DE0020     00DC		LDA	SP, 32(SP)                    
6BFA8001     00E0		RET	R26                           
                                                                      
Routine Size: 32 bytes,    Routine Base: $CODE$ + 00C4                
                                                                      
	     00E4	ABLE::                                        
23DEFFE0     00E4		LDA	SP, -32(SP)                   
47FD0414     00E8		MOV	FP, R20                       
47FB041D     00EC		MOV	R27, FP                       
223E000E     00F0		LDA	R17, 14(SP)		      
2E510000     00F4		LDQ_U	R18, (R17)                    
2E710003     00F8		LDQ_U	R19, 3(R17)                   
4A5104D2     00FC		EXTLL	R18, R17, R18                 
4A710D53     0100		EXTLH	R19, R17, R19                 
46530400     0104		BIS	R18, R19, myint		      
43E00000     0108		SEXTL	myint, myint		      
47F4041D     010C		MOV	R20, FP			      
23DE0020     0110		LDA	SP, 32(SP)                    
6BFA8001     0114		RET	R26                           
                                                                      
Routine Size: 52 bytes,    Routine Base: $CODE$ + 00E4                
                                                                      
	     0118	BAKER::                                       
47FD0414     0118		MOV	FP, R20                       
47FB041D     011C		MOV	R27, FP                       
47F00401     0120		MOV	R16, buffer		      
A0010000     0124		LDL	myint, (R1)		      
47F4041D     0128		MOV	R20, FP			      
2FFE0000     012C		UNOP                                  
6BFA8001     0130		RET	R26                           
                                                                      
Routine Size: 28 bytes,    Routine Base: $CODE$ + 0118                
                                                                      
	     0134	CHARLIE::                                     
23DEFFE0     0134		LDA	SP, -32(SP)                   
47FA0415     0138		MOV	R26, R21                      
47FD0416     013C		MOV	FP, R22                       
47FB041D     0140		MOV	R27, FP                       
221E000E     0144		LDA	R16, 14(SP)		      
237DFFE8     0148		LDA	R27, -24(FP)                  
D35FFFF2     014C		BSR	R26, BAKER                    
47F6041D     0150		MOV	R22, FP			      
23DE0020     0154		LDA	SP, 32(SP)                    
6BF58001     0158		RET	R21

Routine Size: 40 bytes,    Routine Base: $CODE$ + 0134




