[Info-vax] Dave Cutler, Prism, DEC, Microsoft, etc.

toby toby at telegraphics.com.au
Wed Dec 16 19:36:06 EST 2009


On Nov 30, 4:12 pm, JF Mezei <jfmezei.spam... at vaxination.ca> wrote:
> The Compiler guy was heard typing:
>
> >> Remember that except for compiler bugs, alignment faults comes from the user
> >> lying to the compiler.
>
> Excuse me, but are there really circumstances where the programmer
> ("user" from Mr Reagan point of view) is lying to the compiler ?
>
> if I have a 100 byte record, and bytes 7 though 10 inclusively contain
> an integer, and I do:
>
> myint = (int *) &(buffer + 6) ; /* hopefully I have the syntax right*/

myint = * (int*) (buffer + 6);
        ^ you mean to dereference the cast pointer (and drop the &:
          buffer + 6 is already an address, so & cannot be applied to it).

The "intention" is clear but the code remains non-portable, not just
because of potential misalignment, but also risk of endianness
mismatch if the buffer were filled by a host with a different ordering
(network packet or file, for example).

> In what way am I lying ?  Does the compiler realise that this will be an
> unaligned access or will it be accusing me of lying ?

There are an infinite number of ways to contrive code that will crash
at runtime (perhaps "abusing" is a better word than "lying"). I
believe gcc's strict aliasing analysis will warn about some of these
situations.

> ...
> In C, how can I tell a compiler that trying to move 4 bytes into an
> integer variable MAY cause an alignment fault ? With variable record
> structures for instance, the location of an integer may vary and
> sometimes be un an uneven offset from start of buffer. But you won't
> know that at compile time since it would vary from record to record as
> you process a file for instance.

You need to write safe, portable code accordingly. You cannot expect
the compiler to convert an integer dereference into an equivalent
sequence of unaligned byte references. (In this respect C is not
higher level than assembly.)
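
One common way to do this by hand, not mentioned above, is to memcpy the
bytes into a properly aligned local. The following is a minimal sketch
under that assumption; the name load_u32_unaligned and its parameters
are made up for illustration. It deals with alignment only: the value
comes back in host byte order, so it does nothing about endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch 4 bytes from any offset in a byte buffer.
 * memcpy makes no alignment assumptions about its source, so this is
 * safe even when buf + offset is odd. The result is in HOST byte order. */
static uint32_t load_u32_unaligned(const unsigned char *buf, size_t len,
                                   size_t offset)
{
    uint32_t value;

    /* Refuse to read past the end of the record. */
    assert(len >= sizeof value && offset <= len - sizeof value);

    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

Because the offset is an ordinary run-time argument, this works for
variable record layouts where the position changes from record to
record.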

The compiler does guarantee that *struct and union members* will have
correct alignment for the target architecture (though padding/alignment
can often be modified by a compiler option for software ABI
compatibility). So there is a difference in the way you need to handle
"foreign" data (with a format defined outside the compilation world)
and data structures defined purely within compilation units. This
means that, following your example, the int member here can be
accessed directly:

    struct is_safe {
        char foo[6];
        // always properly aligned, but not necessarily at offset 6
        int bar;
    };
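
To see where the compiler actually put bar, offsetof can be checked
directly. This is only a sketch, assuming a C11 compiler (for
static_assert and _Alignof); FOO_LEN and padding_before_bar are names
invented here.

```c
#include <assert.h>
#include <stddef.h>

#define FOO_LEN 6  /* illustrative constant matching foo[6] above */

struct is_safe {
    char foo[FOO_LEN];
    int bar;
};

/* Compile-time check: bar sits on a boundary suitable for int. */
static_assert(offsetof(struct is_safe, bar) % _Alignof(int) == 0,
              "bar must be aligned for int");

/* Number of padding bytes the compiler inserted between foo and bar.
 * The value is implementation-defined and may be zero. */
static size_t padding_before_bar(void)
{
    return offsetof(struct is_safe, bar) - FOO_LEN;
}
```

The padding is exactly why such a struct cannot simply be overlaid on a
record whose layout was defined elsewhere.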

In the case of:
    struct foreign {
        unsigned char buffer[100];
    };
the 32-bit integer at offset 6 is best read, imho, by open coded
indexing, shift and OR, depending on its endianness (which you must of
course know).
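
A minimal sketch of that open-coded read, for both byte orders. The
function names read_le32 and read_be32 are illustrative, and uint32_t
is assumed to be available from <stdint.h>:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct foreign {
    unsigned char buffer[100];
};

/* Read a 32-bit little-endian integer starting at any byte offset.
 * Only single bytes are fetched, so alignment never matters. */
static uint32_t read_le32(const struct foreign *rec, size_t offset)
{
    const unsigned char *p;

    assert(offset <= sizeof rec->buffer - 4);
    p = rec->buffer + offset;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Same, for data stored big-endian (network byte order). */
static uint32_t read_be32(const struct foreign *rec, size_t offset)
{
    const unsigned char *p;

    assert(offset <= sizeof rec->buffer - 4);
    p = rec->buffer + offset;
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

For the record in your example the call would be read_le32(&rec, 6) or
read_be32(&rec, 6), depending on which host wrote the data. The result
is the same on any host, whatever its own byte order or alignment
rules.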

>
> BTW, does the X86-64 have similar performance handicap when trying to
> fetch an integer from unaligned memory location ?



