[Info-vax] Tales From The Port
John Reagan
xyzzy1959 at gmail.com
Wed Apr 17 04:04:04 EDT 2019
On Tuesday, April 16, 2019 at 6:49:47 PM UTC+2, geze... at rlgsc.com wrote:
> On Tuesday, April 16, 2019 at 4:30:16 AM UTC-4, John Reagan wrote:
> > While we're waiting on Clair's next boot update, I'll share our latest Macro-32 bug so you can see some of the day-to-day bugs we're dealing with.
> >
> > [This is a cut-down bug from inside of DCL]
> >
> > For the following Macro-32 code,
> >
> > .psect $data$,rd,nowrt,noexe
> > one: .ascii /A/<0>
> > .ascii /Z/
> >
> > two: .ascii /A/
> > .byte 0
> > .ascii /Z/
> >
> > three: .ascii /A/
> > .blkb 1
> > .ascii /Z/
> >
> > .end
> >
> > The $DATA$ psect should be 9 bytes long (as seen on Itanium):
> >
> > SECTION DATA 4. (0004) SHDR$K_SHT_PROGBITS 0000000000000009 (9.) bytes
> > "$DATA$"
> >
> > 5A 00415A00 415A0041 A.ZA.ZA.Z
> >
> > but on x86, it is only 8 bytes. That <0> syntax was lost by the compiler.
> >
> > SECTION DATA 4. (0004) SHDR$K_SHT_PROGBITS 0000000000000008 (8.) bytes
> > "$DATA$"
> >
> > 5A00415A 00415A41 AZA.ZA.Z
> >
> >
> > The GEM interface lets the Macro compiler say "Put an A at $DATA$+0, Put a Z at $DATA$+2". The order isn't important either. The abstraction is that you are putting values into the PSECT and the unspecified holes end up being zeros. The LLVM interface is more of a "call it like you are writing out an assembly source file" model. The compiler has to buffer up the initializers and then call LLVM in the right order and with additional calls to insert/skip holes. The internal representation for the <0> surprised us and we didn't skip/initialize the byte.
> >
> > This malformed data (imagine if this was some sort of NUL-terminated string) made DCL unhappy and DCL is a pretty grumpy piece of code to start with.
> >
> >
> > John
>
> John,
>
> Ah, the joy of "the devil is in the details". I remember more than a few of those when converting from one interface to another. Sometimes, things are described in similar ways, but are in actuality quite different.
>
> I would think that items of this type are fairly common in more than a few sections of the MACRO-32 source, particularly within macros.
>
> I remember many occasions where I encountered code along the lines of:
>
> XYZ:
> .REPEAT 256
> .WORD 0
> .ENDR
>
> $$$ = .
> . = XYZ+nn
> .WORD ccc
> . = $$$
>
> The idea being to set up a table with default actions, and then fill in specific entries.
>
> Most system call macros on OpenVMS generate stack-based parameter lists, but I would not be surprised to find constructs like the above in macros that generate control blocks.
>
> Such "back patching" was a common coding trick in many assembler languages, not just within Digital.
>
> - Bob Gezelter, http://www.rlgsc.com
Yes, "moving the dot" was something we implemented at the very beginning. That style is [ab]used extensively in the source code (with the SYSTEM_DATA_CELLS.MAR file being the winner of that category for those of you keeping score at home).
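For anyone who hasn't seen the idiom, here is a minimal Python sketch of what "moving the dot" amounts to. It is only a toy model, not how the Macro compiler does it: the Psect class, build_table, and the sample nn/ccc values are all made up for illustration.

```python
class Psect:
    """Toy model of a psect: a byte image plus a movable location counter."""

    def __init__(self):
        self.image = bytearray()
        self.dot = 0  # the "." location counter

    def emit(self, data: bytes) -> None:
        """Store data at the current dot, zero-filling any hole, then advance."""
        end = self.dot + len(data)
        if end > len(self.image):
            self.image.extend(bytes(end - len(self.image)))
        self.image[self.dot:end] = data
        self.dot = end


def word(value: int) -> bytes:
    """A 16-bit little-endian .WORD."""
    return value.to_bytes(2, "little")


def build_table(nn: int, ccc: int) -> Psect:
    """Mimic: 256 default words, then back-patch one word at XYZ+nn."""
    p = Psect()
    xyz = p.dot
    for _ in range(256):      # .REPEAT 256 / .WORD 0 / .ENDR
        p.emit(word(0))
    saved = p.dot             # $$$ = .
    p.dot = xyz + nn          # . = XYZ+nn
    p.emit(word(ccc))         # .WORD ccc
    p.dot = saved             # . = $$$
    return p


table = build_table(nn=10, ccc=0x1234)  # hypothetical values
```

The table stays 512 bytes long and only the two bytes at offset 10 change, which is the whole point of restoring the dot afterwards.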
The macro intermediate representation for that <0> character is unusual. The parser broke the string into multiple pieces, the stuff before the NUL and the stuff after the NUL, but in a different form than the explicit ".blkb 1" in the middle. It was how it bumped the location internally that tricked us. There was no test for it (other than trying to execute DCL).
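To make the "buffer up the initializers, then emit in order with holes" idea concrete, here is a rough Python sketch. It is an illustration under assumptions, not the actual compiler or LLVM code: emit_in_order, assemble, and the (offset, bytes) record format are invented here, and overlapping initializers are simply rejected. The point is that if holes are derived from the recorded offsets, it doesn't matter which internal form bumped the location.

```python
def emit_in_order(initializers, size):
    """Turn unordered (offset, bytes) records into sequential directives.

    Any gap between the running cursor and the next offset becomes an
    explicit (".zero", n) directive, so unspecified holes end up as zeros.
    """
    directives = []
    cursor = 0
    for offset, data in sorted(initializers):
        if offset < cursor:
            raise ValueError("overlapping initializers are not handled here")
        if offset > cursor:
            directives.append((".zero", offset - cursor))
        directives.extend((".byte", b) for b in data)
        cursor = offset + len(data)
    if size > cursor:
        directives.append((".zero", size - cursor))  # trailing hole
    return directives


def assemble(directives):
    """Expand the directives into the bytes an assembler would produce."""
    out = bytearray()
    for op, arg in directives:
        if op == ".zero":
            out.extend(bytes(arg))
        else:
            out.append(arg)
    return bytes(out)


# The example psect, recorded GEM-style and deliberately out of order.
# Offsets 1 and 7 are holes: the <0> after "one" and the .blkb 1 in "three".
inits = [(8, b"Z"), (0, b"A"), (2, b"Z"), (3, b"A"),
         (4, b"\x00"), (5, b"Z"), (6, b"A")]
image = assemble(emit_in_order(inits, size=9))
```

With the hole at offset 1 emitted, the image is the 9-byte A.ZA.ZA.Z seen on Itanium. Drop that one hole and you get the 8-byte version.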
And for those who like to read native assembly, here's what the bad output looks like from my example.
.section "$DATA$","a",@progbits
.type ONE,@object
ONE:
.byte 65
.byte 90
.type TWO,@object
TWO:
.byte 65
.byte 0
.byte 90
.type THREE,@object
THREE:
.byte 65
.zero 1
.byte 90