[Info-vax] VAX Macro to C conversion

Sat Jul 13 00:30:10 EDT 2019

On Monday, June 24, 2019 at 1:48:07 AM UTC-4, Andrew Shaw wrote:

> 
> > I suspect John Reagan will be by with more insightful (and likely inciteful)
> > questions.
> 
> Bring it on JR !

You rang?  [If you mention my name three times, I have to appear.]

I think you are asking two different questions:

#1) I'm looking for a conversion tool due to possible performance issues and programmer knowledge issues.  Have you heard of XTRAN?

#2) Does using VAX Macro-32 limit performance versus coding in C (or any other language)?

#1) Conversion and XTRAN

Yes, I know XTRAN and Stephen Heffner.  The compiler group talked with Stephen years ago about a tool to convert stuff to C.  I think the interest at the time was mainly with BLISS to C, but we also discussed Macro-32 to C and Pascal to C.  XTRAN at the time only had limited success on the examples we provided.  For Macro-32, the problem areas were routines that jumped into each other and trying to come up with some reasonable symbolization.  For cross-jumpers, the generated C had routines with argument lists of (R0, R1, R2, R3, etc.).  Naming register-based variables was tricky since the code might have used R0 for a "length", then a "pointer", then a "length" again.  In a well-written C program <insert joke here>, you would use three variables and let the compiler optimize it.  I'm sure that XTRAN has matured since then, so you should confirm current behavior with them.
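To make the register-naming problem concrete, here is a small hypothetical sketch in C. It is not XTRAN output; the routine names and the space-counting task are invented purely for illustration. The first function reuses a single `r0` as a length, then a pointer, then a count, the way hand-written Macro-32 might recycle R0. The second uses one named variable per role and leaves register allocation to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register-style translation (hypothetical): r0 changes meaning three
 * times, so no single descriptive name fits it. */
size_t count_spaces_regstyle(const char *buf, size_t len)
{
    uintptr_t r0, r1, r2;

    assert(buf != NULL);
    r0 = len;                 /* R0 holds a length */
    r1 = r0;                  /* copy it to R1 as the remaining count */
    r0 = (uintptr_t)buf;      /* R0 now holds a pointer */
    r2 = 0;                   /* R2 accumulates the result */
    while (r1 != 0) {
        if (*(const char *)r0 == ' ')
            r2++;
        r0++;                 /* advance the "pointer" by one byte */
        r1--;
    }
    r0 = r2;                  /* R0 holds a count again for the return */
    return (size_t)r0;
}

/* Idiomatic version: one variable per role; the compiler decides which
 * values share a machine register. */
size_t count_spaces(const char *buf, size_t len)
{
    const char *cursor = buf;
    size_t remaining = len;
    size_t spaces = 0;

    assert(buf != NULL);
    while (remaining != 0) {
        if (*cursor == ' ')
            spaces++;
        cursor++;
        remaining--;
    }
    return spaces;
}
```

Both functions compute the same result; the difference is only in how much a later maintainer can read off the variable names.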

You'll have to decide whether you have well-written Macro-32 code (XTRAN tries to propagate comments) and can spend the time to "clean up" the generated C code, OR whether you have horribly commented Macro-32 code with an obscure algorithm and will need to spend lots more time figuring it all out and working on the C code.

#2) What does Macro-32 cost me?

For straight-line code, the Macro compilers (AMACRO, IMACRO, and XMACRO) all generate reasonably dense code.  We even spend effort to leverage the x86-64 condition code flags and indexed addressing modes.

On Alpha and Itanium, the compiler leverages GEM's instruction peepholer and scheduler.  Given the internal design of LLVM at the "assembler interface", we are past LLVM's scheduler.  However, I'd claim that current x86-64 chips have such huge out-of-order windows (Ice Lake can have 352 in flight, with 128 loads in flight and 72 stores in flight) that instruction-level ordering isn't as important anymore.  The chips seem to do it for us (ignoring all the side-channel Spectre-like data leakage).  There 

On x86, we'll also be using memory locations to hold the Alpha register state, but fast L1/L2 cache comes to the rescue.  Even if you rewrote into C, LLVM will end up with some variables on the stack.  Heck, in 32-bit mode, you only have 8 registers available (and some have predefined behavior), so 32-bit Linux and 32-bit Windows systems have been relying on fast cache for stack-based variables forever.

What you really don't get for Macro-32 is higher-level optimizations such as loop unrolling, hoisting expressions out of loops (Macro will attempt to hoist some address computations), and routine inline expansion (Itanium has big call overhead, Alpha and x86 a small call overhead).  Plus Macro is horrible if you are using floating point that originated on VAX.  Modern clang/LLVM can do a good job at using parallel vector instructions, whereas XMACRO will not.
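As a rough illustration of what "hoisting expressions out of loops" means, here is a minimal C sketch. The function names and the arithmetic are invented for the example. The first routine recomputes `scale * bias` on every iteration as written; the second shows the hand-hoisted form. An optimizing C compiler will typically turn the first into something like the second on its own (and may go on to unroll or vectorize the loop), though whether it does depends on the compiler, the optimization level, and the surrounding code.

```c
#include <assert.h>
#include <stddef.h>

/* As written: the loop-invariant product appears inside the loop. */
void add_offset_naive(int *dst, const int *src, size_t n, int scale, int bias)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + scale * bias;
}

/* Hand-hoisted: the invariant product is computed once before the loop.
 * This is the transformation a compiler's loop-invariant code motion
 * pass would normally apply to the version above. */
void add_offset_hoisted(int *dst, const int *src, size_t n, int scale, int bias)
{
    const int offset = scale * bias;

    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + offset;
}
```

In C you can usually write the first form for clarity and let the compiler produce the second; per the paragraph above, the Macro-32 compilers do not apply this kind of optimization in general, apart from some address computations.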

So we think that Macro-32 on x86-64 will be just as "reasonable" as it was in the past.  There will be added memory loads/stores to manage the Alpha register vector (we are looking at some techniques to save repeated loads but deferring stores has implications for unwind and debug information) but looking back, it could have been worse. :)