[Info-vax] VMS x86 performance ?

John Reagan xyzzy1959 at gmail.com
Fri Oct 30 17:26:13 EDT 2020


On Friday, October 30, 2020 at 4:43:49 PM UTC-4, IanD wrote:
> On Saturday, October 31, 2020 at 12:19:06 AM UTC+11, Michael Moroney wrote: 
> 
> <snip>
> > 
> > Still too early to tell. The compilers have optimization off and frankly, the 
> > generated code is hideous. 
> > 
> > Despite that, boots are wicked fast.
> Fair enough in one respect, but I wasn't interested in how optimised things are or aren't, more in a ballpark indication of how things look so far 
> 
> The fact that it boots wickedly fast is also an indication, is it not? 
> 
> VSI have stated they are constantly running automated tests, more than ever before apparently. These will have metrics associated with them; I highly doubt detailed performance data is not part of that capture, and I suspect they also have actual workloads being pushed onto those test systems to thrash out issues that crop up over larger and longer-running workloads 
> 
> Data is being collected, ideas formed and inferences made; they are not running blind here, not at this late stage of the port 
> 
> It's in EAK form already, shipping to people to kick the tyres, and what I'm indirectly hearing is that they could be doing so with zero idea of how it will perform? I find that rather hard to believe 
> 
> VSI engineering is great but their marketing leaves a lot to be desired, and I think a fair chunk of the success or failure of VMS in the future will be tied to marketing. History is littered with examples of the superior product failing against the better-marketed one. VMS has captive customers, but only for so long 
> 
> If there are any known performance gains, no matter how rough, I think VSI would be smart to use them, especially in places like this, as a form of indirect marketing/advertising teaser to VMS customers that the 6+ year wait is worth it 
> 
> The converse to this would of course be if performance is bad or pathetic...

To recap from when this question was asked before:

- Most of us at VSI are doing our work on virtual machines.  I run VirtualBox on my W10 machine here at home with a Skylake, 16GB of memory, and an SSD system disk.  It is a homebuilt system with a GIGABYTE system board, etc.  It is busy doing other things (browsers, email, podcasts, etc.).  What should I compare that with?

- Virtual I/O to the container files adds yet another level of unknown overhead besides RMS/XQP.    How does VBox do I/O?  Beats me.  But then again, most of my tests (so far) are not I/O intensive.

- If you want to take a SWAG at how much the LLVM optimizer will give you, hop over to your Linux box, find your favorite program, get a recent clang compiler, and do a -O0 benchmark and then an -Ofast benchmark.  Get that difference, spin around in circles, touch your nose, and then guess from that.  As Michael mentioned, the code (especially from the Macro compiler, which has to deal with Alpha registers and condition codes that aren't exactly the same on x86) can be quite ugly.  Everything is in stack temporaries.  Very little has been hoisted into registers (you don't have that many preserved registers on x86 anyway). 
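
A minimal sketch of that -O0 vs -Ofast experiment, assuming a recent clang++ on Linux (the file name bench.cpp and the loop itself are just illustrative, not anything VSI uses):

    // bench.cpp -- build the same code twice and compare wall-clock time:
    //   clang++ -std=c++11 -O0    bench.cpp -o bench_o0     && ./bench_o0
    //   clang++ -std=c++11 -Ofast bench.cpp -o bench_ofast  && ./bench_ofast
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<double> v(1000000, 1.5);
        double sum = 0.0;
        auto start = std::chrono::steady_clock::now();
        for (int pass = 0; pass < 200; ++pass)   // hot loop the optimizer can improve
            for (double x : v)
                sum += x * 1.000001;
        auto stop = std::chrono::steady_clock::now();
        long long ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
        std::printf("sum=%f  elapsed=%lld ms\n", sum, ms);  // print sum so the loop isn't eliminated
        return 0;
    }

The ratio between the two runs is only a crude hint, of course; it says nothing about RMS/XQP, virtual I/O, or the Macro-generated code discussed above.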

- The reason we didn't put the optimizer into the cross-compilers is threefold.
-- I literally couldn't get the LLVM 3.4.2 optimizer to compile with the Itanium C++ compiler.  LLVM 3.4.2 claims to be buildable with a C++03 compiler.  My gut says that while the Itanium C++ compiler claims C++03, it has bugs.  The LLVM code base LOVES templates and template specialization.  That's where all my errors were.  [That's the same reason I couldn't build clang 3.4.2.]
-- Since everybody is single-stepping through code with XDELTA and DELTA, working from MAP files and the machine code output from ANALYZE/OBJECT/DISA, and doing hex arithmetic, optimized code would have made the debugging experience even worse than it already is.  Eh?
-- The LLVM optimizer relies on metadata attached to the LLVM IR.  It is how a language frontend conveys things like pointer aliasing and parameter aliasing to the optimizer.  The mechanism used by GEM is radically different and actually requires real effort to map the concepts.  The LLVM metadata has changed formats between LLVM 3.4.2 and recent LLVMs (currently at 10.0.1).  With the above issues, it didn't seem to make much sense to write the code to convert the GEM model to the LLVM 3.4.2 model just for us to redo it for LLVM 10.  It might have been good practice, but we had other things to pound on.  I only have so many rocks.  I keep one for me to pound my head against from time to time.
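
To make the aliasing point concrete, here is a small sketch (ordinary clang/gcc C++, not GEM or VSI code) of the kind of fact a frontend has to hand to the optimizer.  clang encodes it in the IR as attributes and metadata (noalias, !tbaa, and friends); GEM records the same knowledge through its own mechanism, and that is what has to be mapped:

    // With plain pointers the compiler must assume a and b might point at the
    // same int, so *a has to be re-read after the store through b.
    int may_alias(int *a, int *b) {
        int t = *a;
        *b = 1;          // could change *a if a == b
        return t + *a;   // forced reload of *a
    }

    // With __restrict the frontend promises the pointers don't overlap, and
    // the optimizer is free to reuse t instead of reloading *a.
    int no_alias(int *__restrict a, int *__restrict b) {
        int t = *a;
        *b = 1;          // cannot touch *a
        return t + *a;
    }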

- The optimizer will show up with the native compilers, but probably not with the very first native compilers, as we have to work on the metadata converter I mentioned above.  The goal will be to match the level of detail of the clang frontend.  Anything less and you limit the LLVM optimizer. 
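
If you want to see what that "level of detail" looks like in practice, one way (standard clang, with illustrative file names) is to emit the IR for a trivial routine and read the attributes and metadata clang attaches to it:

    // detail.cpp -- inspect the generated IR with:
    //   clang++ -std=c++11 -O1 -S -emit-llvm detail.cpp -o detail.ll
    // The .ll output shows noalias on the restrict-qualified parameters and
    // !tbaa metadata on the loads and stores -- the kind of information a
    // GEM-to-LLVM metadata converter would need to reproduce.
    void scale(float *__restrict dst, const float *__restrict src, int n) {
        for (int i = 0; i < n; ++i)
            dst[i] = src[i] * 2.0f;
    }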


