[Info-vax] Future comparison of optimized VSI x86 compilers vs Linux compilers
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Mon Aug 3 16:07:38 EDT 2020
On 2020-08-03 18:34:06 +0000, onewingedshark at gmail.com said:
> Here's a question: how hard would it be to simply [re]write the
> code-generator for the GEM-compilers to do VMS x86? Yes, I realize
> there's a lot of hype around LLVM, and it would be nice to get the
> optimizations, but this should be weighed against what you have now.
That would mean ignoring, and effectively abandoning, the existing
commitment to LLVM at VSI, and the very substantial LLVM work already
completed and still underway there...
To what benefit? Re-creating an actively-maintained, widely-used and
very functional code generator, plus the rest of the compiler
infrastructure, is a substantial cost (and that even if you're starting
with a sort-of-working but sort-of-stale x86-32 implementation). It
also involves re-writing other LLVM-associated tooling past the code
generation and optimization. The benefits to VSI of that substantial
effort in GEM-based compiler tooling then only accrue when the results
are sufficiently better than LLVM to matter. And the costs of keeping
up that tooling and keeping it performing competitively are ongoing.
LLVM also gets you an Arm back-end, meaning that some hypothetical
future port of VSI OpenVMS to Arm AArch64 servers just got somewhat
easier.
And LLVM gets VSI access to Clang, a current C and C++ compiler with a
pile of other capabilities. No OpenVMS-isms in Clang or Flang or such
of course (yet?), but then these and others are also much newer
compilers than those available on OpenVMS.
And LLVM is modular, meaning hunks of the tooling can be re-used and
integrated into other packages and tools. The compiler can be embedded
directly within an IDE, for instance, giving the IDE much better
insight into the language syntax and into code completion for the
developer, and allowing coding errors to be displayed dynamically
within the IDE as the source code is entered. This continuous
compilation is significantly past what the LSEDIT COMPILE/REVIEW
mechanism and related can offer, with that being one of the closest
examples available on OpenVMS. And there are other benefits.
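As a rough illustration of the compiler-as-a-library idea, here is a
minimal sketch that re-parses an editor buffer on every change and
reports diagnostics. It is an analogy only: it uses Python's standard
`ast` module rather than Clang or any LLVM component, and the
`diagnose` function and the `snapshots` list are hypothetical names
invented for this example.

```python
import ast


def diagnose(source):
    """Parse the current buffer; return a list of 'line:col: message' strings."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        # One diagnostic per parse attempt; a real front end would recover
        # from the error and keep going to report more.
        return [f"{err.lineno}:{err.offset}: {err.msg}"]
    return []


# Hypothetical editor buffer, captured at two points while the user types.
snapshots = [
    "def area(w, h):\n    return w *",       # expression still incomplete
    "def area(w, h):\n    return w * h\n",   # expression finished
]

for text in snapshots:
    print(diagnose(text) or "no diagnostics")
```

The point of the sketch is only that the editor calls the parser
in-process and gets structured results back, rather than spawning a
separate compile step and scraping its listing output.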
And there are multiple sorta-different-GEM implementations around, just
to keep this start-over-again-with-GEM idea that much more
"interesting". Which one?
> Also, since LLVM is/was Low Level Virtual Machine, is it possible to
> rewrite the code-generators so that they target said VM directly?
> (There's apparently little relation to the VM from the IR side of
> things now, according to wikipedia; and I don't really follow LLVM, so
> I'm not sure if it's applicable here.)
LLVM is a compiler infrastructure, which includes code generation
among other features, and which does not include a hypervisor.
Quoth WP: "The LLVM compiler infrastructure project is a set of
compiler and toolchain technologies, which can be used to develop a
front end for any programming language and a back end for any
instruction set architecture. LLVM is designed around a
language-independent intermediate representation (IR) that serves as a
portable, high-level assembly language that can be optimized with a
variety of transformations over multiple passes.
LLVM is written in C++ and is designed for compile-time, link-time,
run-time, and "idle-time" optimization. Originally implemented for C
and C++, the language-agnostic design of LLVM has since spawned a wide
variety of front ends: languages with compilers that use LLVM include
ActionScript, Ada, C#, Common Lisp, Crystal, CUDA, D, Delphi, Dylan,
Fortran, Graphical G Programming Language, Halide, Haskell, Java
bytecode, Julia, Kotlin, Lua, Objective-C, OpenGL Shading Language,
PostgreSQL's SQL and PLpgSQL, Ruby, Rust, Scala, Swift, and Xojo."
Again, the "virtual machine" here is in reference to the intermediate
(and portable) representation that is then used to abstract the
processing across a variety of different back-end code generators. Not
to a hypervisor.
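To make the "portable IR, many back ends" point concrete, here is a toy
sketch in Python. Nothing here is LLVM's actual IR or API: the `Instr`
type, the `lower` front end, the `fold_constants` pass and both
emitters are hypothetical, and the output is pseudo-assembly over
virtual registers, not real x86-64 or AArch64 code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str      # "arg", "const", "add" or "mul"
    dest: str    # virtual register that receives the result
    args: tuple  # operand registers, a literal, or an input name


def lower(expr, code):
    """Toy front end: flatten a nested (op, lhs, rhs) tuple into IR.

    Appends to `code` and returns the register holding the result.
    """
    if isinstance(expr, (int, str)):
        dest = f"v{len(code)}"
        kind = "const" if isinstance(expr, int) else "arg"
        code.append(Instr(kind, dest, (expr,)))
        return dest
    op, lhs, rhs = expr
    a = lower(lhs, code)
    b = lower(rhs, code)
    dest = f"v{len(code)}"
    code.append(Instr({"+": "add", "*": "mul"}[op], dest, (a, b)))
    return dest


def fold_constants(code):
    """Target-independent pass: evaluate ops whose operands are constants.

    No dead-code elimination is done, so the original constants remain.
    """
    known, out = {}, []
    for ins in code:
        if ins.op == "const":
            known[ins.dest] = ins.args[0]
        elif ins.op in ("add", "mul") and all(a in known for a in ins.args):
            x, y = (known[a] for a in ins.args)
            known[ins.dest] = x + y if ins.op == "add" else x * y
            ins = Instr("const", ins.dest, (known[ins.dest],))
        out.append(ins)
    return out


def emit_two_operand(code):
    """Hypothetical back end for a two-operand, x86-like machine."""
    lines = []
    for ins in code:
        lines.append(f"mov {ins.dest}, {ins.args[0]}")
        if ins.op in ("add", "mul"):
            lines.append(f"{ins.op} {ins.dest}, {ins.args[1]}")
    return lines


def emit_three_operand(code):
    """Hypothetical back end for a three-operand, Arm-like machine."""
    lines = []
    for ins in code:
        if ins.op in ("add", "mul"):
            lines.append(f"{ins.op} {ins.dest}, {ins.args[0]}, {ins.args[1]}")
        else:
            lines.append(f"mov {ins.dest}, {ins.args[0]}")
    return lines


ir = []
lower(("+", "x", ("*", 2, 3)), ir)   # x + 2 * 3
folded = fold_constants(ir)          # optimized once, before any back end
for backend in (emit_two_operand, emit_three_operand):
    print(backend.__name__)
    print("\n".join("  " + line for line in backend(folded)))
```

The front end and the folding pass know nothing about either target,
and each emitter knows nothing about the source language. That
separation is the sense in which the IR acts as a "machine" here.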
> Lastly, what about going with a translation-IR route:
> - Have a GEM backend that produces IR 'objects'.
> - Have these 'objects' with a 'compile' method that produces something
> appropriately low-level; say BLISS.
> - Update BLISS to be on x86.
> - Take the BLISS-output from the IR, compile for x86.
> - Done.
VSI is using a shim—called the GEM-to-LLVM converter, or some such—to
glue together the GEM-expecting legacy compiler front-ends with the
LLVM infrastructure and code generation.
Exactly nobody is going to want to use Bliss (⁉️) as an intermediate
language. That has various issues, not the least of which are the
adverse effects on source code debugging. Not everybody wants to debug
Bliss and the whatever-to-Bliss translation and optimization, nor debug
the Bliss-to-x86-64 optimization for that matter.
Re-implementing the LLVM IR using Bliss will end badly.
> PS -- Is there any chance for the GEM compilers to be released to
> open-source, or the documentation/interface(s) released so that the
> hobbyists could try implementing a direct GEM-to-x86 backend?
GEM source code release? No. For neither the first nor the last time
this will be discussed here, VSI indicates they have not acquired from
HPE the rights to open-source the HPE source code. That includes GEM.
VSI can hypothetically release their own source code, such as changes
to LLVM. If VSI chooses. But GEM is not happening without permission
from HPE.
Learning more about the topic? For x86-64 and Arm and some other
platforms, LLVM is a good starting point for writing a compiler, too.
https://github.com/banach-space/llvm-tutor
https://github.com/ghaiklor/llvm-kaleidoscope
etc...
--
Pure Personal Opinion | HoffmanLabs LLC