[Info-vax] Emulation Performance
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Thu Jan 17 16:58:17 EST 2013
On 2013-01-17 21:11:36 +0000, David Froble said:
> First understand, I know nothing ....
>
> Consider (as far as I know) what an emulator does. We throw around the
> phrase "emulate an Alpha" or whatever. But is it really emulating an
> Alpha? Now I'm guessing, and would enjoy reading any corrections, that
> the hardware is NOT emulated. What's happening is that the
> instructions are emulated to give the same result, from the
> instructions, that the hardware would give. But what about Out of
> Order, pipelines, and some of the rather esoteric stuff done in CPUs.
> It's my guess none of this is present, so the performance from such
> features is lost even before getting into the overhead of emulating the
> instructions.
The accuracy of the emulation is up to the emulator's authors; they implement as much of it as they wish to, or need to.
For OpenVMS, so long as the emulator conforms sufficiently closely to
the Alpha system reference manual (SRM), and the memory, I/O, and
related hardware all look more or less as expected for the target Alpha
system(s) being emulated, OpenVMS will run on it. Getting there
is no small effort, though.
As for performance, yes, emulators are slow.
Anybody writing an emulator will be optimizing the snot out of the
instruction decoder, but you're still looking at a bunch of host
instructions that will be executed for each Alpha (or VAX or...)
instruction that gets decoded, plus however much code is needed to
"execute" the instruction.[1]
Emulating a superscalar[2] out-of-order design more fully (or more
correctly, or whatever you want to call it) would require multiple
cores and some sharing among them, or maybe something akin to
Itanium's instruction bundles and predication. I'd expect emulating
out-of-order and superscalar execution to be less of a benefit than
an effective JIT
<http://www.research.ibm.com/trl/projects/jit/index_e.htm>, and a whole
lot of effort to implement. With a JIT, the win is getting the code
processed in fewer host instructions than an instruction decoder would
require, but detecting the hot spots and warming up the JIT adds
overhead. TANSTAAFL.
An instruction decoder is somewhat analogous to a BASIC interpreter,
while a JIT is somewhat closer to a BASIC compiler.
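To illustrate that trade-off, here is a sketch of how a JIT'd emulator might detect hot spots: interpret a block, count its executions, and translate it to host code once it crosses a threshold. The block cache, the helper functions, and the threshold value are hypothetical stand-ins, not any real emulator's API.

#include <stdint.h>
#include <stddef.h>

#define HOT_THRESHOLD 50        /* arbitrary warm-up count, purely illustrative */

/* Hypothetical translated-block cache entry. */
typedef struct {
    uint64_t guest_pc;          /* start of the guest basic block        */
    uint32_t exec_count;        /* how often we've interpreted it        */
    void   (*native)(void);     /* host code emitted by the JIT, or NULL */
} block_entry;

/* Assumed helpers -- stand-ins for illustration only. */
extern block_entry *lookup_block(uint64_t guest_pc);
extern void interpret_block(uint64_t guest_pc);
extern void (*jit_compile(uint64_t guest_pc))(void);

/* Interpret until a block runs hot, then pay the one-time compile cost
 * and run native code thereafter.  The counter bookkeeping and the
 * compile itself are the overhead: TANSTAAFL. */
void execute_at(uint64_t guest_pc)
{
    block_entry *b = lookup_block(guest_pc);

    if (b->native) {            /* already translated: run host code */
        b->native();
        return;
    }
    if (++b->exec_count >= HOT_THRESHOLD)
        b->native = jit_compile(guest_pc);   /* hot spot: translate it */

    interpret_block(guest_pc);  /* cold (or just-compiled) path */
}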
Once you get to native instructions and native JIT'd code, you can
use the processor-level optimizations available in the host system
hardware.
Related discussions include the JVM, and approaches such as Apple's Rosetta[3].
————
[1] Such as the VAX CRC instruction
<http://h71000.www7.hp.com/doc/73final/4515/4515pro_026.html#16_cyclicredundancycheckinstruc>,
if you're emulating a VAX.
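As a rough illustration of why a single CRC instruction costs so much host code, here is a generic nibble-at-a-time, table-driven CRC loop in the spirit of the VAX instruction's 16-entry table. It is a sketch, not the actual VAX instruction semantics or any emulator's implementation.

#include <stdint.h>
#include <stddef.h>

/* One guest CRC instruction expands into an entire host loop:
 * XOR in each byte, then fold it through the table a nibble at a time. */
uint32_t emulate_crc(const uint32_t table[16], uint32_t initial,
                     const uint8_t *buf, size_t len)
{
    uint32_t crc = initial;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        crc = (crc >> 4) ^ table[crc & 0x0f];   /* low nibble  */
        crc = (crc >> 4) ^ table[crc & 0x0f];   /* high nibble */
    }
    return crc;
}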
[2] <http://en.wikipedia.org/wiki/Superscalar>, which includes a
picture of a Cray Alpha board.
[3] Rosetta was created by Transitive
<http://thenextweb.com/insider/2011/10/22/how-one-of-apples-most-important-pieces-of-software-came-from-a-small-uk-startup/>;
IBM acquired Transitive back in 2008.
--
Pure Personal Opinion | HoffmanLabs LLC