[Info-vax] Compatibility and 64-bit (was: Re: OT: news from the trenches (re: Solaris))
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Sat Mar 14 19:08:48 EDT 2015
On 2015-03-14 20:52:59 +0000, johnwallace4 at yahoo.co.uk said:
> You're aware of the differences in the code and the data structures
> between a 32-bit world and a 64-bit world, fair enough. Obviously JR is
> too.
>
> But the rest of us may still get some enlightenment from JR's and your
> explanations.
This is going to be slightly circuitous...
Here are some 64-bit discussions and examples
<http://labs.hoffmanlabs.com/node/571>
<http://labs.hoffmanlabs.com/node/787>, and there's a guide that has
since been integrated into other manuals
<http://h71000.www7.hp.com/doc/72final/6467/6467pro.html> and that
specifically covers the topic. Those discussions, and particularly the
manual, will provide a pretty good introduction.
In the specific case of descriptors, those data structures are
upward-compatible, meaning that the same API can be coded to properly
process either the older 32-bit descriptors or the 64-bit descriptors.
Programmers using BASIC, Fortran, COBOL and other descriptor-based
languages — yes, I'm going to use that phrase again — can largely
ignore the existence of descriptors at the APIs and the subroutine
calls, particularly within the language environment and — to a degree —
in the LIBRTL calls. The languages generally don't call attention to
the details of descriptors, save for cases where the programmer might
need to explicitly specify the argument-passing mechanism, of course.
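To make that upward compatibility concrete, here is a simplified sketch
of the two layouts, loosely modeled on what <descrip.h> declares. The
struct and field names here are illustrative, not the real ones; actual
code should include that header rather than redeclaring anything:

    /* Sketch of the classic 32-bit static string descriptor. */
    struct sketch_dsc32 {
        unsigned short length;    /* data length, in bytes */
        unsigned char  dtype;     /* data type code */
        unsigned char  dclass;    /* descriptor class code */
        char          *pointer;   /* 32-bit address of the data */
    };

    /* Sketch of the 64-bit descriptor. The MBO ("must be one") and
       MBMO ("must be minus one") fields sit at the offsets where the
       32-bit form keeps its length and pointer, and no valid 32-bit
       data address is all-ones, so one API can inspect those fields
       at run time and correctly process either form. */
    struct sketch_dsc64 {
        unsigned short     mbo;      /* always 1 */
        unsigned char      dtype;    /* data type code */
        unsigned char      dclass;   /* descriptor class code */
        int                mbmo;     /* always -1 */
        unsigned long long length;   /* 64-bit data length, in bytes */
        char              *pointer;  /* 64-bit address of the data */
    };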
In these cases, a future-facing approach would have the default
compiler behavior switch to 64-bit descriptors, and would force the
existing folks to specify 32-bit descriptors or otherwise override the
default.
Where descriptors tend to fall over is in C, C++, Macro32 and similar
pointer-based languages. In these cases, programmers — myself included
— have had a very nasty habit of accessing the descriptor fields
directly, which means that code written against the 32-bit layout
breaks when any of the longer 64-bit descriptors arrive. C never saw
much in the way of support for descriptors beyond some macros and
constants and data structures, and particularly very little in the way
of abstractions and tools; not all that much C code uses the
lib$analyze_sdesc call or dynamic descriptors
<http://labs.hoffmanlabs.com/node/273>, for instance.
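For contrast, here is a hedged sketch of letting the run-time library
do the field access, assuming the usual lib$analyze_sdesc argument list
of descriptor, word-sized length, and data address; a fully 64-bit
caller would reach for the _64 variant and quadword lengths instead:

    #include <descrip.h>
    #include <lib$routines.h>

    /* Pull the length and address out of a string descriptor via the
       library call, rather than reaching into the fields directly;
       the direct-access habit is what breaks when a longer descriptor
       shows up. */
    static int use_string(struct dsc$descriptor_s *dsc)
    {
        unsigned short len;
        char *addr;

        /* Odd condition values signal success on OpenVMS. */
        if (!(lib$analyze_sdesc(dsc, &len, &addr) & 1))
            return -1;
        /* ... operate on addr[0] through addr[len - 1] here ... */
        return 0;
    }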
Now, descriptors aren't all that central to updating the C compiler to
deal better with 64-bit native code, such as that MAYLOSEDATA3 case.
But it would be nice to have better abstractions for descriptors and
such, and to have a path and a plan that would allow VSI and OpenVMS
and the partners and customers actively changing and migrating their
applications to get out of at least some of the current messes.
Migrate to these macros and these APIs, and your code will survive when
we rip out and replace the problem APIs, for instance. (How long have
we gone without a wildcard sys$getuai? And usernames are insanely
still limited to a character set of up to 12
alphanumerics-n-underscores, as Brian Schenkenberg[er],
Jean-Fran[ç]ois, folks with a
"St." or an "O'" in their names, and others have long endured, and
that's before considering 반기문 and other folks. But I digress.)
As for the sorts of compromises that arise with designs within OpenVMS,
the use of P2 and S2 space was both useful and expedient, and it also
added some ugly. Virtual address space is not flat in OpenVMS: there
are two 30-bit process ranges, P0 and P1, plus a 64-bit (technically
61-bit?) P2 process range, and two 30-bit system ranges, S0 and S1,
plus a 64-bit (61-bit?) S2 system range. This ties back to being able
to continue to run 32-bit and 64-bit code, and to not forcing changes
on the 32-bit code, while forcing gymnastics and changes and hackery
into the 64-bit code.
This worked around the process and address space limits of VAX, but at
the expense of — for instance — not having honking big images, and of
having to recode applications to move stuff from S0 or S1 space out
into S2 space rather than — as happened in some other cases — just
recompiling the code. It also means dealing with the cases where — as
is now happening with the C code that was triggering the MAYLOSEDATA3
diagnostic — you can't stuff a 64-bit value into an existing 32-bit
data structure or an existing 32-bit API, or you might just say hello
to my little friend <http://labs.hoffmanlabs.com/node/800>.
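As a sketch of what those gymnastics look like from C, assuming
compilation with the /POINTER_SIZE=64 qualifier (which defines
__INITIAL_POINTER_SIZE and enables the _malloc32 and _malloc64
variants; #pragma pointer_size handles mixed declarations):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
    #if defined(__INITIAL_POINTER_SIZE) && __INITIAL_POINTER_SIZE == 64
        /* May land in P2 space; the address need not fit into 32
           bits, so it cannot be handed to an unconverted 32-bit API. */
        char *big = _malloc64(1024);

        /* Forced into the 32-bit-addressable P0 space; safe to pass
           to older APIs that store addresses in longwords. */
        char *small = _malloc32(1024);

        printf("64-bit-capable: %p, 32-bit-safe: %p\n",
               (void *) big, (void *) small);
        free(big);
        free(small);
    #endif
        return 0;
    }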
Making assumptions about the size of a long or an int is a longstanding
portability issue in C code, too. Assuming that a long is always a
32-bit value, for instance, technically isn't correct C. More recent
programs would likely use uint32_t or int32_t for this, per C99
<http://en.wikipedia.org/wiki/C_data_types#stdint.h>. (Note the irony
of pairing a sixteen-year-old C language standard with "recent".) Even
BASIC, Fortran and COBOL can see some of this mess, where the code
assumes the size of some hypothetical API context argument is 32 bits,
and now the 64-bit API needs to store a 64-bit value there. (Hence
those pesky sys$mumble64 calls, et al.)
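A minimal illustration of the fixed-width alternative, which makes the
intent explicit instead of guessing what a long means on today's
platform:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t flags  = 0xDEADBEEFu;  /* exactly 32 bits, everywhere */
        int64_t  offset = -1;           /* exactly 64 bits, everywhere */
        long     l      = 0;            /* 32 or 64 bits: platform's choice */

        printf("flags=%" PRIu32 " offset=%" PRId64 " sizeof(long)=%zu\n",
               flags, offset, sizeof l);
        return 0;
    }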
Now, moving forward, I'd tend to look to C to be native 64-bit,
preferably with most of C11, and to increasingly relegate parts of the
OpenVMS environment and APIs to, well, legacy code, though those should
be 64-bit APIs. Those APIs are very likely stuck with the sys$mumble64
calls for reasons of compatibility, but dropping back to 32-bit calls
and norms just doesn't make sense for new C or C++ work, nor for
substantial updates to existing work. Go big, or go home.
As I've been mentioning for a while, there are always costs to
compatibility, and there are always compromises in any operating
system. The team working on the operating system can optimize for the
short term or the long term, for compatibility with old code or new,
and for keeping APIs around even though they're known to be trouble.
There are many trade-offs here. For the 64-bit work, a short-term and
very effective trade-off was made to the advantage of the existing
source code, but it means that current and future programmers now have
to special-case a whole lot of code, such as migrating to the various
sys$mumble64 calls, rather than just having 64-bit addresses and data
natively, as is now common on other platforms.
There's a cost to this compatibility, both in terms of what can be
changed, and in how much effort might be needed to implement wrappers
or jackets that allow the old code to continue to work. Then there are
the cases such as the SYSUAF morass, where code that directly accesses
that file will break should VSI decide to wade in and start fixing
security weaknesses. The question then becomes: if the team working on
the operating system is going to break code, do they break it minimally
to get to their immediate goals — a cryptographically more secure hash
— or does the team go big, break more things, and — for instance —
integrate LDAP and the rest, and move everybody forward much more
substantially?
My concerns are with the longstanding practice of loading the
compatibility hackery onto the new code paths rather than into the old
ones — mortgaging the future for better compatibility with the past —
and with the cases where fixes and security updates and application
improvements are deferred because they can or will break source or
run-time compatibility.
That a VAX-11/VMS V1.0 application executable would run unmodified on
OpenVMS VAX V7.3 was a testament to the software compatibility
engineering. But is that the right trade-off in this era? Make VMS
into a premier operating system, not something that turns into a
steaming pile of hackery whenever you try something modern. Folks with
existing source code assuredly won't like that in the short term, but
they'll like having to port to another platform even less, and — if the
right stuff is chucked and replaced — they'll come around to the
improvements.
Is any of this going to be fun? Easy? Quick? Source-compatible? Not
a chance.
But then VSI also has to pay the bills (and I and most others here in
the comp.os.vms newsgroup are not going to buy a ton of OpenVMS
licenses), and that might just mean piling on the classic source
compatibility, and, well, continuing to make a dog's breakfast of any
new code.
--
Pure Personal Opinion | HoffmanLabs LLC