[Info-vax] Compatibility and 64-bit (was: Re: OT: news from the trenches (re: Solaris))
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Sat Mar 14 19:08:48 EDT 2015
On 2015-03-14 20:52:59 +0000, johnwallace4 at yahoo.co.uk said:
> You're aware of the differences in the code and the data structures
> between a 32-bit world and a 64-bit world, fair enough. Obviously JR is
> too.
>
> But the rest of us may still get some enlightenment from JR's and your
> explanations.
This is going to be slightly circuitous...
Here are some 64-bit discussions and examples
<http://labs.hoffmanlabs.com/node/571>
<http://labs.hoffmanlabs.com/node/787>, and there's a guide that has
since been integrated into other manuals
<http://h71000.www7.hp.com/doc/72final/6467/6467pro.html> and that
specifically covers the topic. Those discussions, and particularly the
manual, will provide a pretty good introduction.
In the specific case of descriptors, those data structures are
upward-compatible, meaning that the same API can be coded to properly
process either the older 32-bit descriptors or the 64-bit descriptors.
Programmers using BASIC, Fortran, COBOL and other descriptor-based
languages — yes, I'm going to use that phrase again — can largely
ignore the existence of descriptors at the APIs and the subroutine
calls, particularly within the language environment and — to a degree —
in the LIBRTL calls. The languages generally don't call attention to
the details of descriptors, save for cases where the programmer might
need to explicitly specify the argument-passing mechanism, of course.
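To make that upward compatibility concrete, here is a simplified sketch
of the two layouts, loosely modeled on what <descrip.h> declares. The
struct and field names here are illustrative, not the real ones; actual
code should include that header rather than redeclaring anything:

    /* Sketch of the classic 32-bit static string descriptor. */
    struct sketch_dsc32 {
        unsigned short length;    /* data length, in bytes */
        unsigned char  dtype;     /* data type code */
        unsigned char  dclass;    /* descriptor class code */
        char          *pointer;   /* 32-bit address of the data */
    };

    /* Sketch of the 64-bit descriptor. The MBO ("must be one") and
       MBMO ("must be minus one") fields sit at the offsets where the
       32-bit form keeps its length and pointer, and no valid 32-bit
       data address is all-ones, so one API can inspect those fields
       at run time and correctly process either form. */
    struct sketch_dsc64 {
        unsigned short     mbo;      /* always 1 */
        unsigned char      dtype;    /* data type code */
        unsigned char      dclass;   /* descriptor class code */
        int                mbmo;     /* always -1 */
        unsigned long long length;   /* 64-bit data length, in bytes */
        char              *pointer;  /* 64-bit address of the data */
    };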
In these cases, a future-facing approach would have the default
compiler behavior switch to 64-bit descriptors, and would force the
existing folks to specify 32-bit descriptors or otherwise override the
default.
Where descriptors tend to fall over is in C, C++, Macro32 and similar
pointer-based languages. In these cases, programmers — myself included
— have had a very nasty habit of accessing the descriptor fields
directly, which means that code written against the 32-bit layout
breaks when any of the longer 64-bit descriptors arrive. C never saw
much in the way of support for descriptors beyond some macros and
constants and data structures, and particularly very little in the way
of abstractions and tools; not all that much C code uses the
lib$analyze_sdesc call or dynamic descriptors
<http://labs.hoffmanlabs.com/node/273>, for instance.
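For contrast, here is a hedged sketch of letting the run-time library
do the field access, assuming the usual lib$analyze_sdesc argument list
of descriptor, word-sized length, and data address; a fully 64-bit
caller would reach for the _64 variant and quadword lengths instead:

    #include <descrip.h>
    #include <lib$routines.h>

    /* Pull the length and address out of a string descriptor via the
       library call, rather than reaching into the fields directly;
       the direct-access habit is what breaks when a longer descriptor
       shows up. */
    static int use_string(struct dsc$descriptor_s *dsc)
    {
        unsigned short len;
        char *addr;

        /* Odd condition values signal success on OpenVMS. */
        if (!(lib$analyze_sdesc(dsc, &len, &addr) & 1))
            return -1;
        /* ... operate on addr[0] through addr[len - 1] here ... */
        return 0;
    }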
Now, descriptors aren't all that central to updating the C compiler to
deal better with 64-bit native code, such as that MAYLOSEDATA3 case.
But it would be nice to have better abstractions for descriptors and
such, and to have a path and a plan that would allow VSI and OpenVMS
and the partners and customers actively changing and migrating their
applications to get out of at least some of the current messes.
Migrate to these macros and these APIs, and your code will survive when
we rip out and replace the problem APIs, for instance. (How long have
we gone without a wildcard sys$getuai? And usernames are insanely
still limited to a character set of up to 12
alphanumerics-n-underscores, as Brian Schenkenberg[er],
Jean-Fran[ç]ois, folks with a
"St." or an "O'" in their names, and others have long endured, and
that's before considering 반기문 and other folks. But I digress.)
As for the sorts of compromises that arise with designs within OpenVMS,
the use of P2 and S2 space was both useful and expedient, and it also
added some ugly. Virtual address space is not flat in OpenVMS: there
are two 30-bit process ranges, P0 and P1, plus a 64-bit (technically
61-bit?) P2 process range, and two 30-bit system ranges, S0 and S1,
plus a 64-bit (61-bit?) S2 system range. This ties back to being able
to continue to run 32-bit and 64-bit code, and to not forcing changes
on the 32-bit code, while forcing gymnastics and changes and hackery
into the 64-bit code.
This worked around the process and address space limits of VAX, but at
the expense of — for instance — not having honking big images, and of
having to recode applications to move stuff from S0 or S1 space out
into S2 space rather than — as happened in some other cases — just
recompiling the code. It also means dealing with the cases where — as
is now happening with the C code that was triggering the MAYLOSEDATA3
diagnostic — you can't stuff a 64-bit value into an existing 32-bit
data structure or an existing 32-bit API, or you might just say hello
to my little friend <http://labs.hoffmanlabs.com/node/800>.
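As a sketch of what those gymnastics look like from C, assuming
compilation with the /POINTER_SIZE=64 qualifier (which defines
__INITIAL_POINTER_SIZE and enables the _malloc32 and _malloc64
variants; #pragma pointer_size handles mixed declarations):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
    #if defined(__INITIAL_POINTER_SIZE) && __INITIAL_POINTER_SIZE == 64
        /* May land in P2 space; the address need not fit into 32
           bits, so it cannot be handed to an unconverted 32-bit API. */
        char *big = _malloc64(1024);

        /* Forced into the 32-bit-addressable P0 space; safe to pass
           to older APIs that store addresses in longwords. */
        char *small = _malloc32(1024);

        printf("64-bit-capable: %p, 32-bit-safe: %p\n",
               (void *) big, (void *) small);
        free(big);
        free(small);
    #endif
        return 0;
    }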
Making assumptions about the size of a long or an int is a longstanding
portability issue in C code, too. Assuming that a long is always a
32-bit value, for instance, technically isn't correct C. More recent
programs would likely use uint32_t or int32_t for this, per C99
<http://en.wikipedia.org/wiki/C_data_types#stdint.h>. (Note the irony
of pairing a sixteen-year-old C language standard with "recent".) Even
BASIC, Fortran and COBOL can see some of this mess, where the code
assumes the size of some hypothetical API context argument is 32 bits,
and now the 64-bit API needs to store a 64-bit value there. (Hence
those pesky sys$mumble64 calls, et al.)
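A minimal illustration of the fixed-width alternative, which makes the
intent explicit instead of guessing what a long means on today's
platform:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t flags  = 0xDEADBEEFu;  /* exactly 32 bits, everywhere */
        int64_t  offset = -1;           /* exactly 64 bits, everywhere */
        long     l      = 0;            /* 32 or 64 bits: platform's choice */

        printf("flags=%" PRIu32 " offset=%" PRId64 " sizeof(long)=%zu\n",
               flags, offset, sizeof l);
        return 0;
    }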
Now, moving forward, I'd tend to look to C to be native 64-bit,
preferably with most of C11, and to increasingly relegate parts of the
OpenVMS environment and APIs to, well, legacy code, though those should
be 64-bit APIs. Those APIs are very likely stuck with the sys$mumble64
calls for reasons of compatibility, but dropping back to 32-bit calls
and norms just doesn't make sense for new C or C++ work, nor for
substantial updates to existing work. Go big, or go home.
As I've been mentioning for a while, there are always costs to
compatibility, and there are always compromises in any operating
system. The team working on the operating system can optimize for the
short term or the long term, for compatibility with old code or new,
and for keeping APIs around even though they're known to be trouble.
There are many trade-offs here. For the 64-bit work, a short-term and
very effective trade-off was made to the advantage of the existing
source code, but it means that current and future programmers now have
to special-case a whole lot of code, such as migrating to the various
sys$mumble64 calls, rather than just having 64-bit addresses and data
natively, as is now common on other platforms.
There's a cost to this compatibility, both in terms of what can be
changed, and in how much effort might be needed to implement wrappers
or jackets that allow the old code to continue to work. Then there are
the cases such as the SYSUAF morass, where code that directly accesses
that file will break should VSI decide to wade in and start fixing
security weaknesses. The question then becomes: if the team working on
the operating system is going to break code, do they break it minimally
to get to their immediate goals — a cryptographically more secure hash
— or does the team go big, break more things, and — for instance —
integrate LDAP and the rest, and move everybody forward much more
substantially?
My concerns are with the longstanding practice of loading the
compatibility hackery onto the new code paths rather than into the old
ones — mortgaging the future for better compatibility with the past —
and with the cases where fixes and security updates and application
improvements are deferred because they can or will break source or
run-time compatibility.
That a VAX-11/VMS V1.0 application executable would run unmodified on
OpenVMS VAX V7.3 was a testament to the software compatibility
engineering. But is that the right trade-off in this era? Make VMS
into a premier operating system, not something that turns into a
steaming pile of hackery whenever you try something modern. Folks with
existing source code assuredly won't like that in the short term, but
they'll like having to port to another platform even less, and — if the
right stuff is chucked and replaced — they'll come around to the
improvements.
Is any of this going to be fun? Easy? Quick? Source-compatible? Not
a chance.
But then VSI also has to pay the bills (and I and most others here in
the comp.os.vms newsgroup are not going to buy a ton of OpenVMS
licenses), and that might just mean piling on the classic source
compatibility, and, well, continuing to make a dog's breakfast of any
new code.
--
Pure Personal Opinion | HoffmanLabs LLC