[Info-vax] Reloading device drivers on x86-64 VMS

Sun Mar 8 09:17:23 EDT 2015

On 2015-03-08 11:37:28 +0000, Simon Clubley said:

> On 2015-03-07, Bob Gezelter <gezelter at rlgsc.com> wrote:
>> 
>> I agree with Simon and Paul, with the limitation that the capability is not
>> generic or guaranteed, as noted by Hoff.
>> 
>> In doing extensive driver-level work on RSX, the ability to quiesce and
>> reload a driver without a reboot was vital to achieving progress. It always had
>> the limitation that you needed a quiescent device, the data structures were
>> compatible, and there was always the degree of risk that a crash could ensue.
>> That said, it was extremely helpful at speeding progress.
>> 
> 
> On Linux it's slightly different.
> ...

The initial work in the port is inherently going to involve drivers 
that are rather sketchy, and will quite possibly depend initially on 
some VM-based drivers to avoid dealing with hardware.  The initial work 
in the port also has the problem that the environment (beyond the 
generic limitations around the boot-time driver environment) is very 
limited.  
Put another way, there's a whole lot of scaffolding inherent in the 
port, so I wouldn't expect any great effort to add reloadable drivers 
within the context of the initial port.

Further along in the porting sequence and once the core device drivers 
are more-or-less working, forward progress — even with crashes — can 
involve multiple boxes and multiple VM guests.   Earlier OpenVMS ports 
didn't have all that much hardware available, for various good 
reasons, and those ports used emulation (Alpha) or what were then 
third-party boxes (Itanium), plus cross-tools, to allow a whole lot of 
the work to be conducted in a more stable environment.   Did I mention 
a whole lot of scaffolding?  In this more recent era, x86-64 boxes are 
much cheaper and far more ubiquitous, and cheap virtual machines are 
widely available for, and stable on, the target architecture.  Put 
another way, having a dozen "boxes" on your desktop 
is No Big Deal in this era.   What does this mean for the port?  You 
can run a whole herd of test servers.  Crash one, move on to the next 
server in the pool while the earlier boxes are rebooting.
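
A rough sketch of that rotation, in Python rather than anything 
VMS-specific.  The guest names and the run_test callable are 
hypothetical placeholders for however a test build actually gets 
pushed to a guest; the only point is that a guest goes to the back of 
the queue after each run, so a crashed one has the rest of the pool's 
turns in which to reboot.

```python
from collections import deque


def run_on_pool(guests, tests, run_test):
    """Run each test on the next guest in a round-robin pool.

    run_test(guest, test) is a caller-supplied (hypothetical) callable
    that returns True if the guest survived and False if it crashed.
    Every guest goes to the back of the queue after its turn, so a
    crashed guest is not reused until the others have had theirs.
    """
    pool = deque(guests)
    results = []
    for test in tests:
        guest = pool.popleft()          # next guest in line
        survived = run_test(guest, test)
        results.append((test, guest, survived))
        pool.append(guest)              # rebooting or not, back of the line
    return results
```

With a dozen guests and tests that each take about as long as a 
reboot, a crash costs close to nothing in wall-clock time.  A real 
harness would also want to check that a guest is actually back up 
before reusing it; that check is left out here.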

But yes, it'd be very nice to have the ability to unload and reload drivers.

The differences in the current environment also tie back into some 
weaknesses and omissions in OpenVMS.   With OpenVMS, you actually have 
to name the box when configuring the networking, and that's something 
you'd expect with modern desktops and laptops.   But not with modern 
servers, as those will usually name themselves (something VMS doesn't 
do) and acquire a DHCP address (something that VMS isn't all that 
good at), or be named and addressed en masse using remote management 
software.   Yes, the OpenVMS distros can half-bake a DHCP configuration 
and connect, but the standard install (which is preferably the same 
as the FIS kit, as mentioned earlier) and the rest could really use the 
capability to install and boot and allow a remote administrator or 
remote software to finish the install.   OpenVMS just isn't that good 
at being part of a herd, either; of being an 
individually-inconsequential box that can be reloaded and restarted 
either individually or en masse.  There's no support for 
ZeroConf/Bonjour, nor for what HP is using for their Helion[1] / 
OpenStack Neutron / OpenFlow management.

Having the ability to unload and reload drivers presages the ability to 
dynamically change other parts of the kernel, too.  Which leads to 
discussions of — yes, and "on Linux it's slightly different" here, too 
— Ksplice-style hot-patching; to the ability to upgrade at least some 
executive components without rebooting.  That'd remove some of my 
grumbling about uptime being a measure of server insecurity and risk, 
too.  But again, with a gazillion cheap boxes and applications — 
OpenVMS clusters — that can and should operate across multiple boxes, 
rebooting is easy and only getting easier, so is unloading and 
reloading drivers or execlets something that customers can and will 
have?  Are we solving last-decade problems when OpenVMS really needs to 
look at next-decade requirements?

This is not to imply that I wouldn't like these unload and hot-patch 
capabilities, but with hopefully-sane license prices and hopefully-sane 
cluster license prices, having herds of OpenVMS boxes is feasible, can 
and should get easier to deal with, and this then leads to some very 
different application designs going forward, and existing applications 
that (slowly) migrate into these newer computing environments.  How 
many OpenVMS project proposals got nixed over the server prices and 
particularly the cluster and HBVS and MCOE license prices, and ended up 
half-baking some compromises into their design or their 
disaster-tolerance and recovery plans, or punting on OpenVMS entirely 
and using Linux, after all?

Remember too that aiming future OS work at what we have now is a 
strategy that will generally fail for VSI.  They have to aim for what 
are the then-current features and capabilities when they can deliver 
the changes.  Not what is available now.  The whole computing market 
moves forward, and engineering any new hardware or software product 
release takes time, even if VSI moves toward continuous deployment[2] 
to get the changes out to their customers more quickly.  VSI certainly 
knows this — though competitive operating system development is the 
sort of project where you can definitely plow through a near-unlimited 
budget, even presuming you could manage and test all of what a 
large budget would allow...  In short, how different will computing and 
servers be in ~2018, or whenever it is that we see the x86-64 port?  
Mobile computing completely exploded in ~six years, and Microsoft went 
from being centrally relevant to client computing, to being a small 
percentage of all client devices, after all.

————
[1] The HP SDS and SDN efforts are just the very tip of software-based 
configuration and management 
<http://h30507.www3.hp.com/t5/Storage-Insiders/Is-software-defined-storage-right-for-you/ba-p/169516> 
<http://h17007.www1.hp.com/us/en/networking/solutions/technology/sdn/index.aspx> 

[2] Semi-unrelated: HP continuous deployment 
<http://blog.matthewskelton.net/2015/03/06/hp-is-trying-to-patent-continuous-delivery-here-is-how-you-can-help-block-this-madness/> 

-- 
Pure Personal Opinion | HoffmanLabs LLC