[Info-vax] Desirable features for VMS

Tue Jan 30 01:50:02 EST 2024

On 1/29/2024 7:00 PM, Arne Vajhøj wrote:
> On 1/29/2024 6:21 PM, Hans Bachner wrote:
>> Marc Van Dyck wrote on 29.01.2024 at 16:36:
>>> Dave Froble formulated the question :
>>>> I'd just note that the OSs would be included as applications, so re-starting
>>>> them from where they were interrupted would be included in the concept.  So,
>>>> yeah, the monitor would be outside/over the OSs. Perhaps something like
>>>> happens with VMs.  Except VMs want to move the activity to another system,
>>>> not recover on the same system.
>>>
>>> Whatever the design and implementation, this would be a really useful
>>> and marketable addition to the OpenVMS cluster concept. Clusters were
>>> invented 40 years ago to implement horizontal scalability, because
>>> vertical scalability was impossible, technically or financially. This
>>> issue has mostly disappeared today, current hardware being able to
>>> deliver any power we might want. Today's clusters are essentially
>>> put in place for redundancy or disaster recovery purposes ; the next
>>> logical step should be to provide this redundancy in a transparent way
>>> to the system user.
>>>
>>> This should also be, as opposed to simple user niceties, something that
>>> allows VSI to make money.
>>
>> Would OpenVMS Service Control cover your needs?
>>
>> <https://vmssoftware.com/products/service-control/>
>>
>> Service Control was originally developed by Wolfgang Burger at HP in Vienna
>> and later adopted by VSI. As far as I know it is (still) offered as a service,
>> not a product - but only VSI can tell.
>
> This indeed seems like the app<--->VMS equivalent of
> VM<--->ESXi.
>
> You define an app to be running on one node in the cluster
> and if something happens then the software starts the app
> on another node.
>
> Like you define a VM to be running on one ESXi server in the
> cluster and if something happens then VMware spins up
> the VM on another ESXi server.
>
> Arne

Well, there are apps, and then there are other apps ...

Ok, take a web server handling connection requests.  Perhaps one or more
connections are disrupted before finishing.  After a re-start it simply begins
handling connection requests again.  Perhaps reasonable.

Then, an example from one of my old customers:

Orders were built interactively, and the data was stored in an intermediate
file.  When building was done, the intermediate file was queued to a poster that
processed the data, performed updates to all pertinent database files, and then
deleted the intermediate file.

Ok, what happens when the system crashes during processing of an order?  Things
are left incomplete and in a nasty mess.  Re-starting the poster will make things
worse.  So, just restarting is not such a good idea.
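
To make the "worse" concrete, here is a minimal sketch in Python.  It is not the
original poster; the item names, quantities, and the crash_after parameter are
made up for illustration.  It only shows that an update which is not idempotent
gets applied twice when the job is simply run again from the top.

```python
def post(order, stock, crash_after=None):
    """Apply each order line to stock; optionally stop partway to mimic a crash."""
    for n, (item, qty) in enumerate(order):
        if crash_after is not None and n == crash_after:
            raise RuntimeError("system crash")
        stock[item] -= qty


# Hypothetical order and inventory, for illustration only.
order = [("widget", 2), ("gadget", 1)]
stock = {"widget": 10, "gadget": 5}

try:
    post(order, stock, crash_after=1)   # widget updated, gadget not
except RuntimeError:
    pass

post(order, stock)                      # naive re-start from the top
# widget has now been decremented twice: stock["widget"] is 6, not 8
```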

In the example, best not to process the order that was interrupted.  Thankfully, 
this almost never happened.  Thank you VMS and DEC hardware and battery backup 
UPS.  But, it was still a possibility.

The partial solution was to build checkpoints into the design.  At each specific
point in the poster, a flag was set, and forced to disk, as each file update
occurred.  The poster was set up to respect the checkpoint flags.  Worked sort
of well.  There was still the possibility the checkpoint flags weren't written to
disk.  I didn't have an app that reviewed the information and automatically
re-queued it, telling the poster where to re-start.  That was a tedious manual task.
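
The checkpoint idea can be sketched roughly like this in Python.  This is an
illustration under assumptions, not the code from that system: the function
names, the JSON checkpoint file, and the list-of-steps structure are all
hypothetical.  Each step stands for one file update; after it completes, a flag
is written and forced to disk, and a re-started poster skips every step the flag
says is already done.

```python
import json
import os


def save_checkpoint(path, step):
    """Record the last completed step and force it to disk before returning."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"done": step}, f)
        f.flush()
        os.fsync(f.fileno())   # force the flag to disk, not just the OS cache
    os.replace(tmp, path)      # atomic swap: old or new checkpoint, never half


def load_checkpoint(path):
    """Return the index of the last completed step, or -1 if none."""
    try:
        with open(path) as f:
            return json.load(f)["done"]
    except FileNotFoundError:
        return -1


def post_order(steps, ckpt_path):
    """Run each update step in order, skipping those already checkpointed."""
    last_done = load_checkpoint(ckpt_path)
    for i, step in enumerate(steps):
        if i <= last_done:
            continue           # this file update finished before the crash
        step()
        save_checkpoint(ckpt_path, i)
    if os.path.exists(ckpt_path):
        os.remove(ckpt_path)   # order fully posted, flag no longer needed
```

Note the remaining gap: a crash after step() returns but before the checkpoint
reaches disk means that step runs again on re-start.  That is the same hole
described above, and closing it needs either idempotent updates or real
transactions.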

Hey, with most things, there is a point of diminishing returns on efforts.  Just 
not worth the cost.

Please don't start ranting about a database with two-phase commit.  Didn't have one.

But my point is, just re-starting an application isn't always a solution.

-- 
David Froble                       Tel: 724-529-0450
Dave Froble Enterprises, Inc.      E-Mail: davef at tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA  15486