[Info-vax] Desirable features for VMS
Dave Froble
davef at tsoft-inc.com
Tue Jan 30 23:25:20 EST 2024
On 1/30/2024 7:05 PM, Arne Vajhøj wrote:
> On 1/30/2024 5:20 AM, Marc Van Dyck wrote:
>> Dave Froble wrote on 30/01/2024 :
>>> Ok, a web server handling connection requests. Perhaps one or more
>>> connections are disrupted before finishing. A re-start will begin to again
>>> handle connection requests. Perhaps reasonable.
>>>
>>> Then, an example from one of my old customers:
>>>
>>> Orders were built interactively, and the data was stored in an intermediate
>>> file. When done building, the intermediate file is then queued to a poster
>>> that processes the data and performs updates to all pertinent database files,
>>> then deletes the intermediate file.
>>>
>>> Ok, what happens when the system crashes during processing of an order?
>>> Things are left incomplete and a nasty mess. Re-starting the poster will
>>> make things worse. So, just restarting is not such a good idea.
>>>
>>> In the example, best not to process the order that was interrupted.
>>> Thankfully, this almost never happened. Thank you VMS and DEC hardware and
>>> battery backup UPS. But, it was still a possibility.
>>>
>>> The partial solution was to build checkpoints into the design. At each
>>> specific point in the poster, a flag was set, and forced to disk, as each
>>> file update occurred. The poster was set up to respect the checkpoint flags.
>>> Worked sort of well. There was still the possibility the checkpoint flags
>>> weren't written to disk. I didn't have an app that reviewed the information,
>>> and automatically re-queued it telling the poster where to re-start. That
>>> was a tedious manual task.
>>>
>>> Hey, with most things, there is a point of diminishing returns on efforts.
>>> Just not worth the cost.
>>>
>>> Please don't start ranting about a database with 2 stage commits. Didn't
>>> have one.
>>>
>>> But my point is, just re-starting an application isn't always a solution.
>>
>> No, just restarting isn't the solution. And engineering the application
>> to support random restarts isn't either. Just select a process from a
>> system window, drag and drop it in another system window, and it
>> continues to run on the other system as if nothing happened. That's what
>> I'm after...
>
> There are different models for HA:
> A) application managed - the application stores state somewhere where
> a new instance can pick it up - this is not that hard to implement
> but the application needs to be written for it
> B) system managed - the system stores state somewhere where
> a new instance can pick it up - this is hard to implement
> but the application doesn't need to be written for it
> A2) same as A with a feature where the system can move the application
> from one node to another node - don't schedule the
> processes/threads, copy memory content to other node, get various
> files/network connections opened on the other node, schedule the
> processes/threads on the new node, kill the instance on the
> old node - harder than A but easier than B

"B" is what I had in mind in my earlier post.

But, what will the customers pay for?

In the example I posted earlier, I didn't get paid for the checkpointing work.
Customer didn't care about little glitches. It offended my sensibilities
concerning "right and wrong". After I thought about the solution, I just
implemented it, for my own satisfaction. I felt much better.

:-)

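
For illustration only, the checkpointing idea described earlier in the thread
(set a flag after each file update, force it to disk, and have the poster
respect the flags on restart) might be sketched roughly as below. This is
Python using POSIX-style calls, not what the original poster was written in,
and every name here (save_checkpoint, load_checkpoint, post_order, the JSON
checkpoint file) is hypothetical. It is a sketch under those assumptions, not
the actual design.

```python
import json
import os
import tempfile


def save_checkpoint(path, steps_done):
    """Record how many steps are complete and force the record to disk."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"steps_done": steps_done}, f)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to write the data to the device
    # Rename into place so a reader sees the old or the new checkpoint,
    # not a partially written one.
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the number of steps already completed (0 if none recorded)."""
    try:
        with open(path) as f:
            return json.load(f)["steps_done"]
    except FileNotFoundError:
        return 0


def post_order(order, steps, checkpoint_path):
    """Run the update steps in order, skipping any already checkpointed."""
    done = load_checkpoint(checkpoint_path)
    for index in range(done, len(steps)):
        steps[index](order)  # hypothetical: update one database file
        save_checkpoint(checkpoint_path, index + 1)
    # All steps finished: the checkpoint is no longer needed.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
```

Calling post_order again with the same checkpoint file after a crash resumes
at the first step that was not recorded as complete. The gap mentioned in the
thread is still there: if the system dies after a step's update but before its
checkpoint reaches disk, that step runs a second time on restart, so each step
would need to be safe to repeat or the order would need manual review.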
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: davef at tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486