[Info-vax] LAT service availability timeout after shutdown
gerry77 at no.spam.mail.com
Tue Dec 29 20:02:52 EST 2009
On Tue, 29 Dec 2009 00:09:19 -0800 (PST), Bob Gezelter <gezelter at rlgsc.com>
wrote:
> As David Sneddon noted, a DELETE SERVICE <servicename> command in
> LATCP will make the service disappear on the other nodes.
His suggestion resolved my problem. I'm about to write some DCL that gets
the list of my local services from LATCP and deletes them one by one,
because, as you have noted, LATCP has no wildcard support.
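
A minimal sketch of the one-by-one deletion, in Python rather than DCL and
assuming the service names have already been collected somehow. The function
name delete_commands and the sample service names are hypothetical. It only
builds the DELETE SERVICE command lines that would be handed to LATCP on the
offering node; it does not run anything.

```python
def delete_commands(service_names):
    """Build one LATCP DELETE SERVICE command line per service name."""
    commands = []
    for raw in service_names:
        name = raw.strip()
        if not name or name.startswith("!"):
            continue  # skip blank entries and comment-style entries
        commands.append(f"DELETE SERVICE {name}")
    return commands


if __name__ == "__main__":
    # Hypothetical list of locally offered services.
    kept_list = ["NODE_A", "  PRINTERS ", "", "! retired: OLDSVC"]
    for line in delete_commands(kept_list):
        print(line)
```

Each emitted line is one command, so the missing wildcard is replaced by a
plain loop over the list.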
> - SET NODE/STATE=OFF
>
> Service(s) remain "AVAILABLE" persist, even though LATACP on service
> providing node has exited.
>
> There are some other experiments that I would do if I had the time,
> which I do not at this instant. The observed behaviors appear
> consistent with LAT service announcement messages being sent upon
> Service creation/deletion, but NOT on a LAT node shutdown.
> [...]
> Then again, the list of services is not generally highly variable. One
> can get the same effect as one would using wildcards by keeping a list
> of services created and issuing a DELETE SERVICE command on each
> service during node shutdown. If shutting LATACP manually, it is up to
> those shutting down LATACP to behave in a responsible fashion and do
> the DELETE SERVICE commands.
IMHO this is not a good thing, because if a node or part of the network
fails, stale services will linger for a very long time. Deleting services
before shutdown is a good idea, but it depends on both good management and
a correct system shutdown. A timeout of some sort (much shorter than 12
hours) would instead cover every case.

Moreover, I've discovered that I cannot force a node to forget a stale
service: DELETE SERVICE is effective only on the offering node. So if a
system crash or a network problem prevents me from quickly restoring a LAT
service, the only way to make it disappear is to stop and restart the LAT
driver on all the remaining nodes, which disrupts all the working services
too.

I'm still wondering whether such a long timeout is specific to LTDRIVER on
OpenVMS. Maybe DEC terminal servers behave differently and update their
service lists more quickly once they stop receiving the advertisements for
a service that has gone away.
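
To make the timeout idea concrete, here is a minimal sketch in Python. It is
purely illustrative and does not describe how LTDRIVER or any terminal server
actually works. The class name ServiceTable, its method names, and the
60-second figure are all made up for the example. Each node remembers when it
last heard a service announced and forgets any service whose announcements
have stopped for longer than the timeout.

```python
class ServiceTable:
    """Hypothetical per-node table of remote services with aging."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # service name -> time of last announcement

    def announce(self, service, now):
        """Record that an announcement for `service` arrived at time `now`."""
        self.last_seen[service] = now

    def available(self, now):
        """Drop services not heard from within the timeout; list the rest."""
        self.last_seen = {
            name: seen
            for name, seen in self.last_seen.items()
            if now - seen <= self.timeout
        }
        return sorted(self.last_seen)


if __name__ == "__main__":
    table = ServiceTable(timeout_seconds=60)  # assumed value, for illustration
    table.announce("A", now=0)
    table.announce("B", now=50)
    print(table.available(now=60))   # both still within the timeout
    print(table.available(now=100))  # "A" has aged out, "B" remains
```

With aging like this, a crashed node or a broken network link needs no
explicit DELETE SERVICE: the stale entry disappears on its own once the
announcements stop.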
Thank you very much for the experiments you've made for me. :)
Bye,
G.