[Info-vax] Cluster hang on node reboot
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Thu Jun 16 09:36:39 EDT 2016
On 2016-06-16 11:28:23 +0000, Martin Vorlaender said:
> I have experienced an issue with a customer's VMS cluster I have no
> explanation for.
> ...
> Any ideas?
Check the consoles on the other systems around the time of the cluster
reconfiguration, looking for messages about votes and related state
changes. Also check the shutdown output and the startup output or
logging for clues. Look specifically for errors, disk dismounts that
are not clean (with the ensuing rebuilds), and for the timing of the
cluster transition messages.
Check that the cluster parameters are consistent across all hosts, that
the cluster timers are set appropriately, that the VAXCLUSTER parameter
is set to 2 everywhere, and that EXPECTED_VOTES is (presumably) 3
everywhere.
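
If you end up comparing the parameters by hand, a rough sketch of the
consistency check might look like the following. This is illustration
only and not something OpenVMS ships: the node names and values are
made up, the data is assumed to have been copied manually from the
SYSGEN or SYSMAN output on each host, and the expected values are just
the ones guessed at above. The quorum arithmetic is the documented
(EXPECTED_VOTES + 2) / 2, truncated.

```python
from collections import defaultdict

# Assumed target values, taken from the guesses in this message.
EXPECTED = {"VAXCLUSTER": 2, "EXPECTED_VOTES": 3}


def find_mismatches(params_by_node, expected=EXPECTED):
    """Return {parameter: {value: [nodes]}} for each parameter that is
    not set to the expected value on every node."""
    report = {}
    for name, want in expected.items():
        seen = defaultdict(list)
        for node in sorted(params_by_node):
            # A parameter missing from the hand-copied data shows up as None.
            seen[params_by_node[node].get(name)].append(node)
        if set(seen) != {want}:
            report[name] = dict(seen)
    return report


def quorum(expected_votes):
    """Quorum derived from EXPECTED_VOTES, using integer division."""
    return (expected_votes + 2) // 2


if __name__ == "__main__":
    # Hypothetical nodes and values, for illustration only.
    nodes = {
        "NODEA": {"VAXCLUSTER": 2, "EXPECTED_VOTES": 3},
        "NODEB": {"VAXCLUSTER": 2, "EXPECTED_VOTES": 3},
        "NODEC": {"VAXCLUSTER": 2, "EXPECTED_VOTES": 2},
    }
    for name, values in find_mismatches(nodes).items():
        print(name, "differs:", values)
    print("Quorum for EXPECTED_VOTES=3:", quorum(3))
```

Any parameter that shows up in the report is worth a closer look on the
listed hosts before the next reboot test.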
Also check the error logs, as I've seen some Itanium systems logging
blizzards of errors there, but not via SHOW ERROR. Probably not the
case here, but worth a look.
I'll assume that the AlphaServer DS20 shares absolutely nothing with
the cluster, save for the cluster group identifiers and password.
The OpenVMS clustering management user and configuration interface is
unfortunately rather poor, and the logging is all over the place, so
this troubleshooting tends to be a slog.
--
Pure Personal Opinion | HoffmanLabs LLC