[Info-vax] Cluster hang on node reboot
Kerry Main
kemain.nospam at gmail.com
Thu Jun 16 09:48:54 EDT 2016
> -----Original Message-----
> From: Info-vax [mailto:info-vax-bounces at info-vax.com] On Behalf Of
> Martin Vorlaender via Info-vax
> Sent: 16-Jun-16 7:28 AM
> To: info-vax at info-vax.com
> Cc: Martin Vorlaender <martinvorlaender at gmail.com>
> Subject: [New Info-vax] Cluster hang on node reboot
>
> Hi all!
>
> I have experienced an issue with a customer's VMS cluster I have no
> explanation for.
> The cluster consists of 2 rx2800 i2 + 1 DS25 for the quorum. The rx's HBVS
> disks are provided by 2 3Pars. The rx's are running VMS V8.4 + UPDATE V11.0 +
> FIBRE_SCSI V9.0.
>
> When one of the rx2800 reboots and re-joins the cluster, there is a 2 minute
> hang of the entire cluster. I ran a TCP/IP ping from the remaining rx2800 to
> another system during this time which didn't lose a packet, and another DCL
> session still took commands, but a simple SHOW DEVICE D issued during that
> time hangs, and comes back with its expected output afterwards. There were no
> OPCOM messages during that period (the last one issued before it being the
> cluster transition completion).
>
> I'd suspect that access to the (shadowed) common system disk is blocked for
> those 2 minutes. I had minimerge enabled and DOSD parameters set up, but
> without really moving the dump files off the system disk (i.e.
> DUMPFILE_DEVICE set to the DGA devices of the system disk shadow set). I have
> switched it off since then, but didn't have a chance to test whether that was
> the reason for the hang.
>
> Any ideas?
>
> TIA,
> Martin
Availability Manager is a good cluster management tool to have in place
for issues like this. There is a small charge on Integrity; it is free for Alpha.
Regards,
Kerry Main
Kerry dot main at starkgaming dot com