[Info-vax] Quorum disk woes

etmsreec at gmail.com etmsreec at gmail.com
Fri May 31 07:42:54 EDT 2019


On Thursday, 23 May 2019 08:19:44 UTC+1, Marc Van Dyck  wrote:
> Bob Gezelter laid this down on his screen :
> > On Wednesday, May 22, 2019 at 4:57:29 AM UTC-4, Marc Van Dyck wrote:
> >> As explained in the previous post, I have several clusters made of two
> >> members and a quorum disk.
> >> 
> >> From time to time - not always, but often enough to be worried - we have
> >> issues when the entire cluster must be rebooted. This happens at least
> >> four times a year to apply patches.
> >> 
> >> What sometimes happens is, when starting the first node, it gets stuck
> >> at the "Waiting to form or join a VMS cluster" phase, as if the quorum
> >> disk vote was not taken into account. I have to boot the second node
> >> to unlock the situation. As soon as the second node joins, the quorum is
> >> met and the boot sequences complete. Curiously, when both systems are
> >> up, the DCL command SHOW CLUSTER displays the quorum disk as expected.
> >> 
> >> We've played a bit with that and understood that in this case, it is
> >> enough to delete the file quorum.dat on the quorum disk. It will be
> >> promptly re-created by VMS and the next boot will happen fine.
> >> 
> >> What I'd like to understand is
> >> 
> >> - What does the quorum.dat file contain ? DUMP does not display
> >> anything.
> >> 
> >> - In which circumstances might this file become corrupt or incorrect ?
> >> 
> >> Thank you,
> >> Marc.
> >> 
> >> -- 
> >> Marc Van Dyck
> >
> > Marc,
> >
> > Are you sure that the quorum disk is directly accessible by the primary 
> > system? What does the console display?
> >
> > - Bob Gezelter, http://www.rlgsc.com
> 
> The quorum disk is on SAN storage, the same as the boot disks. Most of
> our reboots are fine, it's only once in a while that it gets stuck.
> 
> I will try to go through the console archives of Cockpit to extract a
> console log and will post it here.
> 
> -- 
> Marc Van Dyck

Are you also sure that the voting scheme is correct?  If the config is two VMS nodes and a quorum disk, I would give one vote to each of the nodes, one to the quorum disk, and set expected votes to two.  The first node up would then contribute a vote, and the quorum disk would also be counted as a vote.
I could easily understand a problem if expected votes were set too high on one of the cluster nodes.
The problem might also be that the systems are sometimes having trouble seeing the quorum disk.  Is the zoning correct, so both nodes can see the quorum disk directly?  Are all of the paths to the quorum disk valid and working?
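
To make the voting arithmetic concrete, here is a rough sketch in Python.  It assumes the quorum formula as I understand it from the OpenVMS cluster documentation, quorum = (EXPECTED_VOTES + 2) / 2 with the fraction dropped; the function names are made up for illustration and are not anything VMS provides.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES (assumed formula, integer division)."""
    return (expected_votes + 2) // 2


def can_form_cluster(votes_present: list[int], expected_votes: int) -> bool:
    """True if the votes currently visible reach quorum."""
    return sum(votes_present) >= quorum(expected_votes)


# One vote per node and one for the quorum disk.
NODE_VOTE = 1
QDISK_VOTE = 1

# First node up, quorum disk counted: 2 votes against a quorum of 2.
print(can_form_cluster([NODE_VOTE, QDISK_VOTE], expected_votes=2))

# First node up, quorum disk not seen or not counted: 1 vote against 2.
print(can_form_cluster([NODE_VOTE], expected_votes=2))

# Expected votes set too high: quorum becomes 3, so node + disk is not enough.
print(can_form_cluster([NODE_VOTE, QDISK_VOTE], expected_votes=4))
```

Under that formula, expected votes of two or three both give a quorum of two, so a single node plus the quorum disk should be able to form the cluster.  If the disk vote is not counted, or expected votes is four or more on the booting node, the first node would sit waiting until the second one joins, which matches the symptom described.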

Steve
