[Info-vax] Quorum disk woes

Marc Van Dyck marc.gr.vandyck at invalid.skynet.be
Sun Jun 2 05:55:59 EDT 2019


etmsreec at gmail.com used his keyboard to write :
> On Thursday, 23 May 2019 08:19:44 UTC+1, Marc Van Dyck  wrote:
>> Bob Gezelter laid this down on his screen :
>>> On Wednesday, May 22, 2019 at 4:57:29 AM UTC-4, Marc Van Dyck wrote:
>>>> As explained in the previous post, I have several clusters made of two
>>>> members and a quorum disk.
>>>> 
>>>> From time to time - not always, but often enough to be worried - we have
>>>> issues when the entire cluster must be rebooted. This happens at least
>>>> four times a year to apply patches.
>>>> 
>>>> What sometimes happens is, when starting the first node, it gets stuck
>>>> at the "Waiting to form or join a VMS cluster" phase, as if the quorum
>>>> disk vote was not taken into account. I have to boot the second node
>>>> to unlock the situation. As soon as the second node joins, the quorum is
>>>> met and the boot sequences complete. Curiously, when both systems are
>>>> up, the DCL command SHOW CLUSTER displays the quorum disk as expected.
>>>> 
>>>> We've played a bit with that and understood that in this case, it is
>>>> enough to delete the file quorum.dat on the quorum disk. It will be
>>>> promptly re-created by VMS and the next boot will happen fine.
>>>> 
>>>> What I'd like to understand is
>>>> 
>>>> - What does the quorum.dat file contain ? DUMP does not display anything.
>>>> 
>>>> - In which circumstances might this file become corrupt or incorrect ?
>>>> 
>>>> Thank you,
>>>> Marc.
>>>> 
>>>> -- 
>>>> Marc Van Dyck
>>> 
>>> Marc,
>>> 
>>> Are you sure that the quorum disk is directly accessible by the primary 
>>> system? What does the console display?
>>> 
>>> - Bob Gezelter, http://www.rlgsc.com
>> 
>> The quorum disk is on SAN storage, the same as the boot disks. Most of
>> our reboots are fine; it's only once in a while that it gets stuck.
>> 
>> I will try to go into the console archives of Cockpit to extract a
>> console log and will post it here.
>> 
>> -- 
>> Marc Van Dyck
>
> Are you also sure that the voting scheme is correct?  If the config is two 
> VMS nodes and a quorum disk, I would set one vote to each of the nodes, one 
> to the quorum disk, and expected votes of two.  The first node up would then 
> contribute a vote, and the quorum disk would also be taken as a vote. I could 
> easily understand a problem if expected votes was set too high on one of the 
> cluster nodes. The problem might also be that the systems are sometimes 
> having trouble seeing the quorum disk.  Is the zoning correct, so both nodes 
> can see the quorum disk directly?  Are all of the paths to the quorum disk 
> valid and working?
>
> Steve

Unless I completely misunderstood, expected_votes is the number of votes
observed when the whole cluster is up. If each member has one vote, and
the quorum disk has one vote too, then expected_votes should be 3. Then
for such a configuration the calculated quorum is 2. When booting one
member with the quorum disk present, we have 2 votes, so the quorum is
reached and the cluster can form.
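
To make the arithmetic concrete, here is a small sketch in Python. It
assumes the rule given in the OpenVMS Cluster documentation, quorum =
(EXPECTED_VOTES + 2) / 2 rounded down, and it ignores the adjustments
the connection manager makes on a running cluster. The function and
variable names are mine, purely for illustration.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, rounded down."""
    return (expected_votes + 2) // 2


def can_form_cluster(votes_present: int, expected_votes: int) -> bool:
    """True when the votes currently visible reach the calculated quorum."""
    return votes_present >= quorum(expected_votes)


# Assumed configuration: two nodes with one vote each, plus a
# quorum disk that also contributes one vote.
NODE_VOTES = 1
QDSK_VOTES = 1
EXPECTED_VOTES = 2 * NODE_VOTES + QDSK_VOTES  # 3

# First node booting with the quorum disk counted: 2 votes.
first_node_with_disk = NODE_VOTES + QDSK_VOTES
# First node booting while the quorum disk vote is not counted: 1 vote.
first_node_alone = NODE_VOTES

print(quorum(EXPECTED_VOTES))
print(can_form_cluster(first_node_with_disk, EXPECTED_VOTES))
print(can_form_cluster(first_node_alone, EXPECTED_VOTES))
```

Under that rule, expected_votes of 2 also gives a quorum of 2, so both
settings need the quorum disk vote before a single node can form the
cluster. A lone node stuck at "Waiting to form or join" would then be
consistent with the disk vote not being counted at that moment.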

Regarding the visibility of the quorum disk, I think it's OK but just to
be sure, tomorrow I will extract a console log and post it here.

-- 
Marc Van Dyck


