[Info-vax] Quorum disk woes
Marc Van Dyck
marc.gr.vandyck at invalid.skynet.be
Sun Jun 2 05:55:59 EDT 2019
etmsreec at gmail.com used his keyboard to write :
> On Thursday, 23 May 2019 08:19:44 UTC+1, Marc Van Dyck wrote:
>> Bob Gezelter laid this down on his screen :
>>> On Wednesday, May 22, 2019 at 4:57:29 AM UTC-4, Marc Van Dyck wrote:
>>>> As explained in the previous post, I have several clusters made of two
>>>> members and a quorum disk.
>>>>
>>>> From time to time - not always, but often enough to be worried - we have
>>>> issues when the entire cluster must be rebooted. This happens at least
>>>> four times a year to apply patches.
>>>>
>>>> What sometimes happens is, when starting the first node, it gets stuck
>>>> at the "Waiting to form or join a VMS cluster" phase, as if the quorum
>>>> disk vote was not taken into account. I have to boot the second node
>>>> to unlock the situation. As soon as the second node joins, the quorum is
>>>> met and the boot sequences complete. Curiously, when both systems are
>>>> up, the DCL command SHOW CLUSTER displays the quorum disk as expected.
>>>>
>>>> We've played a bit with that and understood that in this case, it is
>>>> enough to delete the file quorum.dat on the quorum disk. It will be
>>>> promptly re-created by VMS and the next boot will happen fine.
>>>>
>>>> What I'd like to understand is
>>>>
>>>> - What does the quorum.dat file contain ? DUMP does not display anything.
>>>>
>>>> - In which circumstances might this file become corrupt or incorrect ?
>>>>
>>>> Thank you,
>>>> Marc.
>>>>
>>>> --
>>>> Marc Van Dyck
>>>
>>> Marc,
>>>
>>> Are you sure that the quorum disk is directly accessible by the primary
>>> system? What does the console display?
>>>
>>> - Bob Gezelter, http://www.rlgsc.com
>>
>> The quorum disk is on SAN storage, the same as the boot disks. Most of
>> our reboots are fine, it's only once in a while that it gets stuck.
>>
>> I will try to go through the Cockpit console archives to extract a
>> console log and will post it here.
>>
>> --
>> Marc Van Dyck
>
> Are you also sure that the voting scheme is correct? If the config is two
> VMS nodes and a quorum disk, I would set one vote to each of the nodes, one
> to the quorum disk, and expected votes of two. The first node up would then
> contribute a vote, and the quorum disk would also be taken as a vote. I could
> easily understand a problem if expected votes was set too high on one of the
> cluster nodes. The problem might also be that the systems are sometimes
> having trouble seeing the quorum disk. Is the zoning correct, so both nodes
> can see the quorum disk directly? Are all of the paths to the quorum disk
> valid and working?
>
> Steve
Unless I completely misunderstood, expected_votes is the number of votes
observed when the whole cluster is up. If each member has one vote, and
the quorum disk has one vote too, then expected_votes should be 3. Then
for such a configuration the calculated quorum is 2. When booting one
member with the quorum disk present, we have 2 votes, so quorum is
reached and the cluster can form.
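
The arithmetic above can be checked with a small sketch. It assumes the
usual OpenVMS rule that quorum is (EXPECTED_VOTES + 2) / 2 with the
fraction dropped. The function and constant names are purely
illustrative and are not part of any VMS interface.

```python
# Illustrative only: models the quorum arithmetic discussed in this thread.
# Assumption: quorum = (EXPECTED_VOTES + 2) / 2, truncated to an integer.

def quorum(expected_votes: int) -> int:
    """Return the quorum value derived from expected_votes."""
    return (expected_votes + 2) // 2


def has_quorum(present_votes: int, expected_votes: int) -> bool:
    """True when the votes currently present reach the quorum."""
    return present_votes >= quorum(expected_votes)


NODE_VOTES = 1   # one vote per cluster member (as in this thread)
QDSK_VOTES = 1   # one vote for the quorum disk (as in this thread)

# Whole cluster up: two members plus the quorum disk.
expected = 2 * NODE_VOTES + QDSK_VOTES

# First node booting with the quorum disk vote counted.
first_node_with_disk = NODE_VOTES + QDSK_VOTES

# First node booting when the quorum disk vote is not counted.
first_node_alone = NODE_VOTES

if __name__ == "__main__":
    print("expected_votes:", expected)
    print("quorum:", quorum(expected))
    print("node + quorum disk:", has_quorum(first_node_with_disk, expected))
    print("node alone:", has_quorum(first_node_alone, expected))
```

Under that assumption, one node plus the quorum disk reaches quorum,
while a node whose quorum disk vote is not counted does not, which
matches the hang we see. Note that the same formula also gives a quorum
of 2 when expected_votes is 2.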
Regarding the visibility of the quorum disk, I think it's OK, but just to
be sure, tomorrow I will extract a console log and post it here.
--
Marc Van Dyck