[Info-vax] Quorum strategy

Marc Van Dyck marc.gr.vandyck at invalid.skynet.be
Wed May 22 04:44:59 EDT 2019


Our production environment is made of 3 clusters of two members each,
with a quorum disk. Each member has one vote, the quorum disk also,
so expected votes is 3 and the quorum is 2.
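
For reference, those figures follow the usual OpenVMS rule that quorum is (EXPECTED_VOTES + 2) / 2 with the fraction dropped. A minimal Python sketch of that arithmetic; the function and dictionary names are made up for illustration and are not a VMS interface:

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from expected votes: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


# Current layout: one vote each for both members and the quorum disk.
current = {"primary": 1, "secondary": 1, "quorum_disk": 1}
expected = sum(current.values())

print(expected, quorum(expected))
```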

Now, in our environment, one of the two cluster members is more
important than the other because there are application parts that
can run on this one only. We call this the primary member. The other
one is the secondary member.

In case of system failure, if it is the secondary member that fails, we
can just switch the applications that ran on it to the primary. The
only price to pay is a heavier load on the primary.

If the primary member fails and can't be restarted, we shut down the
secondary too and restart it as primary. All data are on SAN storage,
including the system disk. So it is just a matter of shutting down,
changing the boot flags, and restarting.

Before you start with that, yes, I know that this is bad, that
applications should not behave like that, but that's how it is and I
can't change it. What I can do is minimize the occurrences of that
situation.

So, in case the cluster connection manager has to make a choice
between the two members (for example if the two nodes can't talk to
each other anymore), I'd like to make sure that it's the primary member
that always survives.

I found on the 'net an old presentation explaining how the connection
manager works. In such cases, it seems that it will first try to
maximize the number of surviving nodes, and if that's not enough to
make a choice, it will then try to maximize the number of remaining
votes.

So, say that I give

- 2 votes on the primary member
- 2 votes on the quorum disk
- 1 vote on the secondary member

Expected votes is 5, quorum is 3.

If I lose the

- primary member, I keep 2 + 1 votes => quorum kept, cluster survives
- secondary member, I keep 2 + 2 votes => quorum kept, cluster survives
- quorum disk, I keep 2 + 1 votes => quorum kept, cluster survives
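
The three cases above can be checked mechanically. This is only a sketch of the vote arithmetic for the proposed 2/2/1 layout, assuming the same quorum rule of (EXPECTED_VOTES + 2) / 2 truncated; all names are invented for illustration:

```python
def quorum(expected_votes):
    # (EXPECTED_VOTES + 2) / 2, fraction dropped
    return (expected_votes + 2) // 2


# Proposed vote layout from the list above.
proposed = {"primary": 2, "quorum_disk": 2, "secondary": 1}
needed = quorum(sum(proposed.values()))


def surviving_votes(votes, lost):
    """Votes still present after losing exactly one voting entity."""
    return sum(v for name, v in votes.items() if name != lost)


results = {lost: surviving_votes(proposed, lost) for lost in proposed}
for lost, remaining in results.items():
    status = "quorum kept" if remaining >= needed else "quorum lost"
    print(f"lose {lost}: {remaining} votes, {status}")
```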

So this would work at least as well as the existing 1/1/1 votes config
that I have now.

But if primary and secondary members lose contact with each other,
one of the two will be ejected, and as primary + quorum = 4 votes
and secondary + quorum = 3 votes, the primary member should always be
kept, and the secondary one ejected.
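
The comparison I have in mind, as a small Python sketch. It only reproduces my arithmetic and assumes that each member can still reach the quorum disk over the SAN; whether the connection manager really decides this way is exactly what I am asking. The names are invented for illustration:

```python
votes = {"primary": 2, "quorum_disk": 2, "secondary": 1}

# Assumption: the members cannot see each other, but each still sees the
# quorum disk, so each side counts its own votes plus the disk's votes.
side_votes = {
    member: votes[member] + votes["quorum_disk"]
    for member in ("primary", "secondary")
}

# Hypothetical tie-break: the side holding more votes is the one kept.
survivor = max(side_votes, key=side_votes.get)

print(side_votes, "->", survivor)
```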

First question : did I get that right ?
Second question : did I miss anything ?

Many thanks,
Marc.

-- 
Marc Van Dyck


