[Info-vax] troubles with quorum disk
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Thu Jan 10 20:02:52 EST 2013
QDSKVOTES=1 would have been better, yes. Any value will work, though;
an odd one is preferable.
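To put numbers behind that (the quorum calculation is, if memory
serves, quorum = (EXPECTED_VOTES + 2) / 2 with integer division),
consider a two-node cluster with VOTES=1 on each node:

  QDSKVOTES=1: EXPECTED_VOTES = 3, quorum = (3+2)/2 = 2
    one node plus the quorum disk  = 2 votes: runs
    both nodes, no quorum disk     = 2 votes: runs
  QDSKVOTES=3: EXPECTED_VOTES = 5, quorum = (5+2)/2 = 3
    one node plus the quorum disk  = 4 votes: runs
    both nodes, no quorum disk     = 2 votes: hangs

Which is why anything above 1 here turns the quorum disk into a single
point of failure, as Keith notes below.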
On 2013-01-10 23:40:18 +0000, Michael Moroney said:
> Keith Parris <keithparris_deletethis at yahoo.com> writes:
>
>> QDSKVOTES=3 would indeed allow the survival of one host in a 2-node
>> cluster, but then so would any value of QDSKVOTES greater than zero.
>> However, any value over 1 makes the quorum disk into a single point of
>> failure for the cluster.
True.
Though with dinky FC clusters, having the quorum disk (and usually
also the system disk) on an EVA generally leaves you dependent only on
the Fibre Channel and the EVA, components you're already dependent on.
>
>> The rule of thumb is: if you want to be able to boot a cluster with any
>> single node by itself up to as many as all the nodes, and never want to
>> worry about quorum, and you also don't want the quorum disk itself to
>> become a single point of failure, then set the quorum disk's votes to
>> one less (n-1) than the number of votes from the VMS systems. This
>> allows the cluster to survive loss of the quorum disk as long as all the
>> VMS nodes are present.
Yes.
> That is correct. If you have a cluster with n nodes and you want it to
> be able to run with any combination of the nodes (from 1 to all), set
> EXPECTED_VOTES to 2n-1, each node to 1 and the quorum disk to n-1.
> For the trivial case of a 2 node cluster, it becomes 1 vote each.
>
> The quorum disk isn't a single point of failure, but it almost is.
True.
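As a sketch of that n-1 rule for a hypothetical three-node
configuration (the quorum disk device name here is invented; use your
own), the per-node MODPARAMS.DAT settings would look something like:

  ! Each of the three nodes contributes one vote.
  VOTES = 1
  ! 2n-1, for n=3 nodes.
  EXPECTED_VOTES = 5
  ! The quorum disk gets n-1 votes.
  QDSKVOTES = 2
  ! Hypothetical quorum disk device.
  DISK_QUORUM = "$1$DGA100"

Run AUTOGEN after editing, as usual. Quorum then works out to
(5+2)/2 = 3, so any single node plus the quorum disk can run, and all
three nodes together can run without the quorum disk.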
Though the quorum disk is far less of an issue with modern storage
controllers and controller-based RAID; an EVA can provide RAID. Times
move on, and very few production sites aren't already running
RAID-capable gear.
As for uptime, clustering has various single points of failure within
itself. I've had VMS hosts jam themselves due to assorted weirdnesses:
sometimes the distributed lock manager, sometimes the EVA freaks,
sometimes the FC or the switch. For VMS, clustering can be a good
approach, and it might be the only available approach short of rolling
your own code or (as is often the case) using a replication-capable
database. But if you need uptime and predictable response time, or if
you need to scale up[1], then VMS-style clustering may not be the best
approach.
> The cluster can continue without it _only_ if all nodes are present.
> (and you can't shadow the quorum disk)
Loosely-coupled clustering (rather more akin to BASE than to ACID) or
a software quorum server box ("quorum toaster") would have been
nice-to-have features for OpenVMS (a quorum VMS box works, but is
expensive), but I digress. RAID 6, RAID 10 or better generally works
just fine for the quorum disk.
————
[1]Officially, the clustering host limit is 96 hosts. In practice,
the limiting value might be higher or lower, and it's usually
dependent on how busy certain cluster-visible objects are. It's very
easy to constrain your aggregate performance behind the performance of
your I/O or storage, for instance.
--
Pure Personal Opinion | HoffmanLabs LLC