[Info-vax] Clustering (was: Re: free shell accounts?)
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Thu Jan 22 13:06:28 EST 2015
On 2015-01-22 06:51:30 +0000, Stan Radford said:
> On 2015-01-20, Matthew H McKenzie <news.deleteme at swellhunter.org> wrote:
>> No issues so far, but it is not a cluster, Deathrow could lose a node and
>> still be usable.
Technically, Deathrow can only lose GEIN (an AlphaServer DS10L), as the
data is presently stored on JACK (an Integrity rx2600); the two are
configured as a primary-secondary pair. JACK has much more storage
than GEIN, and there's no shared storage interconnect.
The Deathrow hardware is actually working. The owner of the cluster
hasn't sorted out why inbound IP network connections are being blocked
and aren't reaching the cluster systems. Of course, the difference
between operational-but-inaccessible hardware and dead hardware is, in
practice, rather irrelevant.
>> Of course recompilation was necessary across architectures
>> and not everything is 100% portable. They have a log of incidents if you
>> wish to look at recent history.
>
> I don't understand what a cluster does. If they don't have shared disks
> somewhere wouldn't they have to have multiple copies of everything? How does
> a cluster still remain usable if you are editing a file and the machine the
> file lives on fails? I can see for serving applications a cluster would be
> great but I don't understand how it helps development users. And even that
> would seem like it would take a lot of planning and wouldn't just automatically
> "work" because of the need for shared storage somewhere.
Others have pointed to the clustering manuals in the VMS documentation
set, which includes both a higher-level introductory overview of
clustering and a lower-level, more detailed clustering manual. Go skim
at least the introductory overview.
One detail that may not have been mentioned is that host-based volume
shadowing (HBVS) works across cluster members, so even clusters
without shared interconnects can have their data transparently
shadowed (also called mirroring, or RAID-1) across up to six separate
volumes, with those volumes potentially located on six separate servers.
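As a rough sketch (the device names and volume label here are invented
for illustration), mounting a two-member shadow set built from volumes
served by two different hosts looks something like this:

$!  Hypothetical devices: $1$DGA10: is served by one host and
$!  $2$DGA20: by another.  DSA1: is the virtual unit that the
$!  shadow set presents to applications.
$ MOUNT /SYSTEM DSA1: /SHADOW=($1$DGA10:, $2$DGA20:) SHADOWVOL

Writes are applied to every member of the shadow set, so if the host
serving one member drops out, I/O continues against the surviving
member.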
Had there been a shared storage interconnect available in the
currently-two-node Deathrow cluster, then any of the cluster members
connected to that shared interconnect could continue to operate, and,
so long as quorum is met, cluster members can enter and exit the
cluster without affecting the shared data. With a two-node
configuration, quorum requires either a primary-secondary
configuration or a shared cluster interconnect (multi-host parallel
SCSI, for instance, is allowed as a shared interconnect in certain
hardware configurations) with what's called a quorum disk. Quorum in
a cluster is probably easier to understand when a third or additional
voting members are present. Two-host clusters with no quorum disk are
more fragile, as you can't automatically differentiate a disconnection
from a host being down. With either three or more voting members, or
two members plus a quorum disk on a shared interconnect, the cluster
can transparently survive the loss of any single host. The more
members, the more losses the cluster can survive before it drops below
quorum and (intentionally) stalls to preserve the data.
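To make the quorum arithmetic concrete, here's a hedged sketch of the
relevant SYSGEN parameters for a two-node cluster with a quorum disk;
the $1$DGA0: device name is invented for illustration. VMS computes
quorum as (EXPECTED_VOTES + 2) / 2 with integer division, so with one
vote per host plus one from the quorum disk, quorum is (3 + 2) / 2 = 2,
and either host can fail or be disconnected while the survivor plus
the quorum disk still hold two votes:

$!  Run on each of the two hosts.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SET VOTES 1
SYSGEN> SET EXPECTED_VOTES 3
SYSGEN> SET DISK_QUORUM "$1$DGA0"
SYSGEN> SET QDSKVOTES 1
SYSGEN> WRITE CURRENT
SYSGEN> EXIT

The same formula shows why three voting hosts with no quorum disk also
survive a single loss: (3 + 2) / 2 is again 2, and two surviving
voters keep the cluster at quorum.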
Is there planning and hardware involved in clustering? Sure. Less
than you might assume, unless you're planning to try this without
having skimmed the VMS manuals.
--
Pure Personal Opinion | HoffmanLabs LLC