[Info-vax] Shadow System Disk between two Hosts
Keith Parris
keithparris_deletethis at yahoo.com
Tue May 15 13:45:26 EDT 2012
On 5/15/2012 12:10 AM, Christoph Gartmann wrote:
> according to the manuals it should be possible to shadow a system disk between
> two nodes. I have two Itanium boxes, connected via ethernet and forming a
> cluster under OpenVMS 8.4. Currently each has its own system disk. Is it
> possible to have these disks form a shadow set and still have each node boot
> from a single member of this disk?
It's possible if the storage is accessible from both systems without
having to go through the other node's MSCP Server, which is only
available while the system is up and running. So booting this way from
SAN disks where the SAN is directly accessible from both systems is
fine. Local SCSI disks are problematic because, as other posters have
pointed out, when a node goes down its local disk is no longer
accessible via that node's MSCP Server, so the other node throws it out
of the shadowset and it becomes out-of-date. When
the host reboots from its local [former] member of the shadowset, all is
well until VMS realizes that it has just booted from an out-of-date copy
of the shadowset, and then you get the SHADDETINCON bugcheck.
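
To make the failure sequence concrete, here is a toy model in Python. It
is only a sketch: the names (Member, ShadowSet, simulate) and the
"generation" counter are made up for illustration and are not how Volume
Shadowing actually tracks which member is current. It just shows why a
directly accessible (SAN) member stays current across a node outage
while an MSCP-served local member does not.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    host: str             # node the disk is physically attached to
    direct_access: bool   # True if all nodes reach it without MSCP (e.g. SAN)
    generation: int = 0   # toy stand-in for how current the data is


@dataclass
class ShadowSet:
    members: list = field(default_factory=list)
    generation: int = 0

    def write(self):
        # Every write updates all current members of the set.
        self.generation += 1
        for m in self.members:
            m.generation = self.generation

    def node_down(self, node):
        # A member reachable only through the downed node's MSCP Server
        # can no longer be reached, so it is removed from the set.
        kept, removed = [], []
        for m in self.members:
            if m.direct_access or m.host != node:
                kept.append(m)
            else:
                removed.append(m)
        self.members = kept
        return removed


def boot_from(member, shadow_set):
    # Booting from a stale former member is the inconsistent case.
    if member.generation == shadow_set.generation:
        return "ok"
    return "SHADDETINCON"


def simulate(direct_access):
    a = Member("DISK_A", "NODE_A", direct_access)
    b = Member("DISK_B", "NODE_B", direct_access)
    s = ShadowSet([a, b])
    s.write()              # both members current
    s.node_down("NODE_A")  # NODE_A crashes or shuts down
    s.write()              # NODE_B keeps writing
    return boot_from(a, s)  # NODE_A reboots from its own disk
```

In this model simulate(True) stands for the SAN case and simulate(False)
for the local-disk case.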
I used to tell customers it was impossible in practice to boot from a
system disk shadowset made of locally-attached disks. Then a customer
from Europe pointed out he had been doing it for years in a 3-site
disaster-tolerant cluster. Each node was set up as a satellite node,
booting from the network by default when it rebooted, and each node was
also a boot server. In the event of a total cluster reboot, you had to
boot the very first node in the cluster conversationally from its local
copy of the system disk, but then the rest of the cluster could boot
normally over the network. As a node booted in, it added its local
shadowset member to the system disk shadowset, so once everything was up
and running and the shadow copies completed, there was a 3-member system
disk shadowset made up of 3 disks, each accessible from the other sites
only via the MSCP Server. If the original boot node went away, it didn't matter
because the nodes were already up and running independently, and after
the initial boot they could access members of the system disk shadowset
independently.
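
The boot order in that customer's scheme can be sketched as follows.
Again this is just an illustrative Python model with invented names
(boot_method, cluster_cold_start); it assumes every running node is
willing to act as a boot server, as described above.

```python
def boot_method(node, running_nodes):
    """Return how `node` would boot, given the nodes already running."""
    others = [n for n in running_nodes if n != node]
    if others:
        # Some other node is up and can serve the system disk over
        # the network, so the default satellite boot works.
        return "network"
    # Nobody is up to act as boot server: boot conversationally
    # from the node's own local copy of the system disk.
    return "conversational-local"


def cluster_cold_start(nodes):
    """Plan a total cluster reboot, booting nodes in the given order."""
    running = []
    plan = []
    for node in nodes:
        plan.append((node, boot_method(node, running)))
        running.append(node)
    return plan
```

For three sites booted in order, only the first entry in the plan comes
out as a conversational local boot; the other two boot over the network.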
We have customers with two-site disaster-tolerant OpenVMS clusters where
the SAN is extended between sites, and some of them choose to boot from
a common system disk shadowset shadowed across sites. The only caveat I
have is that a single system disk for the cluster can represent a single
point of failure for the entire cluster if, say, someone accidentally
deletes a crucial file. So we
recommend that if you have only a single system disk in the cluster,
you keep a current backup copy online and accessible that you could
quickly reboot from if the regular single system disk shadowset were
damaged.