[Info-vax] Shadow System Disk between two Hosts

Tue May 15 13:45:26 EDT 2012

On 5/15/2012 12:10 AM, Christoph Gartmann wrote:
> according to the manuals it should be possible to shadow a system disk between
> two nodes. I have two Itanium boxes, connected via ethernet and forming a
> cluster under OpenVMS 8.4. Currently each has its own system disk. Is it
> possible to have these disks form a shadow set and still have each node boot
> from a single member of this disk?

It's possible if the storage is accessible from both systems without 
having to go through the other node's MSCP Server, which is only 
available while the system is up and running. So booting this way from 
SAN disks where the SAN is directly accessible from both systems is 
fine. Local SCSI disks are problematic because, as other posters have 
pointed out, when a node goes down its local disk is thrown out of the 
shadowset by the other node (and becomes out-of-date), since it is no 
longer accessible via the MSCP Server once its host node goes down. When 
the host reboots from its local [former] member of the shadowset, all is 
well until VMS realizes that it has just booted from an out-of-date copy 
of the shadowset, and then you get the SHADDETINCON bugcheck.
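
To make the failure sequence above concrete, here is a toy model in 
Python. It is only a sketch, not how Volume Shadowing is implemented: 
the class names, the node and disk names, and the single "generation" 
counter are all assumptions invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str            # hypothetical disk name
    host: str            # node the disk is locally attached to
    generation: int = 0  # stand-in for the real shadowing bookkeeping


@dataclass
class ShadowSet:
    members: list = field(default_factory=list)
    generation: int = 0

    def write(self):
        # Every write advances the set and all current members together.
        self.generation += 1
        for m in self.members:
            m.generation = self.generation

    def node_down(self, host):
        # Disks reachable only through the failed host are expelled.
        expelled = [m for m in self.members if m.host == host]
        self.members = [m for m in self.members if m.host != host]
        return expelled

    def is_stale(self, member):
        # An expelled member misses every write made after it left.
        return member.generation < self.generation


disk_a = Member("DISK_A", "NODEA")
disk_b = Member("DISK_B", "NODEB")
sset = ShadowSet([disk_a, disk_b])
sset.write()              # both members are current
sset.node_down("NODEB")   # NODEB crashes; DISK_B is thrown out
sset.write()              # NODEA keeps writing to DISK_A only
print(sset.is_stale(disk_b))  # prints True
```

In this model, NODEB rebooting from DISK_B corresponds to booting from 
the stale member, which is the situation that ends in the bugcheck.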

I used to tell customers it was impossible in practice to boot from a 
system disk shadowset made of locally-attached disks. Then a customer 
from Europe pointed out he had been doing it for years in a 3-site 
disaster-tolerant cluster. Each node was set up as a satellite node, 
booting from the network by default when it rebooted, and each node was 
also a boot server. In the event of a total cluster reboot, you had to 
boot the very first node in the cluster conversationally from its local 
copy of the system disk, but then the rest of the cluster could boot 
normally over the network. As a node booted in, it added its local 
shadowset member to the system disk shadowset, so once everything was up 
and running and the shadow copies completed, there was a 3-member system 
disk shadowset made up of 3 disks only accessible via the MSCP Server 
from other sites. If the original boot node went away, it didn't matter 
because the nodes were already up and running independently, and after 
the initial boot they could access members of the system disk shadowset 
independently.
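
The boot order that customer described can be sketched as a small Python 
function. This is only an illustration of the sequence: the function 
name, the site names, and the "-local-disk" member labels are made up, 
and shadow copy time is ignored.

```python
def boot_cluster(nodes):
    """Return one (node, boot_method, shadowset_members) step per node."""
    members = []
    steps = []
    for index, node in enumerate(nodes):
        if index == 0:
            # Nothing is up yet to serve the system disk over the network.
            method = "conversational boot from local disk"
        else:
            # An already-running node acts as the boot server.
            method = "network (satellite) boot"
        # Each node adds its local disk to the shadowset as it boots in.
        members.append(node + "-local-disk")
        steps.append((node, method, tuple(members)))
    return steps


for step in boot_cluster(["SITE1", "SITE2", "SITE3"]):
    print(step)
```

Only the first entry uses the local disk directly; after the last node 
boots, the member list holds all three local disks.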

We have customers with two-site disaster-tolerant OpenVMS clusters where 
the SAN is extended between sites, and some of them choose to boot from 
a common system disk shadowset shadowed across sites. The only caveat I 
have is that a single system disk for the cluster can be a single point 
of failure for the entire cluster: if someone accidentally deletes a 
crucial file, for example, the deletion is applied to every member of 
the shadowset. So we recommend that if you have only a single system 
disk in the cluster, you keep a current backup copy online and 
accessible, so that you can quickly reboot from it if the regular system 
disk shadowset is damaged.