[Info-vax] Distant Cluster?

Phillip Helbig---undress to reply helbig at astro.multiCLOTHESvax.de
Sat Oct 6 15:25:18 EDT 2012


In article <k4pl2v$kks$1 at dont-email.me>, Stephen Hoffman
<seaohveh at hoffmanlabs.invalid> writes: 

> You mention "distant" in the title.  How distant?  The 
> officially-supported cluster distance is 500 miles / 800 kilometers 
> with the HP default cluster configuration support.  Longer spans can 
> and do work, though you'll (officially) want to work with HP if you 
> need formal support.

He did say "in another building".  If it were in another country, I 
think he would have mentioned that.  :-)

> I could ask for justification for the fossil version, but I really 
> don't care what fig leaf somebody will propose.

The old stuff does the job and is paid off while new kit would be less 
reliable and would cost extra money?  Difficulty of getting a quote, 
much less a sensible price for an academic institution, from HP?  Lack 
of public commitment to VMS on the part of HP makes it difficult to make 
a business case for further investment involving new hardware?

> > Now it is possible to shut down all the Alphas and the Itanium boxes continue to
> > run, and vice versa. But if the network connection between the two sites drops,
> > the Itaniums crash. Why?
> 
> Because you haven't yet configured a stable network, or haven't added a 
> parallel/redundant connection?  Or because your network is getting 
> overloaded, and the load of (for instance) shadowing is saturating the 
> wire?  Or wasn't that the (intended) question?

I don't think so.  Normally, one shouldn't be able to switch off one part 
of a cluster and, at another time, the other part, and have what is left 
keep running each time; that invites a partitioned cluster etc.

> FWIW, the hosts usually crash when the network connection resumes, 
> because they're determined to be the "outliers" in the cluster and 
> CLUEXIT; you ended up with a partition, and the Alpha boxes "won" the 
> decision during the reconnection process.
> 
> Why does this happen?  Partitioning.  Or more specifically, avoiding 
> partitioning.

Right.  I was puzzled by the fact that the boxes "kept running".  Maybe 
the fans were running, but the OS noticed the partition.
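
To make the quorum arithmetic behind this concrete, here is a rough 
sketch in Python.  It assumes the documented rule that quorum is 
(EXPECTED_VOTES + 2) / 2, rounded down, and a made-up configuration of 
two Alphas and two Itaniums with one vote each; the original poster's 
real VOTES and EXPECTED_VOTES settings are not known.  It also ignores 
quorum disks and the dynamic quorum adjustment done by the connection 
manager (for example after a shutdown with REMOVE_NODE), so it 
illustrates the principle rather than modelling what VMS actually does.

```python
def quorum(expected_votes: int) -> int:
    """Quorum from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, rounded down."""
    return (expected_votes + 2) // 2


def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """True if a group of nodes holding votes_present votes may keep running."""
    return votes_present >= quorum(expected_votes)


def can_partition(side_a_votes: int, side_b_votes: int, expected_votes: int) -> bool:
    """True if both halves would have quorum on their own (a partition risk)."""
    return (has_quorum(side_a_votes, expected_votes)
            and has_quorum(side_b_votes, expected_votes))


# Hypothetical vote counts, NOT taken from the original poster's cluster.
ALPHA_VOTES = 2      # two Alphas, one vote each
ITANIUM_VOTES = 2    # two Itaniums, one vote each

# EXPECTED_VOTES equal to the real total: quorum is 3, so neither site
# can carry on alone after a network break.
print(quorum(4), can_partition(ALPHA_VOTES, ITANIUM_VOTES, 4))

# EXPECTED_VOTES set too low: quorum is 2, so each site thinks it is
# the cluster, and the two halves can run independently.
print(quorum(2), can_partition(ALPHA_VOTES, ITANIUM_VOTES, 2))
```

With EXPECTED_VOTES at the true total, a site that loses the link 
hangs until quorum is regained instead of carrying on by itself; with 
the value set too low, both sites keep going and one of them has to 
CLUEXIT when the link comes back.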



