[Info-vax] reboot due to network failure

Phillip Helbig---undress to reply helbig at astro.multiCLOTHESvax.de
Tue Dec 13 08:26:10 EST 2011


In article <4ee74c90$0$2818$c3e8da3$fdf4f6af at news.astraweb.com>, JF
Mezei <jfmezei.spamnot at vaxination.ca> writes: 

> Phillip Helbig---undress to reply wrote:
> > Recently, my switch died, which of course froze my LAN cluster.  When I 
> > replaced it with another one (actually a hub; I plan to buy a switch 
> > this week), I saw that 2 of the 3 nodes in the cluster rebooted.  What 
> > determines whether any nodes reboot and if so which ones?
> 
> When a node reconnects with a cluster after the others kicked it out, it
> will perform hara kiri because it realises that its lock database and
> all other cluster structures are woefully out of date.  Rebooting is the
> simpler way to get the node to rejoin cleanly.
> 
> I believe it goes by votes. If one node has quorum (or if a few nodes
> managed to stay connected during outage) then they will survive, and
> reconnecting nodes will reboot.

3 nodes.  Two rebooted, one didn't.  One node on its own can't have 
quorum.  Maybe one rebooted and came back before the other one did, so 
quorum was never lost.  The boot times are almost 3 minutes apart.
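
For reference, the arithmetic behind "one node can't have quorum" can be 
sketched as below.  This assumes each of the 3 nodes has VOTES = 1, 
EXPECTED_VOTES = 3, and no quorum disk; none of that is stated above, so 
treat the numbers as illustrative.  The function names are made up for 
the example.  The formula itself is the usual OpenVMS one, 
quorum = (EXPECTED_VOTES + 2) / 2, rounded down.

```python
def quorum(expected_votes: int) -> int:
    """OpenVMS-style quorum: (EXPECTED_VOTES + 2) / 2, rounded down."""
    return (expected_votes + 2) // 2


def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """True if the votes currently visible are enough to keep quorum."""
    return votes_present >= quorum(expected_votes)


# Assumption: three nodes, one vote each, no quorum disk.
EXPECTED_VOTES = 3

for nodes_up in range(1, 4):
    state = "has quorum" if has_quorum(nodes_up, EXPECTED_VOTES) else "hangs (no quorum)"
    print(f"{nodes_up} node(s) connected: {state}")
```

Under those assumptions quorum is 2, so a single isolated node hangs 
rather than carrying on, and any two nodes that can see each other are 
enough to keep a cluster going.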



