[Info-vax] Distant Cluster?
Christoph Gartmann
gartmann at nonsense.immunbio.mpg.de
Tue Oct 9 03:04:48 EDT 2012
In article <k4vtmd$9db$1 at dont-email.me>, David Froble <davef at tsoft-inc.com> writes:
>Michael Moroney wrote:
>> gartmann at nonsense.immunbio.mpg.de (Christoph Gartmann) writes:
>>
>>> In article <b9dcb6a6-8b38-4729-9b1e-abac00f3bba1 at googlegroups.com>, Ken Fairfield <ken.fairfield at gmail.com> writes:
>>
>>>> There are no inherent problems with a long RECNXINTERVAL other
>>>> than (some vague memories I have of) lengthened cluster transition
>>>> times.
>>
>>> Good to know.
>>
>> The question you have to ask yourself is whether you or your users can
>> tolerate random "hangs" by the entire cluster for up to RECNXINTERVAL
>> seconds, pretty much any time there is a network glitch such as rebooting
>> a switch. Because that is what will happen until things resolve themselves
>> or some node(s) get kicked out of the cluster.
>>
>> Default RECNXINTERVAL is 20 seconds.
>
>That is a timeout value, and only comes into play when the link is down.
> If the cluster is broken, aren't the users hosed anyway? I'd rather
>they take a short break and come back to where they left off.
The problem wouldn't be the two minute reboot time for the switch but the time
the rx6600s take to write a dump and reboot (about 10 minutes).
>If it happens often, then perhaps the core problem should be addressed.
It doesn't happen "often" (switch firmware upgrade) but it happens.
>There is "doable" and then there is "prudent". I'd think a private
>direct link would be prudent.
That's the way to go but it will take some time. In the meantime RECNXINTERVAL
is an option.
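To put rough numbers on the trade-off, here is a minimal sketch of the reasoning
above. It is a simplified model, not a description of the actual connection
manager logic: it assumes a node removed from the cluster crashes and reboots
once the link is back, and it uses the figures from this thread (two minute
switch reboot, about ten minutes for dump plus reboot, default RECNXINTERVAL of
20 seconds). The function name and the 180 second setting are hypothetical.

```python
# Simplified model of user-visible downtime for one network outage.
# All figures are rough values from this thread, not measurements.

def downtime_seconds(outage_s, recnxinterval_s, dump_reboot_s):
    """Estimate how long users on an affected node are interrupted.

    Assumption: if the link returns before RECNXINTERVAL expires, the
    cluster only stalls for the length of the outage. Otherwise the node
    is removed and, once the link is back, writes a dump and reboots.
    """
    if outage_s < recnxinterval_s:
        return outage_s
    return outage_s + dump_reboot_s


switch_reboot = 120   # two minute switch reboot
dump_reboot = 600     # about 10 minutes for dump and reboot

print(downtime_seconds(switch_reboot, 20, dump_reboot))   # default setting: 720
print(downtime_seconds(switch_reboot, 180, dump_reboot))  # hypothetical longer setting: 120
```

Under these assumptions a RECNXINTERVAL longer than the switch reboot turns a
twelve minute interruption into a two minute stall, at the price of stalls of
up to that length for any other network glitch.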
Regards,
Christoph Gartmann
--
Max-Planck-Institut fuer Phone : +49-761-5108-464 Fax: -80464
Immunbiologie und Epigenetik
Postfach 1169 Internet: gartmann at immunbio dot mpg dot de
D-79011 Freiburg, Germany
http://www.immunbio.mpg.de/home/menue.html