[Info-vax] Production VMS cluster hanging with lots of LEFO

IanMiller gxys at uk2.net
Fri Mar 13 07:41:22 EDT 2009


On 13 Mar, 12:10, VAXman-  @SendSpamHere.ORG wrote:
> In article <dd549999-4436-4e6a-98fa-dbeb612d7... at w35g2000yqm.googlegroups.com>, filip.debl... at proximus.net writes:
>
> >{...snip...}
>
> >A lot of unknowns are still left.
>
> >Q : what caused the image activator to go into LEFO (actually, to remain
> >in LEFO)? At some point during image activation (last phase ?) it starts
> >waiting for an event flag. What could be setting that event flag ? I
> >suspect it never came ...
>
> The "image activator" was in LEFO!  I didn't know there was a separate
> process known as the "image activator".
>
> What caused processes to go into LEFO is the VMS process scheduling
> algorithms and the SWAPPER.  These processes were selected as candidates
> for outswapping because the system determined it needed to free up
> memory for something else on the system.
>
> >Q: crashing (and rebooting) the quorum node solved things immediately.
> >Could this be caused by a lock held by the quorum node ? If so, is this a
> >lock that is related to cluster transitions ?
>
> If the quorum node isn't doing anything, it's highly unlikely that it
> held any lock(s) that caused your issue.
>
> >Q : would we have had the same effect by crashing/rebooting any one of
> >the other nodes ?
>
> Don't know.  
>
> >And finally :
>
> >Can some form of (minor ?) network outage trigger events like this?
>
> Perhaps.
>
> >Any takers ?
>
> If this happens again, force a crash dump to be written so that
> there is post mortem information from which to analyze the cause of
> your problem(s).
>
> What you did by stopping the LEFO processes was akin to going into
> battle and shooting the already dead.  I would have looked for one
> or more processes on the system consuming memory.  
>
> Did you SHOW MEMORY[/PHYSICAL]?   If so, what did you see on the
> FREE and MODIFIED lists?
>
> I'd guess that the FREE list was low and the system went into action
> to free up pages by outswapping dormant processes.
>
> --
> VAXman- A Bored Certified VMS Kernel Mode Hacker    VAXman(at)TMESIS(dot)ORG
>
>  http://www.quirkfactory.com/popart/asskey/eqn2.png
>
>   "Well my son, life is like a beanstalk, isn't it?"



I wonder whether some lock on the system disk was blocking activity. A
crash dump would be useful for checking that.


