[Info-vax] hung program location

VAXman- at SendSpamHere.ORG VAXman- at SendSpamHere.ORG
Tue Feb 19 14:25:00 EST 2013


In article <de5d73ba-8a19-43d2-af51-27cfb4d67ed4 at fv9g2000vbb.googlegroups.com>, Tom Adams <w.tom.adams at gmail.com> writes:
>On Feb 19, 8:54 am, Stephen Hoffman <seaoh... at hoffmanlabs.invalid>
>wrote:
>> On 2013-02-19 13:26:44 +0000, Tom Adams said:
>>
>> > The PC is 80141918 (as shown on show proc/cont).  The process is in HIB
>> > when it's at that address.
>>
>> > The code is well controlled in CMS so it's easy to produce link maps.
>>
>> > I restarted the hung processes.  This hanging is a rare event that I
>> > don't know how to reproduce.  But the process does pause at that PC
>> > in HIB during a normal operation mode.
>>
>> Build with full machine-code listings and with full maps, and start
>> instrumenting the code.
>>
>> As a guess directed at the error...
>>
>> Look specifically at the handling of $hiber and $wake calls in the
>> source code, as code that uses $hiber can easily be broken in various
>> ways, and the end result is either a spurious $wake cycle, which the
>> code should always expect, or the code gets stuck in a $hiber, quite
>> possibly because one or more $wake calls got coalesced into one $wake
>> somewhere; it's not really a lost $wake call, but it seems like it.
>>
>> The gloriously ugly work-around for these problems is adding a $schdwk
>> call into the code, and deliberately inducing a periodic spurious $wake.
>>
>> The best approach is to figure out where the $wake got lost, and to
>> review the asynchronous portions of the code for errors.
>>
>> --
>> Pure Personal Opinion | HoffmanLabs LLC
>
>There are no direct $hiber or $wait calls, but I use lib$wait to cause
>brief pauses.

LIB$WAIT does, indeed, invoke $HIBER and $SCHDWK.
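
To make the coalescing point concrete, here is a toy model in Python of
a single wake-pending bit.  It is only a sketch of the semantics as I
understand them, not VMS code: WakeFlag, wake(), hiber() and drain() are
made-up names, and the timeout merely stands in for a $SCHDWK-style
scheduled wake.

```python
import threading


class WakeFlag:
    """Toy model of one per-process wake-pending bit (not a VMS API)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = False

    def wake(self):
        # Like $WAKE: set the bit. Repeated calls before the next
        # hiber() collapse into a single pending wake.
        with self._cond:
            self._pending = True
            self._cond.notify_all()

    def hiber(self, timeout=None):
        # Like $HIBER: return at once if a wake is pending, else block.
        # Returns True if woken, False if the timeout expired first.
        with self._cond:
            woke = self._cond.wait_for(lambda: self._pending, timeout)
            self._pending = False
            return woke


def drain(flag, queue, lock):
    """Handle every queued item per wakeup, so coalesced wakes lose no work."""
    done = []
    while True:
        with lock:
            items, queue[:] = list(queue), []
        if items:
            done.extend(items)
            continue  # re-check the queue before sleeping again
        if not flag.hiber(timeout=0.05):
            return done  # timed out with nothing left to do
```

Three wake() calls followed by one hiber() leave nothing pending, so a
loop that handles exactly one item per wakeup will strand the other two.
Re-checking the work queue before every hiber(), as drain() does, is
what makes both spurious and coalesced wakes harmless.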


>Can't think of where other hidden $hiber's could be, unless they
>happen in QIO calls.
>
>The process does QIO calls to establish network and/or serial links.
>It most likely hung under conditions where it was supposed to be
>retrying to establish a link for weeks on end, because we only hook up
>the device it's trying to link to about once a month.

A $QIO would, more likely, cause you to be put into an event flag wait
state (LEF or CEF), not HIB.
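
As a quick aide-memoire for reading the state column, a small lookup
sketched in Python.  The table and the explain_state() helper are
illustrative only and cover just the three states mentioned here.

```python
# Scheduling wait states and the kind of call that usually leaves
# a process there. Deliberately limited to the states in this thread.
WAIT_STATES = {
    "HIB": "hibernating: $HIBER, directly or via a routine such as LIB$WAIT",
    "LEF": "local event flag wait: e.g. $QIOW or $WAITFR on a local flag",
    "CEF": "common event flag wait: e.g. $WAITFR on a common event flag",
}


def explain_state(state):
    """Return a one-line hint for a wait state name, case-insensitively."""
    return WAIT_STATES.get(state.strip().upper(), "not covered by this sketch")
```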

-- 
VAXman- A Bored Certified VMS Kernel Mode Hacker    VAXman(at)TMESIS(dot)ORG

Well I speak to machines with the voice of humanity.


