[Info-vax] hung program location

David Froble davef at tsoft-inc.com
Tue Feb 19 17:21:36 EST 2013


Tom Adams wrote:

> There is only one AST programmed in. It's a resource wait AST that
> only fires when the system is telling the process to shutdown, so it
> did not cause the problem.  None of the QIOs or QIOWs use ASTs.

If you're using QIO without a completion AST routine, then something
else (an event flag, or checking the IOSB) has to tell you the I/O has
finished. I'm no expert here, but I've had the impression that a
completion AST routine is the normal way to signal the completion of a
QIO.

As for QIOW, that's pretty easy: it doesn't return until the I/O
completes.

> My theory is that the process got stuck in a LIB$WAIT call, but I
> don't know why that would happen.  But it could be that the program
> gets into HIB somewhere else in the code processing, and there is the
> possibility that I am overlooking some bug that would screw up a
> LIB$WAIT.

A quick look at LIB$WAIT has me thinking it's basically setting a timer 
AST and then invoking $HIBER.  You may not have set up the AST, but the 
library routine could have done so.

One thing I noticed: the "seconds to wait" argument is a floating point
number, and the routine accepts maybe a half dozen different floating
types, F, D, G, etc. F seems to be the default. I can see some
circumstances where a D_float is passed by reference where an F_float
is expected and it usually works, even though the D_float is 8 bytes
and the F_float is 4 bytes, because the first 4 bytes of a D_float have
the same layout as an F_float. Then there will come a time when the
wrong type of argument causes a problem. Not saying this is your
problem, but I sure would check it.
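
To make the type mismatch concrete, here is a rough Python sketch that
packs a value in simplified VAX-style F, D and G layouts and then reads
only the first 4 bytes back as an F_float, the way a callee expecting
F_float would. The helper names are made up for illustration, the
layouts are written from memory of the VAX formats, fractions are
truncated rather than rounded, and reserved operands and exponent
overflow are ignored. It is not LIB$WAIT itself, just a model of what
"passed by reference with the wrong type" means.

```python
import math
import struct

def _words(value, nwords):
    """Split a bit pattern into 16-bit words, most significant word first,
    each stored little-endian, which is how VAX floats sit in memory."""
    words = [(value >> (16 * i)) & 0xFFFF for i in reversed(range(nwords))]
    return struct.pack("<%dH" % nwords, *words)

def encode_vax(x, exp_bits, total_bits):
    """Encode x in a simplified VAX-style format (fraction truncated)."""
    if x == 0.0:
        return bytes(total_bits // 8)
    m, e = math.frexp(abs(x))            # abs(x) == m * 2**e, 0.5 <= m < 1
    frac_bits = total_bits - 1 - exp_bits
    bias = 1 << (exp_bits - 1)           # 128 for F and D, 1024 for G
    frac = int((m * 2.0 - 1.0) * (1 << frac_bits))  # hidden bit dropped
    sign = 1 if x < 0 else 0
    logical = (sign << (total_bits - 1)) | ((e + bias) << frac_bits) | frac
    return _words(logical, total_bits // 16)

def encode_f(x):
    return encode_vax(x, 8, 32)          # 4 bytes, 8-bit exponent

def encode_d(x):
    return encode_vax(x, 8, 64)          # 8 bytes, 8-bit exponent

def encode_g(x):
    return encode_vax(x, 11, 64)         # 8 bytes, 11-bit exponent

def decode_f(buf):
    """Interpret the first 4 bytes of buf as an F_float, ignoring the rest."""
    w0, w1 = struct.unpack("<HH", buf[:4])
    logical = (w0 << 16) | w1
    sign = -1.0 if logical >> 31 else 1.0
    exp = (logical >> 23) & 0xFF
    frac = logical & 0x7FFFFF
    if exp == 0:
        return 0.0                       # reserved operands not modelled
    return sign * math.ldexp(0.5 + frac / float(1 << 24), exp - 128)

if __name__ == "__main__":
    seconds = 2.5
    print("F read as F:", decode_f(encode_f(seconds)))
    print("D read as F:", decode_f(encode_d(seconds)))  # same exponent layout
    print("G read as F:", decode_f(encode_g(seconds)))  # exponent misread
```

In this model the D case comes back as the intended value because only
the low-order fraction bits are lost, while the G case yields a
different wait time altogether. If the compiler's default double is not
D_float, that second case is the one to worry about.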

> One odd thing is that the same process hung on three different
> Alphas. But I don't know if they all got hung at the same time. The
> three processes would all be trying to get (or had or lost) a network
> connection to the same IP address of the same analyzer. The analyzer
> is turned off and on and moved around to different physical connection
> points. The code has been stable for a long time, but the practice
> of moving devices around like this is kind of a new practice.

I don't know exactly what you're doing, but if the program assumes it
will find the device when that is no longer reliably the case, perhaps
a check up front to determine whether the device is available would be
helpful.
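
As a sketch of what such an availability check could look like, here is
a small Python probe that tries to open a TCP connection with a timeout
and reports whether anything answered. The function name, the address
and the port are all hypothetical, and it assumes the analyzer accepts
plain TCP connections, which I can't tell from the description. The
idea carries over to whatever language the real program is written in.

```python
import socket

def device_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False

if __name__ == "__main__":
    # Hypothetical analyzer address and port, for illustration only.
    ANALYZER = ("192.0.2.10", 5000)
    if device_reachable(*ANALYZER):
        print("analyzer answered; go ahead with the real session")
    else:
        print("analyzer not reachable; retry later instead of waiting")
```

The point of the timeout is that a device which has been switched off
or moved shows up as a quick "not reachable" result rather than a
process that sits waiting with no end in sight.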

I'm sure you don't want to re-write parts of your program, you just want 
it to stop getting wedged. But the latter may require some of the former.


