[Info-vax] hung program location
David Froble
davef at tsoft-inc.com
Tue Feb 19 17:21:36 EST 2013
Tom Adams wrote:
> There is only one AST programmed in. It's a resource wait AST that
> only fires when the system is telling the process to shut down, so it
> did not cause the problem. None of the QIOs or QIOWs use ASTs.
If you're using QIO without a completion AST routine, well, I know
nothing, but I've had the impression that a completion AST routine is
the normal way to signal that the QIO has completed.
As for QIOW, that's pretty easy, since it waits for the I/O to complete
before returning.
> My theory is that the process got stuck in a LIB$WAIT call, but I
> don't know why that would happen. But it could be that the program
> gets into HIB somewhere else in the code processing, and there is the
> possibility that I am overlooking some bug that would screw up a
> LIB$WAIT.
A quick look at LIB$WAIT has me thinking it's basically setting a timer
AST and then invoking $HIBER. You may not have set up the AST, but the
library routine could have done so.
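To picture that "set a timer, then hibernate" pattern, here is a rough Python analogy. It is not VMS code and not the actual LIB$WAIT source; the name wait_seconds is made up, the threading.Timer stands in for the timer AST, and Event.wait() stands in for $HIBER.

```python
import threading
import time

def wait_seconds(seconds):
    """Analogy only: arm a one-shot timer, then sleep until it wakes us."""
    woke = threading.Event()
    # Stand-in for the timer AST: when the timer expires it issues the wake.
    timer = threading.Timer(seconds, woke.set)
    start = time.monotonic()
    timer.start()
    # Stand-in for $HIBER: block until something issues the wake.
    woke.wait()
    return time.monotonic() - start

if __name__ == "__main__":
    print("slept for about", round(wait_seconds(0.05), 3), "seconds")
```

The point of the sketch is that the waiter depends entirely on the wake arriving. If the timer were never armed, or were armed with a nonsense interval, woke.wait() would simply sit there, which from the outside would look like a process stuck in HIB.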
One thing I noticed: the "seconds to wait" argument is a floating point
number, and there are maybe half a dozen different floating types, F, D,
G, etc. F seems to be the default. I can see some circumstances where a
D_float is passed, By Reference, and might usually work, since the
D_float is 8 bytes and the F_float is only 4 bytes. Then there will come
a time when the wrong type of argument might cause a problem. Not saying
this is your problem, but I sure would check it.
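As a rough illustration of what a width mismatch like that can do, here is a small Python sketch. It uses IEEE double and single formats purely as an analogy; the VAX F, D and G layouts are different, so the actual misread values on VMS would not be the same. The helper name read_as_single is made up for this example.

```python
import struct

def read_as_single(double_value):
    """Store an 8-byte IEEE double, then read only its first 4 bytes as a float.

    This mimics a callee that expects a 4-byte float by reference but is
    handed the address of an 8-byte value.
    """
    raw = struct.pack("<d", double_value)      # caller stores 8 bytes
    (misread,) = struct.unpack("<f", raw[:4])  # callee reads only 4 of them
    return misread

if __name__ == "__main__":
    for seconds in (5.0, 0.1):
        print(seconds, "is seen by the callee as", read_as_single(seconds))
```

In this IEEE analogy the callee never sees the intended value, and there is no error to tell you so. Whatever the formats involved, a routine handed the wrong float type ends up waiting for some interval other than the one the caller meant.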
> One odd thing is that the same process hung on three different
> Alphas. But I don't know if they all got hung at the same time. The
> three processes would all be trying to get (or had or lost) a network
> connection to the same IP address of the same analyzer. The analyzer
> is turned off and on and moved around to different physical connection
> points. The code has been stable for a long time, but the practice
> of moving devices around like this is kind of a new practice.
Don't know what you're doing, but if the program assumes it will find
the device, and the device is not always there, perhaps some
pre-processing to determine whether the device is available might be
helpful.
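A minimal sketch of that kind of pre-check, in Python rather than whatever your program is written in, and assuming the analyzer accepts a plain TCP connection. The function name device_reachable, the port, and the timeout are all placeholders, not anything from your setup.

```python
import socket

def device_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The caller would try something like device_reachable("192.0.2.10", 5000) first (a documentation address and an arbitrary port), and only start the real conversation if it returns True. The important part is the timeout: the check gives up after a bounded time instead of waiting forever on a device that has been switched off or moved.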
I'm sure you don't want to re-write parts of your program, you just want
it to stop getting wedged. But the latter may require some of the former.