[Info-vax] hung program location

Tom Adams w.tom.adams at gmail.com
Wed Feb 20 10:46:45 EST 2013


On Feb 19, 5:21 pm, David Froble <da... at tsoft-inc.com> wrote:
> Tom Adams wrote:
> > There is only one AST programmed in. It's a resource wait AST that
> > only fires when the system is telling the process to shut down, so it
> > did not cause the problem.  None of the QIOs or QIOWs use ASTs.
>
> If you're using QIO, and don't have a completion AST routine, well, I
> know nothing, but I've had the impression that a completion AST routine
> is rather normal to signal the completion of the QIO.
>
> As for QIOW, that's pretty easy.
>
> > My theory is that the process got stuck in a LIB$WAIT call, but I
> > don't know why that would happen.  But it could be that the program
> > gets into HIB somewhere else in the code processing, and there is the
> > possibility that I am overlooking some bug that would screw up a
> > LIB$WAIT.
>
> A quick look at LIB$WAIT has me thinking it's basically setting a timer
> AST and then invoking $HIBER.  You may not have set up the AST, but the
> library routine could have done so.
>
> One thing I noticed, the "seconds to wait" is a floating point number,
> and maybe a half dozen different floating types, F, D, G, etc.  F seems
> to be the default.  I can see some circumstances where a D_float is
> passed, By Reference, and might usually work, since the D_Float is 8
> bytes and the F_float is 4 bytes.  Then there will come a time when the
> wrong type of argument might cause a problem.  Not saying this is your
> problem, but I sure would check it.
>
> > One odd thing is that the same process hung on three different
> > Alphas.   But I don't know if they all got hung at the same time.  The
> > three processes would all be trying to get (or had or lost) a network
> > connection to the same IP address of the same analyzer.  The analyzer
> > is turned off and on and moved around to different physical connection
> > points.   The code has been stable for a long time, but the practice
> > of moving devices around like this is kind of a new practice.
>
> Don't know what you're doing, but, if the program is assuming it will
> find a device, without that normally being the case, perhaps some
> pre-processing to determine whether the device is available might be
> helpful.
>
> I'm sure you don't want to re-write parts of your program, you just want
> it to stop getting wedged. But the latter may require some of the former.

I think the only way to determine if the device is available on the
network is by trying to set up communications with it.
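
A minimal sketch of what such a check could look like, written in Python
rather than the actual VMS code, and assuming the analyzer accepts plain
TCP connections. The function name, host, port and timeout are all
hypothetical. The point is only that the connection attempt is bounded
by a timeout, so a device that has been switched off or moved cannot
leave the caller waiting forever.

```python
import socket


def device_reachable(host, port, timeout_s=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout_s.

    Hypothetical helper: the real analyzer protocol, address and port are
    not given in this thread.
    """
    try:
        # create_connection applies the timeout to the connect attempt itself.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, name lookup
        # failures and timeouts.
        return False
```

A caller would run something like device_reachable("192.0.2.10", 4001)
before starting a session (both values are placeholders) and retry or log
the failure instead of blocking.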


