[Info-vax] Home-grown application process dumps

Wed Jan 7 15:22:20 EST 2015

On 2015-01-07 19:47:06 +0000, RGB said:

> The problem here is that we are not getting anything "normal"...

Normal?   You're not getting what you're expecting and not getting what 
you desire, certainly.   If I were unfamiliar with VMS heap corruptions, 
with the toxic effects of VMS buffer overruns, and with the assorted 
misbehaviors of asynchronous errors, then what happens in these cases 
would be very puzzling.   Yes, the BASIC environment might not help 
with these cases, as this initially appears to be some sort of 
corruption in the heap or the stack, or maybe a synchronization bug.  
In all of these cases, the application crash can arise long after the 
error that triggers it.  In some cases I've worked, the crash arrived 
days after the trigger.

As for the code that's blowing up, if I pass a bad argument into a 
system service call, I'll see the error arise in that code.  But the 
bug is in my own code.  Heap and stack corruptions and synchronization 
bugs can show up all over the place.  A heap overwrite may not become 
visible until some other and entirely unrelated operation encounters 
the corruption, for instance.
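
To illustrate how an overrun can stay hidden until something checks for 
it, here is a minimal C sketch of guard bytes placed after a buffer.  
Everything in it (guarded_buf, guarded_init, guarded_ok, buggy_copy, the 
sizes and the 0xA5 fill value) is hypothetical and not from the 
application under discussion, and the overrun is kept inside one array 
so the demonstration itself stays well-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD_SIZE 16     /* bytes the caller is supposed to use      */
#define GUARD_SIZE   4      /* sentinel bytes placed after the payload  */
#define GUARD_BYTE   0xA5   /* arbitrary fill value (an assumption)     */

/* Payload and guard live in one array, so the deliberate overrun below
   never leaves storage this sketch owns. */
typedef struct {
    unsigned char bytes[PAYLOAD_SIZE + GUARD_SIZE];
} guarded_buf;

/* Zero the payload and write the guard pattern after it. */
void guarded_init(guarded_buf *g)
{
    memset(g->bytes, 0, PAYLOAD_SIZE);
    memset(g->bytes + PAYLOAD_SIZE, GUARD_BYTE, GUARD_SIZE);
}

/* Returns 1 if the guard is intact, 0 if something wrote past the payload. */
int guarded_ok(const guarded_buf *g)
{
    size_t i;
    for (i = 0; i < GUARD_SIZE; i++) {
        if (g->bytes[PAYLOAD_SIZE + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Stand-in for the bug: trusts the caller's length rather than
   PAYLOAD_SIZE.  The assert only keeps this demo inside the array. */
void buggy_copy(guarded_buf *g, const char *src, size_t len)
{
    assert(len <= sizeof g->bytes);
    memcpy(g->bytes, src, len);
}
```

Nothing fails at the moment buggy_copy() writes two bytes too many; the 
damage only becomes visible when guarded_ok() is next called.  Calling 
such a check at known points in the code narrows the window between the 
overwrite and its detection.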

As for working through this case, I'd do what I'd stated earlier: I'd 
review the code for latent coding errors, particularly in buffer 
handling and synchronization, and, particularly if it were my code and 
I was stuck, I'd get somebody else to have a look at the code for 
anything I've missed.  I'd particularly examine RTL calls and system 
services for buffer overruns and heap corruptions, and I'd look for 
AST-level problems, for synchronization errors, and for other forms of 
vermin.  I'd also instrument the code, and I'd generally recommend that 
integrated logging be implemented for any non-trivial application.  I'd 
instrument any complex application, even in the absence of errors.
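
As one possible shape for that integrated logging, here is a minimal C 
sketch of an in-memory ring of recent events that can be written out 
when something goes wrong.  The names (log_event, log_dump, log_ring) 
and the slot count and width are assumptions made for illustration, not 
anything from the application under discussion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOG_SLOTS 8    /* how many recent events are retained */
#define LOG_WIDTH 64   /* maximum characters per event        */

static char log_ring[LOG_SLOTS][LOG_WIDTH];
static unsigned long log_count = 0;   /* total events ever logged */

/* Record one event; the oldest entry is overwritten once the ring is
   full.  snprintf truncates rather than overrunning the slot. */
void log_event(const char *where, long value)
{
    char *slot;
    assert(where != NULL);
    slot = log_ring[log_count % LOG_SLOTS];
    snprintf(slot, LOG_WIDTH, "%lu %s value=%ld", log_count, where, value);
    log_count++;
}

/* Write the retained entries to out, oldest first.
   Returns the number of entries written. */
int log_dump(FILE *out)
{
    unsigned long first = log_count > LOG_SLOTS ? log_count - LOG_SLOTS : 0;
    unsigned long i;
    int written = 0;
    for (i = first; i < log_count; i++) {
        fprintf(out, "%s\n", log_ring[i % LOG_SLOTS]);
        written++;
    }
    return written;
}
```

Sprinkling log_event() calls around the suspect buffer handling and 
calling log_dump() from an error path gives a trail of values leading 
up to the failure, which a dump taken after the fact may not show.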

I'd also work toward a reproducer for the error, or at least very 
specific details and logged data from what's transpiring here.  You 
already know that what you're getting from the generic process dump 
mechanism is not sufficient to locate the trigger, after all.  
Instrumenting the code can get me closer to the error, and can help me 
gather values.  This assumes I don't yet know enough about the trigger 
to program the debugger to get to where I can view the error.

Yes, it's possible that there's a VMS error lurking here.  But when 
presented with these cases, I've learned to always suspect bugs in my 
own code first, as VMS has seen a whole lot more use than has my code.  
With these sorts of weird cases, handing a specific error or a 
reproducer to HP support also means they don't have to recreate your 
environment and walk through all your BASIC code to create one.  I've 
found a number of these bugs in my code by trying to create that 
reproducer, too.

If this has been an ongoing issue, then consider bringing in some help: 
somebody to look at the code and at the application environment.  Or if 
you're sure it's a VMS bug, contact HP support, of course.

-- 
Pure Personal Opinion | HoffmanLabs LLC