[Info-vax] Reverse telnet printers (TNAxxxx:) owned by non-existent process
Jan-Erik Söderholm
jan-erik.soderholm at telia.com
Wed Jun 2 04:15:14 EDT 2021
On 2021-06-02 at 05:48, Phil Howell wrote:
> On Wednesday, 2 June 2021 at 2:22:50 am UTC+10, sc... at snadow.com wrote:
>> Hello everyone, after the odd DCL "READ/TIMEOUT=n" issue that I posted a couple of weeks ago, I have another problem to ask about. Unfortunately, unlike the last issue, I don't have any hypothesis about what's causing the problem nor how to prevent it.
>>
>> So: I'm running OpenVMS Alpha V7.3-2; HP TCP/IP V5.4 with ECO 6. I have a few hundred "reverse Telnet" printers defined on the server. These have TNAxxxx: devices created, but they do not have VMS-level queues. They're used by code that simply opens the TNA device, writes to it, and - at some point - closes it. None of them are set "spooled." The actual printers are varied: Mostly HP LaserJets and OKI laser printers, plus an assortment of other brands and models. Most use port 9100. To make things interesting, some are serial printers connected to Lantronix terminal servers, and those use other port numbers (10001 and above.) The network infrastructure is Cisco routers and switches, shared with plenty of other servers running Windows or Linux or AIX. This has been in place for years, relatively trouble free.
>>
>> But intermittently, a printer will have its TNA device decide that it is owned by a non-existent process: SHOW DEVICE/FULL will show a non-zero PID, but a null process name. SHOW PROCESS/ID=xxxxxxxx on the PID will give a non-existent process message. If we wait, rarely the problem will resolve itself, but most of the time (perhaps 95+%) it does not. When it does not clear itself, we know that we'll eventually get a phone call from users that are complaining that their reports aren't printing. This problem seems to occur on average perhaps 3 or 4 times a day. It occurs on the busier printers more often than the infrequently used printers.
>>
>> In no particular order, we've come up with three ways that we can _usually_ "fix" this:
>> 1) Delete and re-create the TNA device (TELNET /DELETE_SESSION, TELNET /CREATE_SESSION)
>> 2) From a privileged account, copy any file to the TNA device (such as COPY NL: TNAxxxx:)
>> 3) Reboot (Obviously this is not at all desirable, but it's guaranteed to work!)
>>
>> As a workaround, I've set up a DCL batch job that checks all TNA printer devices every five minutes with F$GETDVI. If it finds one with a non-zero owner PID, and F$GETJPI on that PID returns a non-existent process status, I copy NL: to the TNA: device and check the owner PID again. So far this works reliably, with the re-checked PID coming up as zero.
>>
>> But that's a hack. I'd much rather prevent the problem from occurring in the first place. Any ideas on what causes this and/or how to stop it?
>>
>> Thanks,
>> Scott
> Didn't Jan-Erik have this problem a couple of years ago?
>
He he... :-)
I was just going to write a note about that yesterday, but...
Funny someone remembers that.
Well, my view on this is that a process did an I/O operation (usually
a write, but maybe it can be a read also) against a TNA device and
then for some reason the process died while the I/O was still waiting.
Then you will get a TNA device with an "owner" that doesn't exist.
This prevents the "telnet /delete" on that port. Our usual work-around
is to edit the process startup script and use another (free) TNA device.
I have also been looking at a script that uses a range of TNA devices
and just uses a new one each time a process is restarted. Not finished.
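
For reference, the periodic check Scott describes could look roughly like
the DCL below. This is only a sketch, not his actual job. It assumes that
F$DEVICE is available to enumerate terminal-class devices, that the printer
devices match "*TNA*", that the job runs from a privileged account, and that
%X000008E8 is the SS$_NONEXPR ("nonexistent process") status on your system.
The symbol names and the sentinel string are made up for illustration.

```dcl
$! Sketch: clear TNA devices whose owner PID no longer exists.
$ SET NOON                             ! keep going after lexical errors
$ nonexpr = %X000008E8                 ! assumed value of SS$_NONEXPR
$LOOP:
$ dev = F$DEVICE("*TNA*", "TERM", , 1) ! next terminal-class TNA device
$ IF dev .EQS. "" THEN GOTO DONE       ! end of device list
$ pid = F$GETDVI(dev, "PID")           ! owner PID as a hex string
$ IF pid .EQS. "" THEN GOTO LOOP       ! no owner
$ IF F$INTEGER("%X" + pid) .EQ. 0 THEN GOTO LOOP
$ name = "*unknown*"                   ! sentinel, kept if F$GETJPI fails
$ name = F$GETJPI(pid, "PRCNAM")
$ status = F$INTEGER($STATUS) .AND. %X0FFFFFFF
$ IF name .NES. "*unknown*" THEN GOTO LOOP   ! owner process exists
$ IF status .NE. nonexpr THEN GOTO LOOP      ! other failure, leave alone
$ WRITE SYS$OUTPUT "Stale owner ", pid, " on ", dev
$ COPY NL: 'dev'                       ! the workaround from the original post
$ WRITE SYS$OUTPUT "Owner PID now: ", F$GETDVI(dev, "PID")
$ GOTO LOOP
$DONE:
$ EXIT
```

Interactive Telnet sessions also show up as TNA devices, but their owner
processes exist, so the loop skips them. Only a failure that looks like
"nonexistent process" triggers the copy. Any other error from F$GETJPI
(for example, insufficient privilege) leaves the device untouched.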