[Info-vax] Reverse telnet printers (TNAxxxx:) owned by non-existent process
Simon Clubley
clubley at remove_me.eisner.decus.org-Earth.UFP
Tue Jun 1 13:23:58 EDT 2021
On 2021-06-01, Scott Snadow <scott at snadow.com> wrote:
> Hello everyone, after the odd DCL "READ/TIMEOUT=n" issue that I posted a couple of weeks ago, I have another problem to ask about. Unfortunately, unlike the last issue, I don't have any hypothesis about what's causing the problem nor how to prevent it.
>
> So: I'm running OpenVMS Alpha V7.3-2; HP TCP/IP V5.4 with ECO 6. I have a few hundred "reverse Telnet" printers defined on the server. These have TNAxxxx: devices created, but they do not have VMS-level queues. They're used by code that simply opens the TNA device, writes to it, and - at some point - closes it. None of them are set "spooled." The actual printers are varied: Mostly HP LaserJets and OKI laser printers, plus an assortment of other brands and models. Most use port 9100. To make things interesting, some are serial printers connected to Lantronix terminal servers, and those use other port numbers (10001 and above.) The network infrastructure is Cisco routers and switches, shared with plenty of other servers running Windows or Linux or AIX. This has been in place for years, relatively trouble free.
>
> But intermittently, a printer will have its TNA device decide that it is owned by a non-existent process: SHOW DEVICE/FULL will show a non-zero PID, but a null process name. SHOW PROCESS/ID=xxxxxxxx on the PID will give a non-existent process message. If we wait, rarely the problem will resolve itself, but most of the time (perhaps 95+%) it does not. When it does not clear itself, we know that we'll eventually get a phone call from users that are complaining that their reports aren't printing. This problem seems to occur on average perhaps 3 or 4 times a day. It occurs on the busier printers more often than the infrequently used printers.
>
> In no particular order, we've come up with three ways that we can _usually_ "fix" this:
> 1) Delete and re-create the TNA device (TELNET /DELETE_SESSION, TELNET /CREATE_SESSION)
> 2) From a privileged account, copy any file to the TNA device (such as COPY NL: TNAxxxx:)
> 3) Reboot (Obviously this is not at all desirable, but it's guaranteed to work!)
>
Does power cycling the printer clear the problem ?
> As a workaround, I've set up a DCL batch job that checks all TNA printer devices every five minutes with F$GETDVI. If it finds one with a non-zero owner PID, and F$GETJPI on that PID returns a non-existent process status, it copies NL: to the TNA device and checks the owner PID again. So far this works reliably, with the re-checked PID coming up as zero.
>
> But that's a hack. I'd much rather prevent the problem from occurring in the first place. Any ideas on what causes this and/or how to stop it?
>
Does the underlying socket still exist on the VMS system and if so,
what state is the socket in ?
My guess would be that the socket close sequence has gone wrong and
that the underlying socket is stuck in some closing state.
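If it helps, something like the following should show the socket state.
This is from memory rather than from a live system, so treat the exact
syntax as an assumption and check it against the documentation for your
TCP/IP Services version:
    $ TCPIP SHOW DEVICE_SOCKET            ! list the BG devices and their ports
    $ @SYS$MANAGER:TCPIP$DEFINE_COMMANDS  ! define netstat etc. as foreign commands
    $ netstat -an                         ! look for the printer's address and port
A connection to the printer still sitting in FIN_WAIT_2, CLOSE_WAIT or
LAST_ACK long after the job finished would fit that guess.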
Do you have TCP-level keepalives enabled on the system in question ?
If not, have you tried enabling them ?
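As a rough sketch, and assuming the sysconfig utility and the inet
attribute names below are present on your version (I have not checked
them against V5.4 ECO 6), the keepalive settings can be queried and
changed along these lines:
    $ @SYS$MANAGER:TCPIP$DEFINE_COMMANDS          ! define sysconfig as a foreign command
    $ sysconfig -q inet tcp_keepalive_default     ! 0 = only sockets that ask for it
    $ sysconfig -q inet tcp_keepidle              ! idle time before the first probe
    $ sysconfig -r inet tcp_keepalive_default=1   ! runtime change, not kept across a reboot
I would expect a runtime change to affect only connections opened after
it is made, so test it on one of the busier printers first.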
Is this only on serial printers or only on network printers or is it
a mixture of the two ?
It's been a while since I used reverse Telnet printers, but is there
some timeout setting you can apply to the TNA device itself when you
create the device ?
Simon.
--
Simon Clubley, clubley at remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.