[Info-vax] Loosing all LAT connections on one machine in DECNet Network

Kari Uusimäki uusimaki at exdecWITHOUTTHISfinland.org
Wed Apr 15 17:16:35 EDT 2009


JCamCMKRNL wrote:
> I have a Dedicated network (10-Base5 ethernet) which only runs DECNet
> traffic on it. This network has 5 Machines on it (four PDP-11/RSX11M+
> and one VAX/VMS) and several DECServer 200MC terminal servers.
> 
> Recently on a few occasions (three times in the last three weeks) all
> the LAT connections to the one VMS Machine are dropped. The users who
> were logged in are disconnected, and their interactive process
> terminates, returning them to the DECServer "Local>" prompt. When they
> enter in the DECServer command "SHOW SERVICE" we see all the other
> machines, but not the VMS Machine.
> 
> The VMS machine is still running fine, and there are no additional
> device errors. We can still log into any one of the PDP-11 systems
> through LAT on the DEC Servers, and then connect to the VMS system via
> the SET HOST command, but still cannot connect to the VMS system via
> LAT.
> 
> There were no Network messages logged to OPCOM, and the LATACP process
> is still running. Later, time passes (sometimes several minutes, and
> other times several hours) and the service returns with no apparent
> reason or operator interaction. On one of these occasions we performed
> a system reboot of the VMS system, but this did not restore the LAT
> service for the system immediately. Instead, the LAT service came back
> about 15 minutes after the reboot.
> 
> I tried using the LATCP program to see if I could do anything when it
> was in this state, and it seemed to be working fine, but I still could
> not connect from any terminal server.
> 
> Our basic topology is: All the 5 systems and one DECServer are in data
> center connected to a DELNI hub, which in-turn is connected to the
> thickwire 10-Base5 backbone. This Backbone runs throughout the plant
> to several IDF communications closets. Some of these closets have one
> DECServer connected to the Backbone directly via a transceiver and AUI
> cable, and others have Multiple DECServers connected to a DELNI which
> in-turn is connected via an AUI Cable to the Backbone transceiver.
> 
> When we lose all the connections to the one VAX system the LAT
> service disappears on all DECServers throughout the plant, including
> the one in the data center which is connected to the one DELNI with
> the 4 PDP-11s and one VAX system.
> 
> I have DECNet network logging turned "ON" on the VAX, and see no
> network OPCOM messages. I see no feature in LATCP to enable logging.
> 
> Based on this information I am reasonably sure that I am not dealing
> with a hardware issue, because I can still connect to the PDP-11
> systems via LAT and then connect to the VAX via SET HOST.
> 
> Is there a log file for the LATACP process?
> Does anybody have any ideas as to how I might Diagnose this issue?
> 
> Any ideas would be deeply appreciated.
> Jeff Cameron


When the VMS machine's LAT service had disappeared from the DECservers, 
did you still try to connect to the service somehow (e.g. 
SET H /LAT <VAX>)? If you did, was it successful?
The LAT protocol can try to locate a service on the LAN even if it has 
not been announced yet, as long as a node is offering it.

When the LAT service was unavailable, was it possible to do a SET H 
<PDP11> from the VMS machine?

Because the line counters on the VMS machine show transmit errors, but 
the other machines on the same DELNI work fine at the same time, I 
suggest you try the following actions:
1) replace the transceiver cable between the VAX and the DELNI
2) connect the transceiver cable from the VAX to another port in the DELNI
3) replace the NIC on the VAX.

IIRC there have been rare occasions when a NIC intermittently stops 
transmitting but still receives packets.

Anyway, you might get a better view of what's happening on the LAN if 
you can connect a sniffer to your LAN somewhere. You might find a DE450 
adapter (a PCI card that has all three Ethernet interface types), put 
it into a PC, and run sniffer software on it.
You should then see all the LAT traffic going to and from the VAX. 
Usually you can filter out all the other traffic to concentrate on LAT 
only. LAT is a very informative protocol in LAN troubleshooting, because 
it is very time critical and sensitive to even slight LAN problems. 
Another protocol which is also sensitive to LAN errors is MOP. If you 
have LAN trouble, usually both LAT and MOP show errors.
TCP/IP is very bad at revealing LAN problems, because it usually only 
becomes slower rather than dropping connections. Many TCP/IP users may 
have lots of LAN errors, but they only suffer from slow response, so 
the real errors never get fixed. Another drawback is that the protocol 
has no management component, as DECnet does.
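
The LAT-only filtering mentioned above comes down to matching the 
EtherType field of each frame. A minimal sketch in Python, assuming the 
sniffer can hand you raw Ethernet II frames as bytes (the function names 
are illustrative and do not belong to any particular sniffer product; 
802.3/SNAP-encapsulated frames are not handled):

```python
import struct

# EtherType values assigned to the DEC protocols discussed here.
DEC_ETHERTYPES = {
    0x6001: "MOP dump/load",
    0x6002: "MOP remote console",
    0x6003: "DECnet Phase IV",
    0x6004: "LAT",
}


def classify_frame(frame: bytes) -> str:
    """Return a protocol label for one raw Ethernet II frame."""
    # 6 bytes destination + 6 bytes source + 2 bytes EtherType
    if len(frame) < 14:
        return "truncated"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return DEC_ETHERTYPES.get(ethertype, "other")


def is_lat(frame: bytes) -> bool:
    """True if the frame carries LAT, so everything else can be dropped."""
    return classify_frame(frame) == "LAT"
```

Most sniffers let you enter the same EtherType (0x6004 for LAT) 
directly as a capture or display filter.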

The most interesting part is the moment the LAT connections break and 
what happens at that time. Does the outbound LAT traffic from the VAX 
stop suddenly, so that the connections are dropped, or are the 
connections closed in an orderly way by both ends of the connection?
In the first case the hardware is the more likely suspect, and in the 
latter case the LAT software or the microcode in the NIC.
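
To tell those two cases apart in a capture, one rough approach is to 
look for an interval in which frames keep arriving for the VAX while 
nothing leaves it. The sketch below assumes you have exported the 
capture as (time, source, destination) records; the Frame type, the 
function name, and the 5-second threshold are illustrative assumptions, 
not part of any tool:

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Frame(NamedTuple):
    time: float  # seconds since start of capture
    src: str     # source MAC address
    dst: str     # destination MAC address


def find_tx_stall(frames: Iterable[Frame], host: str,
                  min_gap: float = 5.0) -> Optional[Tuple[float, float]]:
    """Return (start, end) of the first interval of at least min_gap
    seconds in which `host` was sent frames but transmitted none,
    or None if no such interval exists."""
    last_tx = None    # time of the host's most recent transmission
    last_rx = None    # time of the most recent frame addressed to it
    rx_seen = False   # has anything arrived since last_tx?
    for f in sorted(frames, key=lambda fr: fr.time):
        if f.src == host:
            if last_tx is not None and rx_seen and f.time - last_tx >= min_gap:
                return (last_tx, f.time)
            last_tx, rx_seen = f.time, False
        elif f.dst == host:
            rx_seen, last_rx = True, f.time
    # The capture may end while the host is still silent.
    if last_tx is not None and rx_seen and last_rx - last_tx >= min_gap:
        return (last_tx, last_rx)
    return None
```

A reported interval that lines up with the dropped sessions would point 
toward the transmit side of the VAX; no such interval suggests the 
sessions were closed in an orderly way. Only unicast frames addressed to 
the host are counted as inbound traffic here.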




Regards,

Kari


