[Info-vax] Sudden problems with slow sftp transfers and slow disk accesses

gwilliams at cfa.harvard.edu gwilliams at cfa.harvard.edu
Tue Feb 18 08:25:38 EST 2014


On Tuesday, February 18, 2014 8:16:17 AM UTC-5, Jan-Erik Soderholm wrote:
> gwi... at cfa.harvard.edu wrote 2014-02-18 13:53:
> 
> > On Tuesday, February 18, 2014 6:23:22 AM UTC-5, Jim wrote:
> >> On Monday, February 17, 2014 11:42:27 PM UTC-5, gwil... at cfa.harvard.edu wrote:
> >>> Cluster of 8 Alpha boxes running V8.3 + patches, recently moved
> >>> from one building to another.  Since the move, we've been experiencing
> >>> odd behaviors: very slow network access (via sftp) and slow disk IO.
> >>>
> >>> Disk storage is mostly on three external disk boxes (five three-member
> >>> shadow sets).  No errors are reported via SHOW DEV DSA.
> >>>
> >>> Network cards on all machines are set to 100 Mb/s, full duplex,
> >>> non-autonegotiate, connected via an 8-port Gb switch.
> >>>
> >>> An sftp from one of our machines to a local Linux system transferred 288 KB
> >>> of a ~6 MB file in the first second; the current rate is now down to 5 KB/s.
> >>> MONITOR PROCESS/TOPCPU doesn't show the process getting even 1% of the
> >>> CPU and there is nothing else running on the system.  Another attempt on
> >>> the same file transferred 1.5 MB in the first second, then dropped to
> >>> < 50 KB/s.
> >>>
> >>> SHOW MEMORY doesn't show any problems.
> >>>
> >>> A filing operation merging two large files took a matter of seconds
> >>> when both files were on a locally-attached disk.  When both files were
> >>> on a shadow set, the merge took 6+ minutes.  MONITOR LOCK while
> >>> running the latter test showed ENQ/DEQ rates < 1.
> >>>
> >>> Image activation is slow.  It can take several seconds to begin
> >>> running an .exe stored on the shadow set.
> >>>
> >>> MCR SCACP SHOW LAN_DEV/ALL showed numerous errors occurring over the
> >>> past 24 hours, so this evening we replaced the switch connecting these
> >>> 8 machines.  Errors are continuing to appear.
> >>>
> >>> What am I missing?
> >>>
> >>> Gareth
> >>
> >> The output of the following command might be interesting
> >>
> >> $ MCR LANCP SHOW DEVICE/INTERNAL Exxx ! where Exxx is the suspect NIC
> >
> > Contrary to what DO MCR LANCP SHOW DEV EW/CHAR shows in SYSMAN
> > (all interface cards set to Full duplex enable YES, Full duplex
> > operational YES, 100 Mb/s), most of the interfaces carrying IP
> > traffic display "possible duplex mismatch" when running the SHOW
> > DEV/INT command.  The driver messages all show "Link State: UP"
> > and "Full Duplex 100base TX connection selected". I have reconfigured
> > all interfaces, but I'm still seeing sftp issues (the 6 MB file
> > mentioned earlier transferred 2.8 MB in the first second, then
> > trailed off).  The next driver message isn't due for another 30 minutes
> > or so.
> >
> > Gareth
>
> Do you not read all replies?
>
> This is probably *NOT* an issue with your Alpha servers or
> with OpenVMS. CHECK YOUR SWITCH'S SETTINGS!
>
> "possible duplex mismatch" is not that hard to understand.
>
> Your problems with (s)ftp are exactly what is expected when
> the switch runs half-duplex. Terminal sessions run OK, ftp
> of small files works, but larger file transfers "hang".
>
> Jan-Erik.

The switch is an auto-sensing Netgear 16-port FS116.  I don't see how
I can check what the switch is set to.



