[Info-vax] BL860i2 blade unexplained system disk slowdowns

Simon Clubley clubley at remove_me.eisner.decus.org-Earth.UFP
Tue Jun 7 13:31:19 EDT 2022


On 2022-06-07, Kenneth Randell <kennethnospam.randell at gmail.com> wrote:
> I have an issue with a system where I/O to the system drive slows down after some period of time, to the point of making the system unusable.
>
> The hardware configuration is a BL860i2 blade with VMS 8.4, 16GB memory, dual-processor quad-core CPUs, and 2x1.2GB disks in a RAID 1 configuration. The P410i controller on the motherboard is running 6.64 firmware and has no cache module.
>
> This system went through a power outage after a normal shutdown and power off.  After it came back up, slowdowns on the system disk (the RAID 1 pair in the blade) began once the system had been up anywhere from 6 to 14 hours.  XFC disk response times just start to increase and never decrease.  The system starts off at around 500 microseconds, then at some point (for unknown reasons) the response time starts increasing.  Some timestamps are below (taken 11 hours after the last shutdown and e-fuse).
>
>
> 7-JUN-2022 00:28:08.54    Ave Disk I/O Resp Time incl cache hits (microseconds)        385
> 7-JUN-2022 00:29:12.30    Ave Disk I/O Resp Time incl cache hits (microseconds)        471
> 7-JUN-2022 00:30:24.98    Ave Disk I/O Resp Time incl cache hits (microseconds)        559
> 7-JUN-2022 00:31:29.29    Ave Disk I/O Resp Time incl cache hits (microseconds)        669
> 7-JUN-2022 00:32:33.49    Ave Disk I/O Resp Time incl cache hits (microseconds)        811
> 7-JUN-2022 00:33:33.57    Ave Disk I/O Resp Time incl cache hits (microseconds)        912
> 7-JUN-2022 00:34:37.84    Ave Disk I/O Resp Time incl cache hits (microseconds)       1013
>
> This system has been 'normal' for the last 6 years; I have no idea why this has started now.  The disk isn't fragmented to any degree; all 'working' storage is on other drives.  There are no hardware faults: MSA$UTIL doesn't show anything out of the ordinary, etc.  I have forced a system crash in the hope that SDA would reveal something, but that hasn't turned up anything.  I have to assume something is going on in the hardware, as I am at a loss to explain this from the O/S side.
>
> Other blades with identical hardware/software/firmware and similar workloads have never shown this problem after a similar shutdown/power-outage/reboot scenario.
>
> Any ideas?

Do the figures go back to normal _immediately_ after a reboot without
any power cycling, or do you need to power off the machine and leave
it off for a while before they go back to normal?

I am wondering if there's some subtle thermal-related issue due to
damage caused when the power outage occurred.
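
To compare runs (for example, a plain reboot against a longer power-off),
it might help to put a number on how fast the response time climbs. Below
is a minimal Python sketch that does this for the samples quoted above.
The helper names (parse_sample, slope_per_minute) are hypothetical, the
line format is assumed to be exactly as quoted, and a least-squares
straight-line fit is my own choice here, not something from the original
report.

```python
from datetime import datetime

# Samples copied from the quoted message (date, time, label, microseconds).
SAMPLES = """\
7-JUN-2022 00:28:08.54    Ave Disk I/O Resp Time incl cache hits (microseconds)        385
7-JUN-2022 00:29:12.30    Ave Disk I/O Resp Time incl cache hits (microseconds)        471
7-JUN-2022 00:30:24.98    Ave Disk I/O Resp Time incl cache hits (microseconds)        559
7-JUN-2022 00:31:29.29    Ave Disk I/O Resp Time incl cache hits (microseconds)        669
7-JUN-2022 00:32:33.49    Ave Disk I/O Resp Time incl cache hits (microseconds)        811
7-JUN-2022 00:33:33.57    Ave Disk I/O Resp Time incl cache hits (microseconds)        912
7-JUN-2022 00:34:37.84    Ave Disk I/O Resp Time incl cache hits (microseconds)       1013
"""

# Parsed by hand rather than with strptime("%b") to avoid locale dependence.
MONTHS = {m: i + 1 for i, m in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split())}


def parse_sample(line):
    """Return (timestamp, microseconds) for one sample line."""
    parts = line.lstrip("> ").split()
    day, mon, year = parts[0].split("-")
    hh, mm, ss = parts[1].split(":")
    whole, _, frac = ss.partition(".")
    micros = int(frac.ljust(6, "0")) if frac else 0
    stamp = datetime(int(year), MONTHS[mon.upper()], int(day),
                     int(hh), int(mm), int(whole), micros)
    return stamp, int(parts[-1])


def slope_per_minute(points):
    """Least-squares slope of response time, in microseconds per minute."""
    t0 = points[0][0]
    xs = [(t - t0).total_seconds() / 60.0 for t, _ in points]
    ys = [float(v) for _, v in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


points = [parse_sample(l) for l in SAMPLES.splitlines() if l.strip()]
rate = slope_per_minute(points)
print(f"{len(points)} samples, about {rate:.0f} microseconds per minute")
```

For the seven samples quoted, the fit comes out at roughly 100
microseconds per minute and looks close to linear over that window.
Seven points over six minutes is not much to go on, so treat it as a
rough figure only.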

Simon.

-- 
Simon Clubley, clubley at remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.


