[Info-vax] Eisner? Down? (10 days later)

Richard B. Gilbert rgilbert88 at comcast.net
Sun Jan 4 23:38:45 EST 2009


G Cornelius wrote:
> DeCoy wrote:
>> Thanks, George.  The problem appears to be storage-related, perhaps with the 
>> RAID array on the Mylex controller, and perhaps with either controller 
>> hardware or controller configuration.
>>
>> Expertise in diagnosing (and perhaps fixing) Mylex controller symptoms would 
>> be initially useful.
> 
> I won't be of much help - my experience is with the HSJ/HSZ/HSG controller
> series.
> 
> I did leave voice mail for Steve offering my services, but you folks
> will probably do better diagnosing it remotely than me trying to get
> involved.  Let me know, though, if I can do something, even if it's
> just getting him some spare parts.
> 
> Coincidentally, the reason I am not using the DS20 that's in my garage
> is that the Mylex (KZPBC?) controller failed when I was trying to configure
> it and I have not yet sprung for a replacement or stuffed in a non-raid
> SCSI card.
> 
> I know of others around here who have used the Mylex controller and
> have encountered some of its quirks.  I seem to remember helping someone
> on the research side of things restore a backup of what was at the time
> a large (30GB) raid volume that was lost due to Mylex controller issues,
> or perhaps due to not noticing that a raid disk had failed until a
> second failure made recovery impossible.
> 

It seems to me that it's a SYS$MANGLER's JOB to notice things like 
failing disks.  I had a batch job called "MORNING_CHECK" that ran every 
day at 07:30.  It compared the output of "SHOW ERROR" with the previous 
day's output.  It checked log files for errors ("-E-" and "-F-"), etc.  
If it found something that looked like a problem, I was notified by 
a text message to my pager.  This gave me time to work on the problem 
before it turned into a crisis!
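
A check along these lines can be sketched roughly as follows.  This is 
not the original MORNING_CHECK (which would have been a DCL procedure); 
it is a minimal Python illustration of the two comparisons described 
above.  The "device name, then error count" layout assumed for a saved 
SHOW ERROR listing, the sample device names, and all function names are 
assumptions made for the example.  The pager notification is left out.

```python
import re

# Assumed layout of a saved SHOW ERROR listing: one device per line,
# a name followed by an integer error count, e.g. "DKA100:    3".
# Header lines such as "Device   Error Count" do not match and are skipped.
DEVICE_LINE = re.compile(r"^\s*(\S+)\s+(\d+)\s*$")

# Severity codes of interest: "-E-" (error) and "-F-" (fatal),
# as in "%BACKUP-E-OPENIN, error opening ...".
SEVERITY = re.compile(r"-[EF]-")


def parse_error_counts(text):
    """Return {device: error_count} from a saved SHOW ERROR listing."""
    counts = {}
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def new_errors(yesterday, today):
    """Return {device: increase} for devices whose count grew overnight."""
    old = parse_error_counts(yesterday)
    new = parse_error_counts(today)
    return {
        dev: count - old.get(dev, 0)
        for dev, count in new.items()
        if count > old.get(dev, 0)
    }


def suspicious_log_lines(log_text):
    """Return the log lines that contain an -E- or -F- severity code."""
    return [line for line in log_text.splitlines() if SEVERITY.search(line)]
```

Anything returned by new_errors() or suspicious_log_lines() would be 
what triggers the notification; a device that appears only in today's 
listing is treated as having started from zero.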

A failed disk was not allowed to become a problem!  I would swap it out 
with a spare and call DEC/Compaq/HP to pick up the dear departed and 
bring me a replacement drive.

In fact, thanks to MORNING_CHECK, I usually caught disks that were 
starting to fail before they failed outright.  One error was allowed, 
but when a disk started logging multiple errors, I swapped it out with 
a spare and called for a replacement.  The same guy who fetched 
replacements for field service would fetch me a new one, and I gave him 
the dear departed!
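
The "one error is allowed" rule amounts to a simple threshold on each 
disk's error count.  A minimal Python sketch, assuming the counts have 
already been collected into a dictionary; the function name, the 
`allowed` parameter, and the device names are hypothetical:

```python
def disks_to_swap(error_counts, allowed=1):
    """Return, sorted, the devices whose error count exceeds `allowed`.

    error_counts maps a device name to its logged error count.
    With the default of 1, a single error is tolerated and a disk
    is flagged only once it has logged multiple errors.
    """
    return sorted(dev for dev, count in error_counts.items() if count > allowed)


# Hypothetical counts: only the disk with multiple errors is flagged.
flagged = disks_to_swap({"DKA100:": 1, "DKA200:": 4, "DKA300:": 0})
```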


