[Info-vax] Eisner? Down? (10 days later)

johnhreinhardt at yahoo.com johnhreinhardt at yahoo.com
Sun Jan 4 22:40:03 EST 2009


On Jan 4, 5:13 pm, "DeCoy" <dalee... at obfuscation.att.net> wrote:

> I'll briefly quote Stephen here, in case somebody can spot something:
>
> <quote>
> In some of the tries of the last experiment, the system console came alive
> and attempted to boot VMS.  You can see the last of the output in the
> screen shot at http://www.Arnold.com/Photo_123108_002.jpg.  Note the
> message "SCSI drive at channel 2, target 1 dead".  The messages about CPU
> 01 starting are not relevant.  Nothing else happens and there is no disk
> activity after that message.
>
> From this point I was able to ^P and give console commands.  Since
> storage is the problem, I entered "show disk".  The output is in the
> screen shot at http://www.Arnold.com/Photo_123108_003.jpg.  DRA0-5 are
> drives defined on the seven-member RAID set in the built-in shelf, on
> channels 1-2 of the Mylex controller.  DRA6 is on the six-member RAID
> set in the external gray storage shelf.  (We also have the FTP
> repository on a large ATA drive connected via an ATA-SCSI adapter: DKA0.)
> <end of quote>
>

Just for clarification of the configuration - you've got a DS20 with a
Mylex (swxcr) SCSI RAID controller using 2 channels on a split SCSI
bus on the stack of 7 drive slots going up the right side.  These 7
drives are 9GB bricks in a RAID-5 set, which you've sliced into
6 logical drives, DRA0-5.  The drive in slot #1 on the second bus has
failed, so the swxcr has marked DRA0-5 as "DEGRADED".  The 3rd
channel of the controller is used for DRA6 on an external shelf.

I've seen the "%DRA, drives=0, optimal = 4294967290, degraded = 6,
failed = 0" message before.  I think it's a combination of a firmware
bug and a slightly confused controller.  I wish I could remember exactly
what I did to fix it, but it was a few years ago when I had the problem.
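
As a side note on that odd "optimal" number: 4294967290 is exactly
2^32 - 6, which is what you get if a count of -6 is stored in an
unsigned 32-bit field.  The sketch below only illustrates that
arithmetic.  The formula (optimal = drives - degraded - failed) and the
function name are my assumptions, not anything taken from the swxcr
firmware.

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits, like an unsigned 32-bit counter


def optimal_count(drives: int, degraded: int, failed: int) -> int:
    """Hypothetical formula: whatever is not degraded or failed is optimal.

    This is an assumed calculation for illustration only. The result is
    wrapped to 32 bits the way an unsigned counter would wrap.
    """
    return (drives - degraded - failed) & MASK32


# Values from the console message: drives=0, degraded=6, failed=0
print(optimal_count(0, 6, 0))  # 4294967290
```

If something like this is what happens, the message would be consistent
with the controller counting 6 degraded logical drives while reporting a
total of 0.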

I would assume that you've tried replacing the bad disk mentioned with
another?  The swxcr may not automatically do a rebuild.  You may have
to go into the swxcrmgr utility, tell the controller that the drive
has been replaced, mark it as good, and then tell it to rebuild the array.
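
For anyone wondering why a rebuild is possible at all with one dead
member: in generic RAID-5 the parity block of a stripe is the XOR of the
data blocks, so any single missing block can be recomputed from the
survivors.  This is a minimal sketch of that idea only.  The block sizes
and contents are made up, and it says nothing about how the swxcr
actually lays out data on disk.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 7-member set: six data blocks plus one parity block.
# The contents are arbitrary illustration data.
data = [bytes([i] * 4) for i in range(1, 7)]
parity = xor_blocks(data)

# Pretend the member holding data[1] has died; rebuild it from the rest.
survivors = data[:1] + data[2:] + [parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

A rebuild repeats this for every stripe and writes the result to the
replacement drive, which is why the set stays usable but unprotected
until it finishes.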

Did the external shelf become disconnected or lose power?  It doesn't
look like the swxcr sees it at the moment.  That might be enough to
hang it, though if that's the case it should show as failed.  I'm not
sure why there is no mention of it in the startup messages.  It may be
irrelevant once the DRA0-5 array is rebuilt.

I know you're just passing on info second hand, but a little more info
on what's been tried and failed and what the failure messages/results
were would help.



