[Info-vax] Unpleasant Disk Shadowing Surprise
Richard B. Gilbert
rgilbert88 at comcast.net
Wed Oct 12 14:30:23 EDT 2011
On 10/12/2011 12:33 PM, JF Mezei wrote:
> Bob Koehler wrote:
>
>> If your definition of real-time can't handle 3 minutes of
>> interruption, then you probably need to engineer a different solution
>> than the kind of shadowing approach you're using now.
>
> I seem to recall being told that VMS would seamlessly continue to run
> after the loss of a disk.
IF you have implemented some variety of RAID, it's easy, and you should
take care to check your error logs regularly. If you don't pay
attention, you won't realize that you are in trouble until a second disk
in the array fails! Then you are "up the proverbial estuary and lacking
the customary means of locomotion"!
I made a habit of "walking the machine room" each workday. Those little
yellow lights on the front of the drive carrier were hard to miss! The
problem was easily fixed: pop the failed drive out and plug in a
replacement. Call DEC/Compaq/HP Field Service, say "I have a dead
RZxx disk," and give them my contract number. A couple of hours later a
courier shows up with a replacement drive and hauls away the failed
drive. The replacement goes in my "stash" until needed.
>
> Perhaps it is expected that the mount verification (which Rob Brooks
> confirmed would kick in) would complete in a second or two.
>
> My guess is that one disk's failure ruined the SCSI bus and prevented
> VMS from talking to the other disk until the first one died and allowed
> SCSI to function.
>
> For proper fault tolerance, you would want to have 2 SCSI controllers.
I think you would also want software RAID!