[Info-vax] Backup's processing of directory files
Richard B. Gilbert
rgilbert88 at comcast.net
Mon Nov 1 20:17:17 EDT 2010
On 11/1/2010 6:54 PM, Alan Frisbie wrote:
> On 10/29/2010 10:39 AM, Alan Frisbie wrote:
>> My user disk (SCSI) has developed a bad block
>> (parity error when trying to read it). This block
>> is right in the middle of my 15,000 block MAIL.DIR
>> file. :-( Needless to say, this puts a crimp in
>> mail processing.
>
> As it turned out, the drive is toast, at least as far
> as I am concerned.
>
> In addition to the backups of the previous few days, I
> was able to do another one Friday morning. The only
> errors in the log were the same ones, all because of the
> one bad block in the directory file.
>
> Since the weather has cooled down a bit, I powered on
> the MSA1000 and restored the tape to a spare drive
> (RAID-1 set). It completed without errors. However,
> Backup restored the directory file with the same bad
> block in it. The only difference was that it was
> readable, but with garbage data.
>
> I then set the file /NoDirectory and deleted it.
> Analyze/Disk/Repair then recovered all the files into
> [SYSLOST]. I created a new directory file and renamed
> all the .MAI files into it. Everything was now back
> to normal.
>
> However, I wanted to see if I could do the same trick
> with the original drive, just for fun/education.
> Not wanting to take a chance of the bad block getting
> reused, I did not delete the bad directory file, but
> just renamed it. I then ran Analyze/Disk/Repair and
> went to bed. Saturday morning it was still running
> with continual errors about not being able to enter
> the lost files into [SYSLOST] because of a parity
> error in SYSLOST.DIR. Yup, the drive is toast.
>
> When a drive with an error gets another one during
> a repair attempt, it is time to give up. To quote
> Hoff's advice in http://labs.hoffmanlabs.com/node/838
> "Get your data off of the device right now" (as I did).
>
> Fortunately I had a spare drive, so I installed it
> Saturday morning. The only problem was convincing
> the cat to give up her favorite perch on top of the
> XP1000. :-)
>
> Thanks to everyone who provided valuable hints and
> suggestions.
>
> Lessons learned:
>
> 0. Actually *do* regular backups (OK here)
> 1. Test all error paths in my Backup scripts (Oops!)
> 2. Do "SHOW ERRORS" every so often, just in case (Oops!)
> 3. Keep spare hardware on hand, because failures happen. (OK here)
>
> Alan "The Other AEF" Frisbie
In a former life, I had a DCL procedure that I called
"MORNING_CHECK.COM". Its purpose in life was to look for things that I
would want to know about. It searched log files looking for "-W-",
"-E-", and "-F-". It compared today's error count with yesterday's.
If there was a difference, I was notified. ISTR that there was a bit
more than that in the script but I hope you get the general idea.
I don't know if I still have a copy of this script. Getting at it would
be somewhat awkward; my faithful workstation has died with a failed
power supply. I could put the disk in another machine but that's too
much like work!
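
For anyone wanting to try the same idea, here is a rough sketch of the two checks described above: flagging log lines that contain "-W-", "-E-", or "-F-", and reporting devices whose error count changed since the previous day. This is not the original MORNING_CHECK.COM, which was a DCL procedure that is not available; it is written in Python only for illustration. The function names, the sample log lines, and the idea of holding error counts as a device-to-count mapping are all assumptions.

```python
import re

# Matches the warning, error, and fatal severity markers described above,
# as they appear in messages such as "%FACILITY-E-IDENT, text".
SEVERITY = re.compile(r"-([WEF])-")


def find_flagged_lines(log_text):
    """Return (line_number, line) pairs for lines containing -W-, -E-, or -F-."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if SEVERITY.search(line)
    ]


def changed_error_counts(yesterday, today):
    """Return {device: (old, new)} for each device whose count differs.

    Both arguments are hypothetical dicts mapping a device name to its
    error count; a device missing from yesterday's data is treated as 0.
    """
    changed = {}
    for device, new in today.items():
        old = yesterday.get(device, 0)
        if new != old:
            changed[device] = (old, new)
    return changed


if __name__ == "__main__":
    # Illustrative sample text, not taken from a real log.
    sample_log = (
        "%BACKUP-I-STARTVERIFY, starting verification pass\n"
        "%BACKUP-E-READERR, error reading a file\n"
        "%SYSTEM-F-PARITY, parity error\n"
    )
    for number, line in find_flagged_lines(sample_log):
        print(f"line {number}: {line}")

    # Hypothetical device names and counts.
    print(changed_error_counts({"DKA100": 3}, {"DKA100": 5, "DKA200": 0}))
```

How the counts get collected (for example, by saving the output of SHOW ERRORS once a day) and how the notification is delivered are left out, since the post does not say how the original handled either.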