[Info-vax] Unpleasant Disk Shadowing Surprise
John Wallace
johnwallace4 at yahoo.co.uk
Wed Oct 12 14:52:51 EDT 2011
On Oct 12, 6:50 pm, Kenneth Fairfield <ken.fairfi... at gmail.com> wrote:
> On Wednesday, October 12, 2011 10:09:20 AM UTC-7, tadamsmar wrote:
> > On Oct 12, 12:02 pm, Kenneth Fairfield <ken.fa... at gmail.com>
> > wrote:
> > > On Wednesday, October 12, 2011 1:11:42 AM UTC-7, tadamsmar wrote:
>
> > > [...]
> > > > I looked at SHOW DEVICE DK. I now see that I did not read it
> > > > correctly earlier, and I have been confused about some things.
>
> > > > The real sequence of events is that DKA100 went offline and DKA0
> > > > logged an error at almost exactly the same time! Those are my shadow
> > > > set. Hopefully that was a soft error on DKA0 so that my data is not
> > > > compromised, I need to double check that.
>
> > > > I rebooted just before I left work and DKA100 did not even show up
> > > > after the reboot.
>
> > > > I am now thinking that I probably have a bad SCSI cable. I will
> > > > replace that tomorrow, I think I have one in storage.
>
> > > [...]
>
> > > I find your error analysis logic, hmmm, "unusual".
>
> > > The first event was that DKA100 went offline. The
> > > second was that it didn't show up on reboot. I'd
> > > call that a bad disk.
>
> > > Why (in the world) do you jump to the conclusion that
> > > it's the SCSI cable??? You do understand that the same
> > > cable is used to access DKA0 as DKA100, and if there
> > > were a problem with the cable, you probably wouldn't
> > > be able to boot at all (or at least, without errors)
> > > off DKA0?
>
> > > Go replace DKA100 and get on with your life. :-)
>
> > > -Ken
>
> > I don't see how a failure of DKA100 could have caused
> > an error to be logged on DKA0. Please explain your
> > reasoning.
>
> OK, this is getting ridiculous: "You show me yours, then
> I'll show you mine" ??? You didn't answer my question.
> Nor VAXman's earlier in this thread. Sigh...
>
> For one thing, you write a lot of prose but show almost
> no actual evidence, as in a cut-n-paste of SHOW ERROR,
> SHOW DEVICE DKA, or the output of ANAL/ERROR. It's all
> descriptive and imprecise.
>
> This is the first I recall in this lengthy thread
> (granted I could easily have missed it) that you've
> mentioned an error on DKA0. OTOH, you *have* rebooted
> the system since the disk failure so anything logged
> against DKA0 now is most likely unrelated to the
> initial failure. Is the bad disk (DKA100) still on
> the bus?
>
> I will note that with HSG-hosted disks ($1$DGA), ISTRC
> that every mount generates one error count in SHOW
> DEVICE against the disk being mounted. I don't recall
> that for directly attached SCSI disks, but I have far
> less access to VMS systems now than even a year ago. :-(
>
> The only way you'll have a clue whether the error logged
> against DKA0 is significant is to get the full output from
> (the current incarnation of) ANAL/ERROR for that disk.
>
> -Ken
Quite. Total lack of real hard unambiguous uninterpreted facts. How
difficult can a cut and paste from SHOW DEV or the error log be?
I'm a little bit surprised no one's asked whether the drives are
genuine qualified DEC drives, rather than some generic SCSI stuff. As
others used to point out, there's no Standard in SCSI.
Whilst unqualified unsupported drives may do the right thing most of
the time, if data availability and integrity are of interest, it's
often worth paying the extra for a drive that does what VMS expects
when the drive sees an error (internally or on the bus).