[Info-vax] Unpleasant Disk Shadowing Surprise
tadamsmar
tadamsmar at yahoo.com
Tue Oct 11 12:49:40 EDT 2011
On Oct 11, 12:13 pm, Kenneth Fairfield <ken.fairfi... at gmail.com>
wrote:
> You don't say what your storage configuration is.
> Are the shadow members both internal disks in the
> DS10, or are they on an external controller?
>
> Reason I ask is that several years ago, on some
> HSJ-hosted storage (IIRC, otherwise it could have
> been on HSGs), one disk in a shadow set started
> logging a large number of errors, on the order of
> several hundred per minute. Unfortunately, the
> controller went to heroic efforts to recover!
>
> As a result, the shadow set was functionally
> inaccessible. (Well, it was a bit more complicated
> than that as I think we first tried copying in
> a 3rd member per our standard procedures, but
> that only made the problem worse.)
>
> In the end, we had to just yank the bad disk out.
> The controller was determined *not* to drop the
> bad member.
>
> So... What do your system error logs show for the
> bad disk? What was the error count on that member
> before it was dropped?
>
> With the HSJs, I think Compaq determined there was
> some setting that we could apply that would keep
> the controller from working so hard to recover.
> Without knowing your storage configuration, there's
> no way to say whether something similar applies.
>
> However, watching disk error counts is *very*
> important in all cases.
>
> -Ken
The shadow set is just the two internal disks of the DS10
configuration.

I was watching the error count. That's how I know we had a single
error this morning; I had checked for disk errors only an hour
before the event.

I am going to start analyzing the error log.

You guys have convinced me that this is not normal and may be an
indicator of something more than a typical disk error.

Thanks for your input.