[Info-vax] Unpleasant Disk Shadowing Surprise
abrsvc
dansabrservices at yahoo.com
Tue Oct 11 15:38:44 EDT 2011
On Oct 11, 3:04 pm, tadamsmar <tadams... at yahoo.com> wrote:
> On Oct 11, 1:12 pm, Jan-Erik Soderholm <jan-erik.soderh... at telia.com>
> wrote:
>
> > tadamsmar wrote 2011-10-11 18:49:
>
> > > On Oct 11, 12:13 pm, Kenneth Fairfield<ken.fairfi... at gmail.com>
> > > wrote:
> > >> You don't say what your storage configuration is.
> > >> Are the shadow members both internal disks in the
> > >> DS10, or are they on an external controller?
>
> > >> Reason I ask is that several years ago, on some
> > >> HSJ-hosted storage (IIRC, otherwise it could have
> > >> been on HSGs), one disk in a shadow set started
> > >> logging a large number of errors, on the order of
> > >> several hundred per minute. Unfortunately, the
> > >> controller went to heroic efforts to recover!
>
> > >> As a result, the shadow set was functionally
> > >> inaccessible. (Well, it was a bit more complicated
> > >> than that as I think we first tried copying in
> > >> a 3rd member per our standard procedures, but
> > >> that only made the problem worse.)
>
> > >> In the end, we had to just yank the bad disk out.
> > >> The controller was determined *not* to drop the
> > >> bad member.
>
> > > >> So... What do your system error logs show for the
> > > >> bad disk? What was the error count on that member
> > > >> before it was dropped?
>
> > >> With the HSJs, I think Compaq determined there was
> > >> some setting that we could apply that would keep
> > >> the controller from working so hard to recover.
> > >> Without knowing your storage configuration, there's
> > >> no way to say whether something similar applies.
>
> > >> However, watching disk error counts is *very*
> > >> important in all cases.
>
> > >> -Ken
>
> > > The shadow set is just the two internal disks of the DS10
> > > configuration.
>
> > > > I was watching the error count. That's how I know we had a single
> > > > error this morning; I had checked for disk errors only an hour
> > > > before the event.
>
> > > I am going to start analyzing the error log.
>
> > > > You guys have convinced me that this is not normal and may be an
> > > > indicator of something more than a typical disk error.
>
> > > Thanks for your input.
>
> > > So both members of the shadow set are on the same SCSI controller and the
> > > same SCSI bus. One disk can play havoc with the SCSI bus, I guess,
> > > effectively blocking any access to the other shadow member.
>
> > > For a DS10 in a critical application, I'd really recommend some external
> > > box(es), preferably on two separate SCSI controllers, with shadowing
> > > between the boxes. These are really cheap on the second-hand market today.
>
> Can you point me to some specific items or vendors?
>
> We might just replace the single SCSI card and the disks, but I want
> to explore other options.
>
> (The SCSI card had errors too.)
I use the KZPEA dual controller card along with a KZPCA single
controller, with external disks in BA356 trays. These are the
external SBB "bricks". I have 6 DS10s set up this way with no
problems. Please note that the "supported" configuration only allows
one single and one dual controller. You most likely have the single
controller installed.
Note that these drives and trays are old and no longer available new.
There are suppliers of refurbished units. I have been lucky to have
purchased drives etc. from eBay for less money, but you take a chance
there. Expect to pay around $150 per controller and $75 or more per
disk. The cables (BN37 and BN38) are usually available on eBay for
less than $20.
Dan