[Info-vax] rule of thumb for replacing bad disks based on error count

onedbguru onedbguru at yahoo.com
Mon Jul 11 21:20:36 EDT 2011


On Jul 11, 6:33 pm, John Wallace <johnwalla... at yahoo.co.uk> wrote:
> On Jul 11, 8:44 pm, "Richard B. Gilbert" <rgilber... at comcast.net>
> wrote:
>
> > On 7/11/2011 2:51 PM, Phillip Helbig---undress to reply wrote:
>
> > > Obviously, if I see the error count increasing quickly on a physical
> > > disk, I will replace it.  My hope is that until I do so, HBVS will keep
> > > my data safe.  (For really important shadow sets, I have 3 members; for
> > > others, 2.)  But what about for SLOWLY increasing error counts?  And
> > > what about errors on the shadow set itself, rather than on the members?
>
> > For SLOWLY increasing error counts, I'd have a look at the error log to
> > see what's going on.  If you are getting errors, your hardware is trying
> > to tell you something.  Sometimes, it's telling you REPLACE ME ASAP!
>
> > You had better be paying attention!
>
> > If you can't interpret the errorlog, your Field Service Engineer can.
> > Yes, he, or she, costs money.  If your system is business critical,
> > spend the money!
>
> > > Obviously, physically bad sections of a disk can cause errors, but what
> > > are other causes of error on physical disks and on shadow sets?
>
> > Users!  I recall a user many years ago who took a two-week vacation and
> > left a program running that filled a disk directory with several
> > thousand small files!  All those files were cataloged in the SAME
> > DIRECTORY.
>
> > Take my word for it, a directory with fifteen thousand files does NOT
> > WORK QUICKLY or well!
> > <snip>
>
> A directory with too many files may well lead to poor performance, but
> under what circumstances do you think it will lead to an increase in
> the device error count on a physical disk?


And on Linux, a directory holding 100K files (audit logs or trace
files) will crash a running Oracle database.


