[Info-vax] rule of thumb for replacing bad disks based on error count
Phillip Helbig---undress to reply
helbig at astro.multiCLOTHESvax.de
Mon Jul 11 14:51:59 EDT 2011
Obviously, if I see the error count increasing quickly on a physical
disk, I will replace it. My hope is that until I do so, HBVS will keep
my data safe. (For really important shadow sets, I have 3 members; for
others, 2.) But what about SLOWLY increasing error counts? And
what about errors on the shadow set itself, rather than on the members?
Obviously, physically bad sections of a disk can cause errors, but what
are other causes of error on physical disks and on shadow sets?
Is there any reason to suspect the physical disks, as opposed to
controllers, cables, etc., if the error count increases on a shadow set?
Can I assume that if the error count increases on only one node, there
is no danger of data becoming lost (presumably because the problem
cannot be on the disks, otherwise it would be visible on all nodes)?
What is a good rule of thumb for replacing disks (in shadow sets) based
on error count?
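
To make "quickly" versus "slowly" concrete, one could note the error count
reported by SHOW DEVICE at intervals and work out the average increase per
day. The following is only a minimal sketch in Python: the function name and
the sample readings are hypothetical, it assumes the counter was not reset
(for example by a reboot) between readings, and it implies no particular
replacement threshold.

```python
from datetime import datetime


def errors_per_day(samples):
    """Average increase in error count per day between first and last sample.

    samples: list of (datetime, error_count) tuples, oldest first.
    Assumes the counter was not reset between the two readings.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    days = (t1 - t0).total_seconds() / 86400.0
    if days <= 0:
        raise ValueError("samples must span a positive time interval")
    return (c1 - c0) / days


# Hypothetical readings for one shadow-set member on one node,
# copied by hand from SHOW DEVICE output on two different dates.
samples = [
    (datetime(2011, 7, 1), 3),
    (datetime(2011, 7, 11), 8),
]
rate = errors_per_day(samples)
```

Keeping one such series per member and per node would also show whether an
increase appears on every node or only on one.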