[Info-vax] best guess for mount-verification problem
abrsvc
dansabrservices at yahoo.com
Mon Jun 28 07:05:36 EDT 2021
On Monday, June 28, 2021 at 6:57:43 AM UTC-4, Phillip Helbig (undress to reply) wrote:
> In article <sbc8uu$hn9$1... at gioia.aioe.org>,
> hel... at asclothestro.multivax.de (Phillip Helbig (undress to reply))
> writes:
>
> > In article <sbc86h$656$1... at gioia.aioe.org>,
> > hel... at asclothestro.multivax.de (Phillip Helbig (undress to reply))
> > writes:
> >
> > > I have a three-node cluster (when no satellite or test system has joined
> > > it) and physical disks (blue SBB in BA356) on each node (no dual-ported
> > > disks; each disk has a direct connection to only one node).
> >
> > > When something fails, I just replace it with something of similar build.
> > > (The main reason for moving to SBB disks was to be able to replace a
> > > disk (the most common failure) without having to dismount the members it
> > > hosts, shut down the system, remove it from the shelf, open it, replace
> > > the disk, close it, put it back on the shelf, boot it, remount the
> > > members it hosts.)
> > >
> > > For a while now I've noticed disks going in and out of mount
> > > verification. It is clear which node is involved. So, my plan is to
> > > replace hardware (and maybe try to find the problem when the hardware is
> > > out of the cluster) and hope that it goes away.
> >
> > > Theoretically it could be the SCSI cable, but my guess is that it is
> > > either the expansion box or the SCSI card. (I have had one expansion
> > > box fail, but it failed completely.) Which is more likely?
> OK, spent some time staring at hardware in the cellar. :-| It seems
> that before the mount verification sets in, the two LEDs to the left of
> the plug in the power supply go out, then come back on, then all the
> disks light up briefly. So probably a problem with the box or the power
> supply.
>
> I can try replacing the power supply first, if that doesn't help then
> the SCSI interface at the top, then if that doesn't help the entire box.
>
> Any other ideas?
If all of the drives on that controller go into mount verification at the same time, then yes, the common points are the controller itself, the controller in the SBB box, and the power supply. Given that the lights change, I would look at the power supply first. In the last 15 years of supporting a site with 12 of these, I have seen one power supply fail and two Alpha disk controllers fail (not counting individual disks). At another client site, I did see one of the BA box controllers (BA35X-DA) fail.
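One way to check the "all at the same time" condition is to note when each disk entered mount verification (for example, copied by hand from the operator log) and see whether the times cluster. A minimal Python sketch, assuming the events are already available as (timestamp, device) pairs; the function names, the device names, and the 5-second window are made up for illustration and are not anything OpenVMS provides:

```python
from datetime import datetime, timedelta


def simultaneous_groups(events, window_seconds=5):
    """Group (timestamp, device) events that occur within
    window_seconds of the first event in the current group."""
    groups = []
    for ts, dev in sorted(events):
        if groups and ts - groups[-1]["start"] <= timedelta(seconds=window_seconds):
            groups[-1]["devices"].add(dev)
        else:
            groups.append({"start": ts, "devices": {dev}})
    return groups


def shared_fault_likely(events, devices_on_controller, window_seconds=5):
    """Return True if at least one group contains every device on the
    controller, which points at a shared component (controller, shelf
    interface, or power supply) rather than a single disk."""
    wanted = set(devices_on_controller)
    return any(wanted <= g["devices"]
               for g in simultaneous_groups(events, window_seconds))


# Hypothetical example data: three disks dropping out within two seconds.
t0 = datetime(2021, 6, 28, 6, 0, 0)
events = [
    (t0, "DKA100"),
    (t0 + timedelta(seconds=1), "DKA200"),
    (t0 + timedelta(seconds=2), "DKA300"),
    (t0 + timedelta(hours=1), "DKA100"),
]
print(shared_fault_likely(events, ["DKA100", "DKA200", "DKA300"]))
```

If only one disk ever shows up in a group, the disk itself (or its slot) is the more likely suspect.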
I do have spares of the BA35X-FA (dual SCSI version of the -DA). I am in the US, so I don't know the cost of shipping, but I have more than 10 of these if it helps you out.
Dan