[Info-vax] Current VMS engineering quality, was: Re: What's VMS up to these

Sat Mar 17 07:13:52 EDT 2012

On Mar 17, 1:26 am, glen herrmannsfeldt <g... at ugcs.caltech.edu> wrote:
> Fritz Wuehler <fr... at spamexpire-201203.rodent.frell.theremailer.net> wrote:
> > Johnny Billquist <b... at softjar.se> wrote:
> >> 2. Unix distributed networks using ethernet and shared disks is not
> >> robust at all. You must be totally uninformed if you claim this. Have
> >> you ever used a machine with an NFS root? Any time the server stopped,
> >> rebooted, or whatever, all clients *freeze*. Not even rebooting, unless
> >> you press the power switch. You just sit there waiting for the NFS
> >> server to wake up again.
> > Correct. This just happened to me (facepalm) today on a modern Linux system
> > 2.6.29.something kernel. I didn't think and took my NFS box offline and when
> > my Linux client couldn't get to the mounted share ..........................
> > Solution: reboot NFS box. Stupid, stupid, stupid. Can't the UNIX idiots
> > *ever* do anything correctly?
>
> If you don't like it, use a soft mount, otherwise that is considered
> correct.
>
> If you are writing to a disk, and the disk doesn't respond fast enough,
> you don't normally expect the system to just throw away the data you
> thought you wrote, do you?
>
> Why would you expect that in the case of an NFS disk?
>
> As previously mentioned, the result is data loss.
>
> Which reminds me, also, of how many C programmers don't check
> the return values from I/O function calls, especially fclose().
>
> As fclose() has to flush the buffers, there is a good chance that
> problems writing will result in fclose() returning an error code,
> and if you ignore it you won't know that the data wasn't written.
>
> -- glen
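
On the fclose() point, here is a minimal sketch in C of what checking those return values might look like. The helper name write_text_checked() and its 0 / -1 return convention are made up for illustration; only fopen(), fputs() and fclose() are standard library calls.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): write `text` to `path`.
 * Returns 0 if every step succeeded, -1 if opening, writing,
 * or closing the file reported an error. */
int write_text_checked(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int failed = 0;

    /* fputs() may only copy into the stdio buffer, so success here
     * does not yet mean the data left the process. */
    if (fputs(text, fp) == EOF)
        failed = 1;

    /* fclose() flushes the buffer; a write error that was deferred
     * until now shows up as EOF. Call it even after an earlier
     * failure so the stream is released. */
    if (fclose(fp) == EOF)
        failed = 1;

    return failed ? -1 : 0;
}
```

A caller would treat a -1 from write_text_checked("out.txt", "data\n") as "the data may not have been written". Note that a successful fclose() only says the C library handed the data to the operating system; it says nothing about when it reaches the disk or the NFS server.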

"If you are writing to a disk, and the disk doesn't respond fast enough,
you don't normally expect the system to just throw away the data you
thought you wrote, do you?"

Does that expectation still hold when "fast enough" happens to mean
"before the power fails", or (in the case of something like a
shared-SCSI cluster) "before an unsolicited SCSI reset occurs"? And
does it hold for both user data and filesystem metadata? If so, it
wouldn't surprise me at all if there were some corner cases where data
was thrown away unless explicit precautions had been taken in user or
filesystem code.
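
As one example of an "explicit precaution" in user code, here is a rough sketch for a POSIX system that asks the kernel to push the data to the device before reporting success. The function name write_durably() is hypothetical; open(), write(), fsync() and close() are the standard POSIX calls. It is a sketch under those assumptions, not a complete answer to the power-fail case.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): write `text` to `path`
 * and request that it reach the storage device before returning.
 * Returns 0 on success, -1 on any reported failure. */
int write_durably(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t left = strlen(text);
    int failed = 0;

    /* write() may transfer fewer bytes than requested, so loop. */
    while (left > 0) {
        ssize_t n = write(fd, text, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: retry */
            failed = 1;
            break;
        }
        text += n;
        left -= (size_t)n;
    }

    /* fsync() blocks until the kernel reports the file's data as
     * transferred to the device, or returns an error. */
    if (!failed && fsync(fd) != 0)
        failed = 1;

    /* close() can also report a deferred error, notably on NFS. */
    if (close(fd) != 0)
        failed = 1;

    return failed ? -1 : 0;
}
```

Even this leaves corner cases open: a drive with a volatile write cache may acknowledge the flush before the data is on stable media, and a newly created file's directory entry may need its own fsync() on the containing directory.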