[Info-vax] Are queue manager updates written to disk immediately ?
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Fri Apr 12 08:54:25 EDT 2013
On 2013-04-11 15:27:06 +0000, Simon Clubley said:
> When a batch job completes, does the queue manager _immediately_ write
> that information to its data structures on disk or is that information
> cached in memory for a short time first ?
>
> I had a problem this morning which I have not seen before and it looks
> like VMS is not _immediately_ writing queue updates away to disk, which
> to put it mildly is bl**dy dangerous if true.
This reply will not answer your question, nor identify the culprit
software or hardware here.
You'll want the answer from HP, whatever that might be.
The queue manager has occasionally lost or corrupted data, so a
check of the patch status is certainly warranted. HP will undoubtedly
ask about that, too.
While I don't recall off the top of my head the details of the queue
manager implementation, and whether it's using careful updates or
caching data for whatever reason, VMS has classically tried to use
so-called careful updates: ordering the writes so that partial or bad
data doesn't end up visible if/when a crash or power outage arises.
You get either the whole update, or nothing. These careful updates
are tricky to implement. This approach is also a great and wonderful
scheme, right up until you meet your first caching controller, or
your first caching disk. Particularly if the caching controller or
the caching disk misrepresents the state of the data back up to the
host, or if the cache batteries don't ride out the outage, or if the
controller gets reset, loses its marbles, and loses the data with
them.
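To make the idea concrete, here's a minimal sketch of a careful
update in POSIX-style C -- purely illustrative, and not how the VMS
queue manager actually does it (on VMS the equivalent would go
through RMS or $QIO rather than these calls, and the file names here
are invented). Write the new record to a scratch file, force it to
stable storage, and only then publish it, so a crash leaves either
the complete old record or the complete new one:

    /* Careful-update sketch (POSIX-style, illustrative only). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Write the new record to a scratch file, flush it to stable
       storage, then rename it over the live file.  A crash at any
       point leaves either the complete old file or the complete new
       file -- never a torn mix.  (For full durability the containing
       directory should be fsync()ed after the rename as well.) */
    int careful_update(const char *live, const char *scratch,
                       const void *rec, size_t len)
    {
        int fd = open(scratch, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return -1;
        if (write(fd, rec, len) != (ssize_t)len) { close(fd); return -1; }
        if (fsync(fd) < 0) { close(fd); return -1; }  /* data on disk first */
        close(fd);
        if (rename(scratch, live) < 0) return -1;     /* then publish it */
        return 0;
    }

And that fsync() only does what it says if the controller and the
drive actually honor it, which brings us to the next point.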
Rummage around for an OpenSolaris mailing list posting from Jeff
Bonwick (then working on ZFS at Sun) from around October 2008 for
some related details on what ZFS was encountering with some SATA
gear. (OpenSolaris is the predecessor of what's now known as illumos,
which is used in OpenIndiana and related distributions.) Here's an
excerpt from that post:
"FYI, I'm working on a workaround for broken devices. As you note,
some disks flat-out lie: you issue the synchronize-cache command, they
say "got it, boss", yet the data is still not on stable storage. Why
do they do this? Because "it performs better". Well, duh -- you can
make stuff *really* fast if it doesn't have to be correct.
Before I explain how ZFS can fix this, I need to get something off my
chest: people who knowingly make such disks should be in federal
prison. It is *fraud* to win benchmarks this way. Doing so causes real
harm to real people. Same goes for NFS implementations that ignore
sync. We have specifications for a reason. People assume that you
honor them, and build higher-level systems on top of them. Change the
mass of the proton by a few percent, and the stars explode. It is
impossible to build a functioning civil society in a culture that
tolerates lies. We need a little more Code of Hammurabi in the storage
industry."
Years ago, DEC had some SCSI configurations with batteries right in the
StorageWorks storage shelves, intended to allow the shelves and disks
to complete multiblock writes that might have been in flight. The
general problem with not getting all the data written to non-volatile
storage has only gotten more complex in the years since then, with
caching controllers (particularly those with bad RAID batteries, or
none at all), with caching drives, and with the quest for ever-higher
I/O performance.
If this queue manager misbehavior is a sufficient issue for you,
consider getting yourself a Less-Interruptible Power Supply (LIPS, as
I've never met a truly uninterruptible power supply) for the system.
And as others have mentioned, add some safeguards around any job that
really can't be allowed to run twice. (I've seen a few of these cases
in clusters, when the
cluster time was skewed among hosts. Your "tomorrow+08:20" should have
avoided problems from the usual minor skews, unless the time in the
cluster — on the host that was running the queue manager, which is not
necessarily the host that was running the batch job — was very skewed.)
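By way of illustration (the file name and helper names here are
invented, not anything from your system), one simple safeguard is to
have the job record the date of its last successful run and refuse to
run a second time for the same date. A cluster-wide lock through the
distributed lock manager would be more robust, but the shape of the
check is the same:

    /* "Already ran today?" guard -- illustrative sketch only. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    /* Returns 1 if the stamp file already records today's date, in
       which case the job should refuse to run again today. */
    int already_ran_today(const char *stamp_path, char *today, size_t len)
    {
        time_t now = time(NULL);
        strftime(today, len, "%Y-%m-%d", localtime(&now));

        char prev[32] = "";
        FILE *fp = fopen(stamp_path, "r");
        if (fp) {
            if (fgets(prev, sizeof prev, fp))
                prev[strcspn(prev, "\n")] = '\0';
            fclose(fp);
        }
        return strcmp(prev, today) == 0;
    }

    /* Call this after the job's real work completes; the fsync() is
       there for the same reasons discussed above. */
    int mark_ran_today(const char *stamp_path, const char *today)
    {
        int fd = open(stamp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return -1;
        write(fd, today, strlen(today));
        write(fd, "\n", 1);
        fsync(fd);
        close(fd);
        return 0;
    }

In DCL the same check is only a few lines with a lexical function and
a small stamp file; the point is simply that the job itself, and not
just the queue manager's database, gets a say in whether it runs.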
I've ended up with a batch scheduler for these and related tasks.
What's available for process control and process management using the
default VMS mechanisms and APIs is very low-level, unfortunately, and
what starts out as DCL and related baggage inevitably gets unwieldy as
cases and updates are added to the code; best to either acquire a
scheduler, or roll your own properly. This also gets into using a
transactional database; application code "rolling its own" here is
another of my peeves. Journaling and the rest exist because the power
and the hardware can be somewhere between untrustworthy and, well,
see Jeff's posting...
As for your question, dunno. But this stuff can be (is) more
complicated than it looks.
--
Pure Personal Opinion | HoffmanLabs LLC