[Info-vax] Are queue manager updates written to disk immediately ?
Simon Clubley
clubley at remove_me.eisner.decus.org-Earth.UFP
Thu Apr 11 13:35:23 EDT 2013
On 2013-04-11, Jan-Erik Soderholm <jan-erik.soderholm at telia.com> wrote:
> Simon Clubley wrote 2013-04-11 18:28:
>> On 2013-04-11, Jan-Erik Soderholm <jan-erik.soderholm at telia.com> wrote:
>>>
>>> I guess accounting (acc/queue=ba0) also shows both jobs as run?
>>
>> There's no accounting entry for the first job even though there is
>> a full logfile, but since accounting log updates are buffered,
>> that's not really a major surprise.
>>
>>> Nothing weird with the start/finish timestamps in accounting?
>>>
>>> I would look more closely at whether something with the system
>>> clock at startup caused the held job to be released, such as
>>> a startup with the wrong time setting or similar.
>>>
>>
>> All the timestamps on the log files and accounting are correct; the only
>> time the system clock is corrected on this system is once a day during
>> the night from an NTP source, and that job had already run a couple
>> of hours previously.
>
> But there was a reboot in between, wasn't there?
>
Sequence:
NTP job (06:30) -> check_queues (08:20) -> power failure ->
check_queues run again (08:25).
>> In addition, the queue entry number from the accounting record for the
>> second job didn't match the entry number from the submit command in
>> the first job.
>
> Higher/later or lower/earlier?
>
> Does it match the entry number from *any* previous job ?
> I don't know how long you keep logs, of course... :-)
>
I also have the log file from yesterday's run (10-Apr-2013 08:20).
The entry number from yesterday's submit command matches the entry
number in the accounting log for the second run today.
So the queue manager has indeed run the job submitted yesterday twice
to completion today.
> Sounds like the quemgr thought that the job hadn't been
> run (or hadn't completed) and simply restarted it.
>
> But then, will a restarted batch job not run using the
> original entry number? Maybe not...
>
> I would expect the queue database to be updated synchronously
> at the time of batch job "rundown". That is where SHOW QUEUE
> looks, isn't it?
>
That's _exactly_ what I would expect too, and on disk;
not just in memory.
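
To illustrate the distinction between "updated in memory" and "updated on
disk", here is a generic Python sketch. It is not how the OpenVMS queue
manager is implemented; the function names and the one-line record format
are hypothetical and exist only for this example. The point is that a
completion record is only power-failure safe once the write has been
flushed and synced before the call returns.

```python
import os


def record_completion(path, entry_number):
    """Append a completion record and force it to disk before returning.

    Hypothetical record format: "entry <number> completed".
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"entry {entry_number} completed\n")
        f.flush()             # move data from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to the device


def completed_entries(path):
    """Return the entry numbers recorded as completed, in file order."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [int(line.split()[1]) for line in f if line.startswith("entry ")]
```

Without the flush and sync steps, a record can sit in a buffer when the
power fails, and after restart the job would look as if it had never
completed.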
Even if there was some queue manager bug caused by a power failure
during some unusual tight timing window of a few milliseconds [*],
that still does not explain the disappearing job from the submit
command in the first job.
Simon.
[*] A few milliseconds maximum, because don't forget I have a _full_
logfile from the first run of the job today.
--
Simon Clubley, clubley at remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world