[Info-vax] Long uptime cut short by Hurricane Sandy

David Froble davef at tsoft-inc.com
Fri Jan 25 23:28:51 EST 2013


AEF wrote:
> On Jan 25, 2:48 pm, Stephen Hoffman <seaoh... at hoffmanlabs.invalid>
> wrote:
>> On 2013-01-25 19:31:20 +0000, AEF said:
>>
>>> On Jan 25, 2:47 pm, Stephen Hoffman <seaoh... at hoffmanlabs.invalid>
>>> wrote:
>>>> On 2013-01-25 18:34:48 +0000, Bill Gunshannon said:
>>>>> And less than a month ago we heard right here:
>>>>>         >> Long server uptimes are the antithesis of testing.
>>>>> Go figure....
>>>> And you'll hear it again.   That, and the benefits of testing your DT.
>>> DT?
>> Disaster Tolerance.
> 
> Hi Hoff!
> 
> Uh, I missed this part. There is no DR for these systems. There
> doesn't need to be. When Hurricane Sandy hit, the building lost power.
> We didn't get stable power back until about Jan 4. Nobody missed the
> VAXes except for me.
> 
> I did have to manually update our Application Owner Table once, but
> that's my problem and I'm okay with it. And that's pretty much all I
> use it for. I use it for this a few times a year. So there's no need
> to do any Disaster Recovery setup, testing, etc. (I wrote some DCL to
> convert some Excel exports to wiki format to post on our Confluence
> wiki. I have to discard extra columns, massage the data and such, and
> output it in wiki format.) The boss wouldn't have allowed me to do any
> DR setup for this anyway. It's not worth it.
> 
> Now, I *did* do DR work for the apps I'm responsible for: JIRA and
> Confluence. They run on Unix systems. I set up the daily backup-to-the-
> DR-site routine and developed a procedure. We tested it a while back
> and it worked fine. We knew the storm was coming so I checked things
> again right before.
> 
> We recovered everything save perhaps less than one day's worth of
> attachments (which we deemed acceptable ahead of time).
> 
> But the VAXes I mentioned were not missed, I'm sorry to say (except
> for my needing to do one manual AOT update). (The Financial Crisis
> killed my trading desks, which is what the VAXes were used for. Nice.)
> 
>>> I don't know. People used to brag of long uptimes here.
>> Yep.  I thought it was cool, too.  Then I thought of the implications
>> of what it meant.
>>
>>> Usage of these systems has been stable to the nth degree. No hardware
>>> changes. No software changes. Almost zero use and not in the least bit
>>> critical.
>> Uh-huh.
> 
> I said they're not the least bit critical.
> 
>> Have you not noticed the flurry of reboot problems with some subset of
>> Alpha systems?
> 
> Nope. I don't read much in this group anymore. I just peek in from
> time to time. And these are VAXes, not Alphas. (^_^)
> 
>> If you don't test it, you can't be sure it'll work.
> 
> That's okay if it doesn't work in a disaster. It just makes it a
> little harder for me to update the AOT when the big boss sends me a
> new spreadsheet. No data is lost. All the important data on all of the
> VAXes are backed up on tape at Recall and on disks across "The Pond"
> in London. And most of it's over 7 years old, which we no longer need
> to keep. So I'm okay.
> 
>>> I've been the only user on the first box during this time period. I
>>> use it for occasional tasks that could be done elsewhere if really
>>> needed. OK, some monitoring for a short while, which was also stable.
>>> No one's used the second box at all for almost as long.
>>> Don't judge without all the facts.
>> So.  Will your box reboot automatically?  Are you feeling lucky?  Do
>> you have all the facts?
> 
> No. The battery is dead, so it will ask for the time. Also, I don't
> need it to come up automatically.
> 
> Yes. (These are VAX systems!) OK, my primary worries here are that the
> power supply might go kablooey. I've lost a few, given we had as many
> as 40 MicroVAXes on line in my early days at the company. And a
> handful of disk drives bit the dust. And we had local backup systems
> and across-the-Pond DR systems, and that saved us a couple of times.
> But I have the data backed up, and I have lots of spare disks and
> power supplies from the several dozen VAXes we have sitting around. I
> believe I'm okay.
> 
> Yes (see above).
> 
>> Uptime looks great.  On paper.  Then you slam into reality.  Mistakes
>> happen.  Latent bugs become less-than-latent.  That's why we test.
> 
> Agreed. You may also need to AUTOGEN once in a while.
> 
> OK, I'm running VMS 6.2 with all relevant ECO kits applied. Can you
> give an example of a latent bug that might hit me? I have no apps
> running. I just use my DCL script once in a while and do an occasional
> backup. Thanks!
> 
>> --
>> Pure Personal Opinion | HoffmanLabs LLC
> 
> AEF

I used to have my customers re-boot at least once a month.  It wasn't 
needed, but it didn't hurt.

Uptime is basically an ego trip.  Thing is, VMS can do it, some others 
can't.  Weendoze can if you don't run anything.

If you got a MicroVAX 3100 model 98 sitting around that you're not fond 
of, I could give it a good home.


