[Info-vax] Beyond Open Source
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Sun May 10 18:26:19 EDT 2015
On 2015-05-10 20:42:19 +0000, johnwallace4 at yahoo.co.uk said:
> On Sunday, 10 May 2015 18:48:53 UTC+1, Stephen Hoffman wrote:
>> On 2015-05-10 17:28:50 +0000, johnwallace4 at yahoo.co.uk said:
> [huge snippage for brevity]
>> --
>> Pure Personal Opinion | HoffmanLabs LLC
>
> And how are people supposed to be identifying "the relevant patches",
> given the limited information provided with them?
The goal is to get folks out of doing that. It's to automate that.
Heroic measures, as you've described earlier, aren't how you want a
server to be managed.
Some patches are known and the selection process is easy: any
mandatory install, and any optional install where the criteria (the
particular prerequisite software or device or other details) are met.
For other patches, there'd be a trigger such as a specific failure —
but again, if you're going to incur a failure or a crash, that's a
patch you'd probably want to install.
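As a rough sketch of that selection logic, and only a sketch: the
Patch record, its field names and the kit names used with it are all
hypothetical, and nothing here reflects how PCSI or any VSI tooling
describes a kit. It just shows the three cases: mandatory, optional
with prerequisites met, and triggered by a failure that was seen.

```python
from dataclasses import dataclass, field

@dataclass
class Patch:
    # Hypothetical patch description; not a real PCSI kit format.
    name: str
    mandatory: bool = False
    requires: set = field(default_factory=set)  # prerequisite software/devices
    fixes: set = field(default_factory=set)     # failure signatures addressed

def applies(patch, installed, seen_failures):
    """Decide whether one patch should be staged for this server."""
    if patch.mandatory:
        return True
    # Optional patch: every prerequisite must be present on the server.
    if not patch.requires <= installed:
        return False
    # Trigger-based patch: only once a matching failure has been seen.
    return not patch.fixes or bool(patch.fixes & seen_failures)

def select_patches(patches, installed, seen_failures):
    """Return the names of the patches that apply, in catalog order."""
    return [p.name for p in patches if applies(p, installed, seen_failures)]
```

Note that an optional patch with no criteria at all is selected here;
whether that is the right default is a policy choice.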
In years past, the patches acquired a poor reputation because they
introduced errors. If that misbehavior arises anew — recent UPDATE
patches have been solid — then there's another and bigger issue lurking
here, and end-user patch management and patch distribution is not going
to address it.
> Or are they supposed to take it on trust that everything out of Redmond
> or Cupertino (?) is inherently safe and trustworthy?
Cryptographic checksums. As for patches, poor patch quality spooks
folks, and delays or defers patches. Better patch quality means
quicker roll-outs. Hopefully the end-users or the partners have a test
environment, and the tools to match. As your experience with that
customer shows, not everybody does, though.
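For the checksum part, a minimal sketch of what the receiving end
might do, assuming the vendor publishes a SHA-256 digest for each kit.
The function names are made up for illustration. A bare checksum only
shows the file wasn't altered relative to that digest, so the digest
itself has to arrive over an authenticated channel or be signed.

```python
import hashlib
import hmac

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large kits need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def kit_is_intact(path, published_hex):
    """Compare the local digest with the published one (assumed hex text)."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```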
> "we should then design the environment to upload and scan the crashes
> autonomously, and that we can and should lead the end-users toward the
> proper outcome for the issues they're encountering."
>
> Agreed (though we probably mean the end users IT managers ?).
There are an increasing number of end-users managing servers, and that
trend will not change. Some folks have the skills, some outsource the
management or depend on their software supplier, and other folks just
leave the box to run.
> "It's increasingly common for applications to avoid user-visible crash
> logs, but to collect and encrypt and upload that data for analysis."
>
> Agreed again, subject to a few security caveats.
Opt-in, and using a local private key and the vendor public key to
authenticate and to encrypt.
> So what would it take (other than some presentation layer stuff :)) to
> have VMS combine process dump, system dump (live or post mortem), error
> logs, etc and maybe even stuff from DECamds and friends, and email it
> off to the authorised service provider.
A daemon to manage the processing, possibly with the assistance of
Apache ZooKeeper. Current crypto and per-server certificates. Probably
an XML or JSON library to structure the data. A variety of data
collectors, both for collecting the current patch data and as an
alternative or replacement for the last-chance dump handler. CLUE CRASH
for processing dumps, and some other giblets. Updates to PCSI and/or a
replacement installer, too. There's much more to do on the server
(VSI) end of the connection, as that'll involve processing all that
data in an automated fashion, as well as determining the entitlements
or however VSI decides to offer patches.
Since it's VSI software, the uploads will go there for processing by
default. It'd be nice to have a linkable framework that allows
application crashes to be sent elsewhere, though that's obviously
possible now.
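To make the "structure the data" piece a little more concrete, here's
a sketch of how a collector might bundle its findings into JSON before
the daemon signs, encrypts and uploads them. The field names and the
schema number are assumptions for illustration only; no such format
has been defined by VSI or anybody else.

```python
import json
from datetime import datetime, timezone

def build_report(node, os_version, patches, crash_summary):
    """Assemble one upload record; every key here is hypothetical."""
    return {
        "schema": 1,
        "created": datetime.now(timezone.utc).isoformat(),
        "node": node,
        "os_version": os_version,
        "installed_patches": sorted(patches),
        "crash": crash_summary,
    }

def encode_report(report):
    """Serialize with sorted keys so the same report always encodes alike."""
    return json.dumps(report, sort_keys=True).encode("utf-8")
```

Stable serialization matters if the bytes are going to be signed or
deduplicated on the receiving end.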
> Maybe the first line service provider needn't even be HP or VSI?
Via VSI partner, most likely.
> This concept has a distinct 1990s deja vu about it... can't give it a
> name though.
A whole lot of this dates back to then, though better instrumentation
has become easier and more common.
> It's not going to be top of VSI's priority list, but the infrastructure
> for gathering the relevant information is already there, architected
> in, generally available, and frequently used constructively if people
> can be bothered doing more than "have you tried rebooting it?".
The goal is to avoid the "have you tried rebooting it?" — entirely.
In general....
What is typical at many OpenVMS sites is worse than what DEC had back
in the 1990s, and what OpenVMS has now is vastly behind what's typical
and current.
How does VSI keep folks from _this_ era from getting into even deeper
sneakers than that customer?
How does VSI avoid reading way too many crashdumps?
Having an on-site staff for OpenVMS isn't as common as it once was.
Dealing with this is not with email-based notifications, not unless
the end-user or the partner wants email notifications. Data uploads
are not via email. Dealing with this is best done not with manual
crashdump scanning. It's increasingly not with displaying crashes to
end-users, either.
How this best goes forward is with none of what was. It's automated
tools and direct connections, probably via HTTPS/443. Directly
uploading crash data and automated scanning for known patterns. It's
automatically-staged downloads, and push-button patch installs.
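The "automated scanning for known patterns" step could be as simple as
the following on the receiving end. This is a sketch under stated
assumptions: the one-line summary format, the signature table and the
kit names are all invented, and a real service would match on far
richer data than a line of text.

```python
import re

# Hypothetical table mapping crash signatures to the kit that fixes them.
KNOWN_PATTERNS = [
    (re.compile(r"bugcheck=EXAMPLE_A\b.*module=DRIVER_X\b"), "KIT-0001"),
    (re.compile(r"bugcheck=EXAMPLE_B\b"), "KIT-0002"),
]

def match_known_issues(summary_line):
    """Return the kits whose signature matches an uploaded crash summary."""
    return [kit for pattern, kit in KNOWN_PATTERNS
            if pattern.search(summary_line)]
```

Anything that matches gets a staged download; anything that doesn't is
what a human should actually be looking at.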
It's with time-to-patch being much, much faster than OpenVMS ever was
before, too. Vulnerabilities and particularly security vulnerabilities
will only be exploited more quickly, after all.
It's with configuration and crash and application failure information
uploaded to VSI and potentially then shared with VSI partners, and
where VSI or the partners then deliver the support to the end-user
installs. Or yes, the end-users that are providing self-maintenance,
and they can then test the patches and then press the patch button on
the production servers.
In years past, Canasta/CCAT
<http://www.decus.de/slides/sy2003/09_04/2k01.pdf> and the old DEC
proactive services offerings were part of this, but that was not as
integrated and it wasn't as automated as it should have been. If
anything, opt-in collecting of crashes from everybody — support
contracts or not — makes sense for a variety of reasons. It gets VSI a
whole lot of useful data.
--
Pure Personal Opinion | HoffmanLabs LLC