[Info-vax] "The Machine"
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Sun Aug 30 17:29:57 EDT 2015
On 2015-08-30 20:28:28 +0000, IanD said:
> Seems this machine idea is focusing on getting the core component of
> memory and in particular, extremely large amounts of shared memory
> working and from there you'll be able to plug in other OS's eventually.
> Interesting.
Maintaining cache coherency across large amounts of memory and lots of
cores limits how far a server can scale. The longer your wiring
connections, the more limited the maximum performance over those links.
Plus, the more cores and the more memory you have, the more
hardware-level overhead is involved in keeping all the caches coherent,
or in reloading data.
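As a rough user-level sketch of that coherency cost (nothing
OpenVMS-specific here; the thread setup, iteration count and the
assumed 64-byte line size are just illustrative), two threads hammering
counters that share a cache line will ping-pong that line between
cores, while padding the counters onto separate lines makes the traffic
go away:

/* false_sharing.c -- illustrative sketch of cache-coherency overhead.
 * Build: cc -O2 -pthread false_sharing.c -o false_sharing
 */
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define ITERS 100000000UL

/* Two counters in the same cache line: every write by one core
 * invalidates the line in the other core's cache. */
struct { volatile uint64_t a, b; } shared_line;

/* Same counters padded onto separate 64-byte lines: no ping-pong. */
struct { volatile uint64_t a; char pad[56]; volatile uint64_t b; } padded;

static void *bump(void *p) {
    volatile uint64_t *c = p;
    for (uint64_t i = 0; i < ITERS; i++)
        (*c)++;
    return NULL;
}

static double run(volatile uint64_t *x, volatile uint64_t *y) {
    pthread_t t1, t2;
    struct timespec t0, t3;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_create(&t1, NULL, bump, (void *)x);
    pthread_create(&t2, NULL, bump, (void *)y);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t3);
    return (t3.tv_sec - t0.tv_sec) + (t3.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void) {
    printf("same cache line: %.2f s\n", run(&shared_line.a, &shared_line.b));
    printf("padded lines:    %.2f s\n", run(&padded.a, &padded.b));
    return 0;
}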
OpenVMS expects certain things from the underlying hardware as part of
SMP support, and having a consistent view is among those expectations.
History: VAX ASMP servers didn't have and didn't need a consistent view
of the hardware. (Yes, there are some VAX systems where the console is
connected to the primary or such; where there's a slight asymmetry to
the server hardware.) But ASMP performance was limited by this, and
code requiring inner-mode processing had to be kicked over to the
primary. This could lead to flailing, and to the use of a tool such as
"qualify" to see whether the system activity would benefit from ASMP.
The addition of SMP support at VAX/VMS V5.0 redesigned that older
computing environment, though. Different hardware requirements from
ASMP, but with benefits from those changes.
Now the downside is that the more processors or the more cores you add,
the more likely you are to end up stuck in MPSYNCH. There are various
cases where adding cores reduces aggregate performance, too.
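As a toy analogue of that synchronization cost (a sketch assuming one
heavily-shared lock; the thread counts and iteration totals are made up
for the example), a fixed amount of work funneled through a single
spinlock doesn't speed up as threads are added, and often slows down:

/* spin_contention.c -- toy analogue of time lost to MP synchronization:
 * a fixed amount of work behind one spinlock, split across N threads.
 * Build: cc -O2 -pthread spin_contention.c -o spin_contention
 */
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define TOTAL 40000000UL   /* total increments, divided among the threads */

static pthread_spinlock_t lock;
static volatile uint64_t counter;

static void *worker(void *arg) {
    uint64_t n = (uintptr_t)arg;
    for (uint64_t i = 0; i < n; i++) {
        pthread_spin_lock(&lock);   /* every core funnels through here */
        counter++;
        pthread_spin_unlock(&lock);
    }
    return NULL;
}

static double run(unsigned nthreads) {
    pthread_t tid[64];
    struct timespec t0, t1;
    counter = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker,
                       (void *)(uintptr_t)(TOTAL / nthreads));
    for (unsigned i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void) {
    pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
    for (unsigned n = 1; n <= 8; n *= 2)
        printf("%u thread(s): %.2f s\n", n, run(n));
    pthread_spin_destroy(&lock);
    return 0;
}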
> This could stem the tide of everyone being shoehorned into a linux
> existence if other OS's learn to play nice on this system when complete?
Linux already runs on very big configurations. Very, very big.
Configurations vastly larger than what OpenVMS can deal with.
> This is a different idea to my idea of reworking VMS clusters to
> support virtual VMS cluster, the Machine could potentially create
> systems that are a hell of a lot faster
The coordination inherent in OpenVMS-style clustering means you can't
get big clusters.
> I was looking at some other related ideas a while ago around
> distributed memory and got me to thinking yet again about how VMS could
> expand its cluster vision
>
> I'm an advocate of a virtual cluster type of arrangement, where you can
> combine nodes or even clusters into a virtual cluster and share
> resources across them, including memory so that a virtual process is a
> silhouette for a real compute process that sits out there 'somewhere'
> in a real cluster with resources mapped against the entire virtual
> cluster that it is part of, so that it can use those resources as though
> it was attached to an infinitely large machine. The HP Machine idea is
> in some ways similar, except very tightly integrated
RDMA. Or remote access via NVMe. Newer-better-faster versions of what
Memory Channel and reflective memory were supposed to provide, in a
manner of speaking. One recent demo reportedly saw 1.5 to 2.5
microseconds of latency from a server to a remote server's non-volatile
memory, via NVMe and a fast network connection.
> While we are at it, why limit VMS clusters to VMS? The clustering
> component once ported to x86-64 could be potentially licensed to run on
> other OS's, in time. It's not so much the underlying OS that gets people
> excited IMO, it's how to tie them together in larger and larger bundles
> to handle larger and larger workloads, and to do it seamlessly, that
> seems to me to be what's exciting
Might want to have a look at Mesos and Spark, and related packages, and
at what's already possible with thousands of servers in a cluster,
then. Big Unix clusters — way beyond the supported limit of 96 hosts
for OpenVMS clusters — are not at all rare.
> I'm of the opinion VMS clusters need an overhaul (in time of course,
> gotta get x86-64 happening first). Linux clusters have usurped the
> cluster name, so VMS needs to reinvent what clustering can do
OpenVMS clustering doesn't scale all that large, by present-day
standards. It's a good choice for a range of solutions, but if you
need bigger installations, you'll need to rethink things, not the
least of which is coherency, and whether and how you're going to divide
up your data and your processing. Then there are discussions of
redundancy and upgrades and other necessities.
> Is this Machine, the ultimate future of enterprise computing?
HPE is supposedly betting the company on it, per some reports:
<http://www.freerepublic.com/focus/f-news/3167472/posts>. Needing a
new and wholly rethought operating system is a huge bet here, too.
That's inherently going to slow adoption, in the absence of some very
enticing incentives: price, scale, performance, etc.
> Ultra-large memory slabs like a giant VMware system that allows all
> those individual OS's running on it to finally be able to share their
> memory at speeds that don't cripple them?
>
> Once you go down this path, what next? Concepts like data sharing
> through file interface agreements, or exchange formats or even network
> traffic between machines all go away if one can share memory at speed.
> The possibilities really start to expand
Add in non-volatile memory much closer to the processors (3D XPoint,
etc.), which means you don't need to fuss with the complexity and (lack
of) performance of the traditional I/O stack and PCIe-based storage.
Rebooting becomes much closer to that of a VAX-11/780 with battery
backup (the origin of the "restart" setting in various consoles), or to
that of a smartphone, than to what most folks are currently familiar
with. You have to remember to clear or set the memory appropriately
when you shut down, too.
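A sketch of what "skip the traditional I/O stack" looks like to
application code, assuming a file sitting on byte-addressable
persistent memory behind a DAX-capable filesystem; the path and size
are placeholders, and real persistent-memory code would use MAP_SYNC or
a library such as libpmem rather than plain msync():

/* pmem_note.c -- sketch: update data in persistent memory with loads
 * and stores, no read()/write() through the block I/O stack.
 * Assumes /mnt/pmem is a DAX-capable filesystem; path and size are
 * placeholders for the example.
 * Build: cc -O2 pmem_note.c -o pmem_note
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096

int main(void) {
    int fd = open("/mnt/pmem/note.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, REGION_SIZE) != 0) { perror("ftruncate"); return 1; }

    /* Map the persistent region straight into the address space. */
    char *p = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Update it with ordinary stores -- no write() system call,
     * no block-device queue, no file-cache copy. */
    strcpy(p, "state that survives a power cycle");

    /* Force the stores out to the persistence domain.  On real pmem
     * you'd map with MAP_SYNC (or use libpmem's flush primitives);
     * msync() is the portable stand-in for this sketch. */
    if (msync(p, REGION_SIZE, MS_SYNC) != 0) { perror("msync"); return 1; }

    munmap(p, REGION_SIZE);
    close(fd);
    return 0;
}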
As for the performance of the I/O stack, Linux network I/O is faster
than TCP/IP Services. Much faster. Existing software stacks such as
OpenGL just aren't fast enough, hence the introduction of Vulkan.
Overhauls and replacements elsewhere in the OpenVMS operating system
I/O stack are necessary, too.
> If I understand this correctly, computing becomes almost parallel in
> nature again, e.g. if I have an image of a brain scan of a patient
> loaded into memory on one system that did the capture, now I have
> another machine or machines scanning that image in shared memory
> looking for brain abnormalities, at the same time I am archiving that
> image to long term storage at the same time my billing system is
> updating. So the slowest pipe becomes getting the data into memory but
> once it is there, a whole world of parallel possibilities open up. This
> is interesting indeed.
We're already in that world. One socket gets you 72 cores, with the
upcoming Knights Landing. HPE Apollo gets you some serious computes.
Lower-end server and computer designs have been somewhat slower to
shift, though prices for the faster gear and faster designs are
starting to come down. SSDs are an intermediate step here that most
everybody is familiar with — they didn't require much of a change to
software or servers. But changes are coming, and well beyond swapping
out "disks" for "much faster disks".
> VMS had better shake a leg if this is the type of thing it's going to
> have to compete against...
I'm looking increasingly at how little time is required for certain
operations at the hardware level (microsecond-range speeds) and how
long it takes to thread that same data through the whole software-stack
morass. How are the existing I/O stacks going to be optimized or
overhauled? This goes far past clustering, too. If you're running
hardware capable of performing some operation in one or two
microseconds, then, outside of bazillions of operations, going to
faster hardware probably won't help. You have to look at making the
application and the system software faster, if that's where you're
spending the bulk of your time.
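A quick way to see that gap for yourself (a sketch; the 4 KiB payload,
the iteration count and the scratch-file path are arbitrary): time the
same payload moved by a bare memory copy and then pushed through the
file-system stack with pwrite() plus fsync(), and compare the
per-operation microseconds:

/* stack_cost.c -- sketch: one 4 KiB payload moved by a memory copy,
 * then pushed through the file-system/I/O stack with pwrite()+fsync().
 * Payload size, iteration count and scratch-file path are arbitrary.
 * Build: cc -O2 stack_cost.c -o stack_cost
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define PAYLOAD 4096
#define ITERS   1000

static double now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void) {
    static char src[PAYLOAD], dst[PAYLOAD];
    unsigned long sum = 0;

    /* Path 1: a bare memory-to-memory move of the payload. */
    double t0 = now();
    for (int i = 0; i < ITERS; i++) {
        src[0] = (char)i;                  /* vary the payload slightly */
        memcpy(dst, src, PAYLOAD);
        sum += (unsigned char)dst[0];      /* keep the copies observable */
    }
    double mem_us = (now() - t0) * 1e6 / ITERS;

    /* Path 2: the same payload through the file system, forced to media. */
    int fd = open("scratch.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    double t1 = now();
    for (int i = 0; i < ITERS; i++) {
        if (pwrite(fd, src, PAYLOAD, 0) != PAYLOAD) { perror("pwrite"); return 1; }
        fsync(fd);
    }
    double io_us = (now() - t1) * 1e6 / ITERS;
    close(fd);
    unlink("scratch.dat");

    printf("memcpy path:   %8.3f us per %d bytes (checksum %lu)\n",
           mem_us, PAYLOAD, sum);
    printf("pwrite+fsync:  %8.3f us per %d bytes\n", io_us, PAYLOAD);
    return 0;
}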
--
Pure Personal Opinion | HoffmanLabs LLC