[Info-vax] "The Machine"
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Sun Aug 30 17:29:57 EDT 2015
On 2015-08-30 20:28:28 +0000, IanD said:
> Seems this machine idea is focusing on getting the core component of
> memory and in particular, extremely large amounts of shared memory
> working and from there you'll be able to plug in other OS's eventually.
> Interesting.
Maintaining cache coherency across large amounts of memory and lots of
cores limits how far a server can scale. The longer your wiring
connections, the more limited the maximum performance over those links.
Plus, the more cores and the more memory you have, the more
hardware-level overhead is involved in keeping all the caches coherent,
or in reloading data.
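As a rough user-level sketch of that coherency cost (nothing
OpenVMS-specific here; the thread setup, iteration count and the
assumed 64-byte line size are just illustrative), two threads hammering
counters that share a cache line will ping-pong that line between
cores, while padding the counters onto separate lines makes the traffic
go away:

/* false_sharing.c -- illustrative sketch of cache-coherency overhead.
 * Build: cc -O2 -pthread false_sharing.c -o false_sharing
 */
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define ITERS 100000000UL

/* Two counters in the same cache line: every write by one core
 * invalidates the line in the other core's cache. */
struct { volatile uint64_t a, b; } shared_line;

/* Same counters padded onto separate 64-byte lines: no ping-pong. */
struct { volatile uint64_t a; char pad[56]; volatile uint64_t b; } padded;

static void *bump(void *p) {
    volatile uint64_t *c = p;
    for (uint64_t i = 0; i < ITERS; i++)
        (*c)++;
    return NULL;
}

static double run(volatile uint64_t *x, volatile uint64_t *y) {
    pthread_t t1, t2;
    struct timespec t0, t3;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_create(&t1, NULL, bump, (void *)x);
    pthread_create(&t2, NULL, bump, (void *)y);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t3);
    return (t3.tv_sec - t0.tv_sec) + (t3.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void) {
    printf("same cache line: %.2f s\n", run(&shared_line.a, &shared_line.b));
    printf("padded lines:    %.2f s\n", run(&padded.a, &padded.b));
    return 0;
}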
OpenVMS expects certain things from the underlying hardware as part of
SMP support, and having a consistent view is among those expectations.
History: VAX ASMP servers didn't have and didn't need a consistent view
of the hardware. (Yes, there are some VAX systems where the console is
connected to the primary or such; where there's a slight asymmetry to
the server hardware.) But ASMP performance was limited by this, and
code requiring inner-mode processing had to be kicked over to the
primary. This could lead to flailing, and to the use of a tool such as
"qualify" to see whether the system activity would benefit from ASMP.
The addition of SMP support at VAX/VMS V5.0 redesigned that older
computing environment, though. Different hardware requirements from
ASMP, but with benefits from those changes.
Now the downside is that the more processors or the more cores you add,
the more likely you are to end up stuck in MPSYNCH. There are various
cases where adding cores reduces aggregate performance, too.
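As a toy analogue of that synchronization cost (a sketch assuming one
heavily-shared lock; the thread counts and iteration totals are made up
for the example), a fixed amount of work funneled through a single
spinlock doesn't speed up as threads are added, and often slows down:

/* spin_contention.c -- toy analogue of time lost to MP synchronization:
 * a fixed amount of work behind one spinlock, split across N threads.
 * Build: cc -O2 -pthread spin_contention.c -o spin_contention
 */
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define TOTAL 40000000UL   /* total increments, divided among the threads */

static pthread_spinlock_t lock;
static volatile uint64_t counter;

static void *worker(void *arg) {
    uint64_t n = (uintptr_t)arg;
    for (uint64_t i = 0; i < n; i++) {
        pthread_spin_lock(&lock);   /* every core funnels through here */
        counter++;
        pthread_spin_unlock(&lock);
    }
    return NULL;
}

static double run(unsigned nthreads) {
    pthread_t tid[64];
    struct timespec t0, t1;
    counter = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker,
                       (void *)(uintptr_t)(TOTAL / nthreads));
    for (unsigned i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void) {
    pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
    for (unsigned n = 1; n <= 8; n *= 2)
        printf("%u thread(s): %.2f s\n", n, run(n));
    pthread_spin_destroy(&lock);
    return 0;
}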
> This could stem the tide of everyone being shoehorned into a linux
> existence if other OS's learn to play nice on this system when complete?
Linux already runs on very big configurations. Very, very big.
Configurations vastly larger than what OpenVMS can deal with.
> This is a different idea to my idea of reworking VMS clusters to
> support virtual VMS cluster, the Machine could potentially create
> systems that are a hell of a lot faster
The coordination inherent in OpenVMS-style clustering means you can't
get big clusters.
> I was looking at some other related ideas a while ago around
> distributed memory and got me to thinking yet again about how VMS could
> expand its cluster vision
>
> I'm an advocate of a virtual cluster type of arrangement, where you can
> combine nodes or even clusters into a virtual cluster and share
> resources across them, including memory so that a virtual process is a
> silhouette for a real compute process that sits out there 'somewhere'
> in a real cluster with resources mapped against the entire virtual
> cluster that it is part of, so that it can use those resources as though
> it was attached to an infinitely large machine. The HP Machine idea is
> in some ways similar, except very tightly integrated
RDMA. Or remote access via NVMe. Newer-better-faster versions of what
Memory Channel and reflective memory were supposed to provide, in a
manner of speaking. One recent demo reportedly saw 1.5 to 2.5
microseconds of latency from a server to a remote server's non-volatile
memory, via NVMe and a fast network connection.
> While we are at it, why limit VMS clusters to VMS? The clustering
> component once ported to x86-64 could be potentially licensed to run on
> other OS's, in time. It's not so much the underlying OS that gets people
> excited IMO, it's how to tie them together in larger and larger bundles
> to handle larger and larger workloads, and to do it seamlessly, that
> seems to me to be what's exciting
Might want to have a look at Mesos and Spark, and related packages, and
at what's already possible with thousands of servers in a cluster,
then. Big Unix clusters — way beyond the supported limit of 96 hosts
for OpenVMS clusters — are not at all rare.
> I'm of the opinion VMS clusters need an overhaul (in time of course,
> gotta get x86-64 happening first). Linux clusters have usurped the
> cluster name, so VMS needs to reinvent what clustering can do
OpenVMS clustering doesn't scale all that large, by present-day
standards. It's a good choice for a range of solutions, but if you
need bigger installations, you'll need to rethink things, not the
least of which is coherency, and whether and how you're going to divide
up your data and your processing. Then there are discussions of
redundancy and upgrades and other necessities.
> Is this Machine, the ultimate future of enterprise computing?
HPE is supposedly betting the company on it, per some reports:
<http://www.freerepublic.com/focus/f-news/3167472/posts>. Needing a
new and wholly rethought operating system is a huge bet here, too.
That's inherently going to slow adoption, in the absence of some very
enticing incentives: price, scale, performance, etc.
> Ultra-large memory slabs like a giant VMware system that allows all
> those individual OS's running on it to finally be able to share their
> memory at speeds that don't cripple them?
>
> Once you go down this path, what next? Concepts like data sharing
> through file interface agreements, or exchange formats or even network
> traffic between machines all go away if one can share memory at speed.
> The possibilities really start to expand
Add in non-volatile memory much closer to the processors (3D XPoint,
etc.), which means you don't need to fuss with the complexity and (lack
of) performance of the traditional I/O stack and PCIe-based storage.
Rebooting becomes much closer to that of a VAX-11/780 with battery
backup (the origin of the "restart" setting in various consoles), or to
that of a smartphone, than to what most folks are currently familiar
with. You have to remember to clear or set the memory appropriately
when you shut down, too.
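A sketch of what "skip the traditional I/O stack" looks like to
application code, assuming a file sitting on byte-addressable
persistent memory behind a DAX-capable filesystem; the path and size
are placeholders, and real persistent-memory code would use MAP_SYNC or
a library such as libpmem rather than plain msync():

/* pmem_note.c -- sketch: update data in persistent memory with loads
 * and stores, no read()/write() through the block I/O stack.
 * Assumes /mnt/pmem is a DAX-capable filesystem; path and size are
 * placeholders for the example.
 * Build: cc -O2 pmem_note.c -o pmem_note
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096

int main(void) {
    int fd = open("/mnt/pmem/note.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, REGION_SIZE) != 0) { perror("ftruncate"); return 1; }

    /* Map the persistent region straight into the address space. */
    char *p = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Update it with ordinary stores -- no write() system call,
     * no block-device queue, no file-cache copy. */
    strcpy(p, "state that survives a power cycle");

    /* Force the stores out to the persistence domain.  On real pmem
     * you'd map with MAP_SYNC (or use libpmem's flush primitives);
     * msync() is the portable stand-in for this sketch. */
    if (msync(p, REGION_SIZE, MS_SYNC) != 0) { perror("msync"); return 1; }

    munmap(p, REGION_SIZE);
    close(fd);
    return 0;
}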
As for the performance of the I/O stack, Linux network I/O is faster
than TCP/IP Services. Much faster. Existing software stacks such as
OpenGL just aren't fast enough, hence the introduction of Vulkan.
Overhauls and replacements elsewhere in the OpenVMS operating system
I/O stack are necessary, too.
> If I understand this correctly, computing becomes almost parallel in
> nature again, e.g. if I have an image of a brain scan of a patient
> loaded into memory on one system that did the capture, now I have
> another machine or machines scanning that image in shared memory
> looking for brain abnormalities, at the same time I am archiving that
> image to long term storage at the same time my billing system is
> updating. So the slowest pipe becomes getting the data into memory but
> once it is there, a whole world of parallel possibilities open up. This
> is interesting indeed.
We're already in that world. One socket gets you 72 cores, with the
upcoming Knights Landing. HPE Apollo gets you some serious computes.
Lower-end server and computer designs have been somewhat slower to
shift, though prices for the faster gear and faster designs are
starting to come down. SSDs are an intermediate step here that most
everybody is familiar with — they didn't require much of a change to
software or servers. But changes are coming, and well beyond swapping
out "disks" for "much faster disks".
> VMS had better shake a leg if this is the type of thing it's going to
> have to compete against...
I'm looking increasingly at how little time is required for certain
operations at the hardware level (microsecond-range speeds) and how
long it takes to thread that same data through the whole software-stack
morass. How are the existing I/O stacks going to be optimized or
overhauled? This goes far past clustering, too. If you're running
hardware capable of performing some operation in one or two
microseconds, then, outside of bazillions of operations, going to
faster hardware probably won't help. You have to look at making the
application and the system software faster, if that's where you're
spending the bulk of your time.
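A quick way to see that gap for yourself (a sketch; the 4 KiB payload,
the iteration count and the scratch-file path are arbitrary): time the
same payload moved by a bare memory copy and then pushed through the
file-system stack with pwrite() plus fsync(), and compare the
per-operation microseconds:

/* stack_cost.c -- sketch: one 4 KiB payload moved by a memory copy,
 * then pushed through the file-system/I/O stack with pwrite()+fsync().
 * Payload size, iteration count and scratch-file path are arbitrary.
 * Build: cc -O2 stack_cost.c -o stack_cost
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define PAYLOAD 4096
#define ITERS   1000

static double now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void) {
    static char src[PAYLOAD], dst[PAYLOAD];
    unsigned long sum = 0;

    /* Path 1: a bare memory-to-memory move of the payload. */
    double t0 = now();
    for (int i = 0; i < ITERS; i++) {
        src[0] = (char)i;                  /* vary the payload slightly */
        memcpy(dst, src, PAYLOAD);
        sum += (unsigned char)dst[0];      /* keep the copies observable */
    }
    double mem_us = (now() - t0) * 1e6 / ITERS;

    /* Path 2: the same payload through the file system, forced to media. */
    int fd = open("scratch.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    double t1 = now();
    for (int i = 0; i < ITERS; i++) {
        if (pwrite(fd, src, PAYLOAD, 0) != PAYLOAD) { perror("pwrite"); return 1; }
        fsync(fd);
    }
    double io_us = (now() - t1) * 1e6 / ITERS;
    close(fd);
    unlink("scratch.dat");

    printf("memcpy path:   %8.3f us per %d bytes (checksum %lu)\n",
           mem_us, PAYLOAD, sum);
    printf("pwrite+fsync:  %8.3f us per %d bytes\n", io_us, PAYLOAD);
    return 0;
}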
--
Pure Personal Opinion | HoffmanLabs LLC