[Info-vax] PowerX Roadmap - Extended beyond 2020

Thu Sep 15 16:53:19 EDT 2016

> -----Original Message-----
> From: Info-vax [mailto:info-vax-bounces at rbnsn.com] On Behalf
> Of IanD via Info-vax
> Sent: 15-Sep-16 1:49 PM
> To: info-vax at rbnsn.com
> Cc: IanD <iloveopenvms at gmail.com>
> Subject: Re: [Info-vax] PowerX Roadmap - Extended beyond 2020
> 
> On Sunday, September 11, 2016 at 7:55:13 AM UTC+10, David
> Froble wrote:
> > IanD wrote:
> >
> > > Not sure where OpenVMS is going to fit in the IoT picture,
> > > it's not lean enough or its file system not quick enough to
> > > act as a data collector. Maybe as an aggregator?
> >
> > You've stated things like this in the past.  You got any
> > citations, facts, or such to back up your statements?
> >
> > While I don't have any specifics, I remember reading years
> > ago how the size of VMS compared to weendoze, and the
> > comparison was rather favorable for VMS.  Much smaller
> > footprint.
> >
> > With the memory available today, I'm not sure how much a
> > difference in footprint matters, in comparison to
> > capabilities.
> >
> > You also got to differentiate between the OS and the
> > utilities that come with it.  In an embedded situation, much
> > of the utilities would perhaps not be included.
> >
> > When you mention "file system", are you really referring to
> > the file system, or to RMS?  I can do some rather fast I/O on
> > VMS.
> 
> When one looks at things like the Gartner report on IoT for
> 2017 - 2018, the power requirements of devices are going to
> need to be extremely low
>
> As to the slowness of VMS file systems, I was referring to RMS
> since that is the layer most people work with and I doubt
> anyone that is developing for IoT is going to bother with
> anything lower level than the native file system on the device
> / OS they are implementing on, especially since there are
> already protocols and libraries out there that people are
> leveraging for software development (which will still have to
> be ported to OpenVMS if OpenVMS is going to participate)
> 

RMS is not as slow as one might think. It can be very fast if you
understand the application and use direct I/O.

Relational DBs usually have an internal function called a query
optimizer. It receives a query, then determines whether the query
should be executed in index or sequential mode. There is overhead
associated with this, and the output of the query optimizer is
not always correct (logic errors in the query, or optimizer
bugs), so a query that should use an index might incorrectly go
sequential. This is the classic case where a query that normally
takes 5 seconds suddenly takes over a minute. The symptom is a
much higher than normal number of DB I/Os.
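
To make the optimizer point concrete, here is a toy sketch in
Python. It is not how any real DB engine costs its plans; the
function name, the cost constants and the row counts are all made
up for illustration. It only shows how a wrong estimate of the
number of matching rows flips the choice from index to sequential
and multiplies the I/O count.

```python
import math

# Toy cost model. Every name and constant here is an assumption
# made for illustration, not taken from any real optimizer.
def choose_plan(table_rows, estimated_matches,
                rows_per_block=100, ios_per_index_lookup=3):
    """Return (plan, estimated I/Os for each plan)."""
    # Sequential mode reads every block of the table once.
    sequential_ios = math.ceil(table_rows / rows_per_block)
    # Index mode pays a few I/Os per row it expects to fetch.
    index_ios = estimated_matches * ios_per_index_lookup
    plan = "index" if index_ios < sequential_ios else "sequential"
    return plan, {"index": index_ios, "sequential": sequential_ios}

# Accurate estimate: 50 matching rows out of 1,000,000.
good_plan, good_cost = choose_plan(1_000_000, 50)

# Wrong estimate (query logic error, optimizer bug): the
# optimizer believes 500,000 rows match.
bad_plan, bad_cost = choose_plan(1_000_000, 500_000)

print(good_plan, good_cost[good_plan])  # index 150
print(bad_plan, bad_cost[bad_plan])     # sequential 10000
```

In this made-up case the same query goes from 150 I/Os to 10,000,
which is the "much higher than normal DB I/Os" symptom.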

Now, the current RMS design has issues with maintenance, online
backups and likely a few other things, so there is a trade-off.

However, why look at addressing future requirements with today's
technology? 

We know there is a new file system coming on OpenVMS and I would
expect quite a few of the current RMS issues to be addressed with
the new design - including better performance.

> I would be extremely surprised if anyone wrote code to go block
> mode I/O on OpenVMS for data capture in the IoT space either
> 
> High transaction rate environments resort to items like
> sharding and distributed DB's like NoSQL Cassandra etc as well
> as other techniques. So far OpenVMS doesn't have anything like
> these technologies to my limited knowledge. At the device level
> the options are striping but then you get hit with lack of
> redundancy which isn't going to fly in most environments and
> even striping isn't going to save you for lots of small data
> writes which is what IoT will be primarily focused on
> 

There are huge load balancing trade-offs with distributed DB
sharding. In a nutshell, you assign different parts of the DB to
specific nodes. Each node can directly update only the part of
the DB it is assigned to.  If one part of the DB becomes a hot
spot that exceeds the capacity of that single node, then your
only options are to replace that server with a bigger system,
re-design and re-partition the DB, or return an error to the
application.
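
The hot spot problem can be sketched in a few lines of Python.
This is a minimal illustration only: the node names, the key
format, the CRC32-based routing and the capacity figure are all
assumptions, not how any particular sharded DB does it.

```python
import zlib
from collections import Counter

# Hypothetical shard owners; each key is updated by exactly one.
NODES = ["node_a", "node_b", "node_c", "node_d"]

def owner(key: str) -> str:
    """Map a key to the single node allowed to update it."""
    return NODES[zlib.crc32(key.encode()) % len(NODES)]

def load_per_node(keys):
    """Count how many writes each node has to absorb."""
    return Counter(owner(k) for k in keys)

def overloaded(load, capacity):
    """Nodes whose write count exceeds their (assumed) capacity."""
    return sorted(n for n, count in load.items() if count > capacity)

# Skewed workload: one account takes 700 of 1000 writes.
writes = ["acct-42"] * 700 + [f"acct-{i}" for i in range(300)]
load = load_per_node(writes)

print(dict(load))
print("over capacity:", overloaded(load, capacity=500))
```

However the other 300 keys happen to hash, the node that owns
"acct-42" gets at least 700 writes while the other three share at
most 300 between them. None of the idle nodes can take over the
hot key, which is exactly the bigger box / re-partition / return
an error choice described above.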

You have to design a sharded DB exceptionally well, so you really
need to understand your workloads. That is the core of the
NonStop world. In their financial world, they understand their
transactions very well, but the 800lb gorilla in every NonStop
environment is the question of what happens if a workload exceeds
the capacity of one node.

A good WP that compares shared everything/disk DB's (OpenVMS,
Linux/GFS, z/OS) vs. shared nothing (Linux, Windows, UNIX,
NonStop) can be found here:
http://www.scaledb.com/wp-content/uploads/2015/11/Shared-Nohing-vs-Shared-Disk-WP_SDvSN.pdf
"Comparing shared-nothing and shared-disk in benchmarks is
analogous to comparing a dragster and a Porsche. The dragster,
like the hand-tuned shared-nothing database, will beat the
Porsche in a straight quarter mile race. However, the Porsche,
like a shared-disk database, will easily beat the dragster on
regular roads. If your selected benchmark is a quarter mile
straightaway that tests all out speed, like Sysbench, a
shared-nothing database will win. However, shared-disk will
perform better in real world environments."

> In time, OpenVMS might participate in some up-stream data
> aggregation but I seriously don't see it acting in the data
> collection part of the spectrum
>
> The sorts of things being looked at for IoT is ballooning and
> the spectrum of what people are wanting to capture data on is
> growing all the time
> 
> It's going way beyond wanting to capture data out of your
> toaster, there is not much of a commercial drive behind wanting
> to know how your toaster performed last night ;-)
> 
> Things like Smart Concrete however and items used in public
> infrastructure are certainly prime targets. Knowing if/when
> public infrastructure like a bridge might collapse or be
> subject to extreme forces etc are of high interest.
>
> Imagine a dam with literally 10's of 1000's of collection
> points embedded in the concrete all sampling and sending their
> data back. You are talking about a lot of small quick data
> packets.
>
> There is a dam not that far from where I live. It's small but
> it's 66 m x 390 m long. If you place a sensor in the concrete
> at say 1 m intervals, you're talking about 25K sensors. If you
> sample at even a paltry 2x's per second, which for embedded
> devices is near in a sleep cycle, that's 50K samples per second
> of data. Can RMS take in data at those rates without issue? 50K
> writers at once?
> 
> http://h20565.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c04618690
> 
> This was an interesting find, this is OpenVMS with SSD support.
> Some of the upper range shown here is below even the modest
> example I made up above for the dam and HP were testing 4K
> writes, not what IoT will be targeting, which will probably be
> under 1K writes. I really think (without proof) that RMS will
> bottleneck quickly, especially in trying to keep its index
> current
> 
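
For reference, the dam arithmetic quoted above can be checked with
a short Python sketch. It assumes the sensors sit on a flat 66 m x
390 m grid at 1 m spacing, which is one reading of the example;
the function and parameter names are made up for illustration.

```python
def dam_sensor_load(height_m=66, length_m=390, spacing_m=1,
                    samples_per_sec=2):
    """Return (sensor count, samples per second) for a flat grid."""
    # One sensor per grid cell of spacing_m x spacing_m.
    sensors = (height_m // spacing_m) * (length_m // spacing_m)
    return sensors, sensors * samples_per_sec

sensors, samples = dam_sensor_load()
print(sensors, samples)  # 25740 51480
```

So the quoted "25K sensors" and "50K samples per second" are the
rounded forms of 25,740 and 51,480.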

See notes above. 

Remember - new file system and other new core things coming. 

We should stop trying to address requirements that are 5+ years
out using today's limitations when we know OpenVMS has a new
engine (file system), new wheels (TCP/IP stack) and a new body
(x86-64) coming in the next 18-24 months.

Also, as pointed out in the WP, one needs to weigh the trade-off
before the system is deployed: being able to load balance I/O
requests across all available back-end systems vs. DB sharding
across many small systems over high-latency LANs (network write
latency vs. local memory or flash disk). Otherwise you end up
dealing with hot spots, unplanned workloads or DR later.

> IoT will drive the whole data / storage industry up another
> notch
>
> We will see the early adopters take the lion's share of the IoT
> space and I happen to think that will be linux yet again :-(, I
> really don't think OpenVMS is in any shape at present to even
> begin to participate, it's having enough fun and games getting
> itself onto x86
>
> The rebuilding of OpenVMS is going to need to address why
> people abandoned the platform in the first place, it's not just
> a lack of x86 support. People are coding for other
> architectures currently and are doing so I think primarily
> because of good porting tools and excellent development
> frameworks and Open source is now not just a nice to have but
> an essential
> 

Open source is indeed another good tool to have on one's tool
belt.  More tools usually makes for a better carpenter.

However, there are trade-offs. Each solution architect
(carpenter) has to review these to determine what tools are right
for their environment. 

In most cases, there will likely be a mix of custom code and open
source.

> On a philosophical front, man seems hell bent on sampling
> everything possible in the hope of controlling his environment
> and ultimately planning his existence. I happen to think it's
> folly to pursue such things to the nth degree but until this
> approach is abandoned then expect IoT to keep getting more wild
> in its hype and promises. I mean if central banks cannot give
> up on their notion of a controlled economy (yeah, how well has
> that been for the planet!), then what hope is there that IoT
> will be de-hyped in the near future? i.e. none!
>

IoT is like Public Cloud, SDN, IT Utility, Adaptive Enterprise,
SOA, Real-Time Enterprise and a host of other industry hype
terms.

There is some truth behind each of these, usually a re-invention
of existing technologies, but the definition of each is left up
to the individual, so in the end you can define these terms as
anything you want.

Regards,

Kerry Main
Kerry dot main at starkgaming dot com