[Info-vax] New filesystem mentioned
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Wed May 15 12:50:22 EDT 2019
On 2019-05-15 12:31:28 +0000, Bob Gezelter said:
>> That's the point of an ACP. (And the problem with an ACP.) All I/O
>> for all users goes through a single process. There is no need for the
>> DLM.
If by "single process" you meant single path, that's correct. ACPs on
OpenVMS aren't usually in the primary path for an I/O, as passing data
out of kernel and back in is a performance problem.
Designs such as this are also used with clustering. For a clustering
solution such as Xsan or various similar apps, one host coordinates the
activities, and that host is effectively the DLM. In Apple
terminology, this host is the metadata controller (MDC). Xsan uses a
primary-secondary design, with any secondary MDCs present mirroring the
primary. This isn't far off of how the OpenVMS cluster connection
manager operates.
>> The ACP is responsible for its caching. The ACP is a weird mixture of
>> process context and device driver context. It has access to the
>> various synchronization mechanisms available to both processes and
>> drivers. It can use the DLM, but it does not have to.
Correct. And it's definitely a weird mix. The only doc is from
existing examples, and from Jamie Hanrahan's Advanced VMS Device Driver
Techniques book. I used the former approach to learn from and to write
my first ACP. Do *not* use NETACP as your template example. NETACP is
weird even for an ACP. Best to look at the magtape ACP. As for the
book, it's pretty good but does have a few errors. It's also long out
of print. There are some example ACPs on the Freeware, for those who
don't have access to the source listings. (Whether VSI will be
generating source listings, assuming VSI has sufficient rights to offer
them, is an open question. But I digress.)
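For those who haven't poked at the lock manager directly, here is a
rough, untested sketch of taking and releasing a lock via $ENQW and
$DEQ from process context, which is the same path an ACP can use if it
chooses to involve the DLM. The resource name and the hand-rolled lock
status block are purely illustrative.

/* Sketch: acquire and release a DLM lock with $ENQW / $DEQ.
 * Assumes the DEC C / VSI C compiler and starlet on OpenVMS.     */
#include <descrip.h>
#include <lckdef.h>
#include <starlet.h>
#include <stdio.h>

/* Lock status block layout, per the $ENQ documentation.          */
struct lksb {
    unsigned short status;      /* completion status              */
    unsigned short reserved;
    unsigned int   lock_id;     /* lock ID, needed later for $DEQ */
    unsigned char  valblk[16];  /* lock value block (optional)    */
};

int main(void)
{
    /* Illustrative resource name; real code uses a naming
     * convention agreed on by all of the cooperating processes.  */
    $DESCRIPTOR(resnam, "HOFFMANLABS_EXAMPLE_RESOURCE");
    struct lksb lksb = { 0 };
    int status;

    /* Queue and wait for an exclusive-mode lock on the resource. */
    status = sys$enqw(0, LCK$K_EXMODE, (void *) &lksb, 0, &resnam,
                      0, 0, 0, 0, 0, 0, 0);
    if (!(status & 1) || !(lksb.status & 1)) {
        printf("lock request failed: %d / %d\n", status, lksb.status);
        return status;
    }

    /* ... critical section: touch the shared structure here ...  */

    /* Release the lock. */
    status = sys$deq(lksb.lock_id, 0, 0, 0);
    return status;
}

Blocking ASTs and the lock value block are where the more interesting
coordination patterns come in, but the basic enqueue and dequeue is
about that simple.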
> With all due respect, I do not agree with this characterization of an ACP.
>
> An ACP does not do away with the need for a DLM. An OpenVMS cluster
> running shared volumes requires DLM functionality, as there is more
> than one ACP per volume accessing the on-disk file structure data
> structures. Orthogonally, the DLM is used for coordinating RMS-level
> intra-file activity. The bright-line difference between an ACP and the
> FILES-11 Level 2/5 XQP is that the XQP operates at inner-mode(s) within
> the requesting process rather than a separate process as is the case
> with an ACP.
There are lots of ways for systems and apps to cluster. The classic
OpenVMS design is not the only way, and the OpenVMS design does have
some issues. As for other approaches, Linux has integrated DLMs and
APIs from Red Hat and from Oracle. Apple uses MDCs. Oracle layered
clustering support atop ZFS. Apps can be and commonly are coded to
cluster themselves, often using existing support such as Hadoop, too. Etc.
And a whole lot of folks can and do use a high-availability design with
failover, whether it's a file server that's failing over, or a database.
On OpenVMS, ACPs are usually mediating the user-mode requirements of a
local device, though nothing here precludes a metadata-controller
design that routes some of the remote storage access through a host;
for various reasons, that I/O centralization can be simpler than
spreading everything out, the complexity of the coordination included.
Communications among hosts would be via user mode, or more likely via
the kernel communications driver interface known as VCI.
Long ago, OpenVMS development prototyped something that could have been
used here, too. That project was called QIOserver. Getting this stuff
right is Not Easy, which is why most folks use an existing
implementation, be it the OpenVMS DLM, DLM-like support in another
operating system, or an available add-on integrated with the app.
XQPs were centrally a performance optimization over ACPs: by mapping
the relevant code into each user process, they avoid switching between
processes, something OpenVMS was classically poor at (that whole
process-based design didn't really become competitive until roughly
L4). AFAIK, there's little difference in the I/O activity that
traverses across modes, though not having to copy buffers into and out
of a separate process is beneficial. XQPs are Not Fun to debug. ACPs
are definitely easier in that regard for the user-mode chunk of their
debugging, and the ACP debugging setup I have been using for many years
uses the debugger and a remote DECterm. Kernel-mode debugging is still
XDELTA (an old DEC enet host with that name was almost always booted
with XDELTA loaded, too) or, more recently, the System Code Debugger.
> The implementation of the file system auxiliary processing as either
> intra-driver, ACP, or XQP is transparent to the user. The actual
> interface provided by an ACP/XQP is described in the OpenVMS IO User's
> Manual.
In the OpenVMS design, yes: that, plus some undocumented shenanigans
around mounting and dismounting to make the volumes accessible. Mount
and dismount is another area of OpenVMS that's a dog's breakfast,
particularly if you're using an ACP with a device that isn't
file-oriented, as I and others have done. This area is also a mess
around USB removable device support, and we'll probably eventually see
some improvements here.
> Could a linux-style VFS-like library be implemented on OpenVMS? Could
> such a library framework be implemented as an XQP? Likely the answer to
> both questions is "Yes". As usual, the devil is in the details.
If VSI were to consider adding a FUSE layer, it'd almost inherently
involve reworking or rewriting the existing XQP to play in the new
environment, as well as work on mount and dismount and other services.
Obvious candidates here would include reworking the NFS client,
reworking and updating the ODS-3 and ODS-4 ISO-9660 support, reworking
or replacing the EFI FAT support (whether that ends up in a FUSE layer
or in something akin to the XQP), and adding an SMB client. NFS is one
of the few add-on file system clients that exist for OpenVMS, too. Not
that the IP stack should be separately packaged and separately
installed, but it is.
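For anyone who hasn't seen the FUSE side of this, here's a minimal
libfuse 3 skeleton (Linux flavor), purely to show the sort of
user-space file system API being discussed. The hello_* names and the
single read-only /hello file are invented for the example, and nothing
like this exists on OpenVMS today.

/* Minimal libfuse 3 sketch exposing one read-only file, /hello.
 * Build (Linux): cc hello_fuse.c `pkg-config fuse3 --cflags --libs`  */
#define FUSE_USE_VERSION 31
#include <fuse.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

static const char *hello_text = "hello from user space\n";

/* Report attributes for the root directory and the one file.     */
static int hello_getattr(const char *path, struct stat *st,
                         struct fuse_file_info *fi)
{
    (void) fi;
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, "/hello") == 0) {
        st->st_mode = S_IFREG | 0444;
        st->st_nlink = 1;
        st->st_size = strlen(hello_text);
    } else
        return -ENOENT;
    return 0;
}

/* List the single directory.                                     */
static int hello_readdir(const char *path, void *buf,
                         fuse_fill_dir_t filler, off_t offset,
                         struct fuse_file_info *fi,
                         enum fuse_readdir_flags flags)
{
    (void) offset; (void) fi; (void) flags;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0, 0);
    filler(buf, "..", NULL, 0, 0);
    filler(buf, "hello", NULL, 0, 0);
    return 0;
}

/* Hand back file data.                                            */
static int hello_read(const char *path, char *buf, size_t size,
                      off_t offset, struct fuse_file_info *fi)
{
    size_t len = strlen(hello_text);
    (void) fi;
    if (strcmp(path, "/hello") != 0)
        return -ENOENT;
    if ((size_t) offset >= len)
        return 0;
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, hello_text + offset, size);
    return (int) size;
}

static const struct fuse_operations hello_ops = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    /* Mount point comes from argv, e.g. ./hello_fuse /mnt/hello  */
    return fuse_main(argc, argv, &hello_ops, NULL);
}

The point of interest is that all of those callbacks run in an
ordinary user process, which is where the debuggability and resilience
arguments come from.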
> An ACP process failing is bad, but does not always lead to a kernel
> fault. An XQP-like component, operating in kernel mode, will almost
> always cause a crash. Which is better? Mileage varies. Depends.
OpenVMS is a monolithic kernel. ACPs are part of the kernel. So too
is the XQP. Though even an L4-like operating system design can
certainly get tangled up and tip over. The FUSE API approach, though,
is deliberately intended to be resilient, to allow the file system
code to tip over without taking the rest of the system with it, and to
be _much_ easier to debug, and much easier to add to.
The whole of the OpenVMS I/O subsystem design is far too trusting of
the underlying hardware, though that's a different problem. That'll
prolly all get ignored for a while due to the "servers are isolated in
server rooms" view, at least until hosting becomes part of the
discussion. Messes can arise even with private hosting, though that's
going to be a little less common initially. There are obvious
problems with untrusted file system mounts, whether local or remote.
There's quite a bit of research happening on contending with
firmware-level persistence and exploits too, with Amazon Nitro and
other work underway. For an intro to some of the issues that can arise
here beyond intentionally-corrupted USB devices:
https://www.youtube.com/watch?v=PEVVRkd-wPM
TL;DR: The availability of VAFS should resolve some of the major issues
that folks with ODS-2 and ODS-5 are encountering. There's yet more
work awaiting beyond VAFS. It'll be some years before VSI starts
addressing other issues latent in the I/O subsystem. That work never
ends. FUSE and replacing the current ACP and XQP design and
documenting it all is probably five or ten years out at best, and
associated with no small investment in renovating and updating related
parts of the kernel.
--
Pure Personal Opinion | HoffmanLabs LLC