[Info-vax] New filesystem mentioned
Dave Froble
davef at tsoft-inc.com
Tue May 14 12:32:43 EDT 2019
On 5/14/2019 11:19 AM, Stephen Hoffman wrote:
> On 2019-05-14 12:21:12 +0000, Simon Clubley said:
>
>> On 2019-05-14, John Reagan <xyzzy1959 at gmail.com> wrote:
>>> You seem to keep forgetting about shared access to the allocation
>>> bitmap, shared accesses to files, etc. Sure, any node can $QIO to
>>> any block #, but the file system uses locks to synchronize across
>>> cluster nodes. I searched the XQP listings for calls to $ENQ and
>>> found at least 50 of them.
>>
>> Yes, but those are all filesystem driver issues. I don't see what is
>> required in the on-disk structure itself to support VMS style clustering.
>>
>> As far as I can see, if VMS had a plug-in filesystem architecture,
>> then there is no conceptual reason why you couldn't have a FAT32
>> filesystem (for example) on VMS that was fully cluster aware and which
>> would continue to be fully operational after a cluster state transition.
>>
>> All it would need is for rules to be published in the filesystem
>> driver writing documentation, describing how to make your filesystem
>> driver cluster aware.
>
> Sure. It's a Simple Matter Of Programming. A SMOP.
Isn't everything?
> For shared access, OpenVMS Clustering adds file system requirements
> around the use of heavier and slower locking, around metadata storage
> and retrieval, and around additional coordination of the I/O and
> metadata caching.
When doing more, usually it takes longer.
> File systems and apps operating locally can use faster locking
> primitives, whereas file systems and apps operating in an OpenVMS
> Cluster configuration require retrofitting what are inherently heavier
> and slower cluster-aware Distributed Lock Manager (DLM) calls. (This
> assumes the reference design for the file system is not using a
> lock-free implementation, obviously.) In OpenVMS terms, this means
> replacing some or possibly all of the existing host-local bitlock
> operations or lock-free locks with DLM calls.
Yeah, that's reasonable.
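To make the retrofit concrete, here is a rough sketch of guarding an
allocation-bitmap update with a cluster-wide lock. The dlm_enq()/dlm_deq()
wrapper and the resource name are made up for the example (stand-ins for
SYS$ENQW/SYS$DEQ), and the "lock" is faked with a process-local mutex so
the code compiles anywhere; none of it is taken from the XQP or any VSI
source.

    /* Sketch only: dlm_enq()/dlm_deq() are hypothetical stand-ins for
     * cluster-wide SYS$ENQW/SYS$DEQ calls, faked with a process-local
     * mutex so the example builds and runs anywhere. */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>

    static pthread_mutex_t fake_dlm = PTHREAD_MUTEX_INITIALIZER;

    /* Acquire/release an exclusive lock on a named resource.  In a cluster,
     * the resource name would be visible to every node through the DLM. */
    static void dlm_enq(const char *resource)
    {
        (void)resource;
        pthread_mutex_lock(&fake_dlm);
    }

    static void dlm_deq(const char *resource)
    {
        (void)resource;
        pthread_mutex_unlock(&fake_dlm);
    }

    /* One bit per block: set = allocated. */
    static uint8_t alloc_bitmap[1024];

    static int allocate_block(void)
    {
        dlm_enq("MYVOL$ALLOC_BITMAP");      /* was: a host-local bitlock */
        for (size_t i = 0; i < sizeof alloc_bitmap * 8; i++) {
            if (!(alloc_bitmap[i / 8] & (1u << (i % 8)))) {
                alloc_bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
                dlm_deq("MYVOL$ALLOC_BITMAP");
                return (int)i;              /* block number just claimed */
            }
        }
        dlm_deq("MYVOL$ALLOC_BITMAP");
        return -1;                          /* volume full */
    }

    int main(void)
    {
        printf("allocated block %d\n", allocate_block());
        printf("allocated block %d\n", allocate_block());
        return 0;
    }

In a real cluster the lock would typically also carry a sequence number in
its value block, so the other nodes can tell when their cached copy of the
bitmap has gone stale.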
> The Record Management System (RMS) metadata details have to be stored
> somewhere. Or reconstituted somehow. Metadata such as that associated
> with sequential, relative and indexed files will have to be persisted in
> some file system data structure.
It's always been my impression that the RMS metadata is stored in the
data records and directory entries. Such as the two bytes (I seem to
recall) at the front of each relative file record. Not any part of the
filesystem.
> This gets interesting when adopting
> some file systems such as the Microsoft FAT or ExFAT file systems,
> ISO-9660:2013, ISO-9660:2017, UDF, or other more portable or more
> transportable file systems. Those file systems would have to ignore or
> support the RMS metadata, or RMS and RMS-using apps would have to be
> modified to remove or rework the related system- and app-level metadata
> requirements.
I really don't understand this concern. If directory entries contain
some RMS metadata, and I think they do, that can just be ignored by
things that do not need the data.
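For illustration, the per-file record metadata at issue is only a handful of
bytes. The struct below is hypothetical (not the ODS-2 attribute layout or
any VSI definition); it just shows the sort of thing a plug-in FAT32, exFAT,
or UDF driver would need to stash in an extended attribute, a sidecar file,
or a reserved area, or that RMS and its callers would have to live without.

    /* Hypothetical sketch -- not the ODS-2 file-header layout or any VSI
     * structure.  Per-file RMS record attributes are small, but they have
     * to live somewhere if a foreign on-disk structure is to carry them. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    enum rms_org { ORG_SEQ, ORG_REL, ORG_IDX };     /* file organization */
    enum rms_rfm { RFM_VAR, RFM_FIX, RFM_STMLF };   /* record format     */

    struct rms_attrs {
        uint8_t  org;           /* sequential / relative / indexed      */
        uint8_t  rfm;           /* variable / fixed / stream-LF records */
        uint16_t max_rec_size;  /* maximum record size, in bytes        */
        uint32_t eof_block;     /* logical end-of-file position         */
    };

    /* Pack the attributes into a small blob that a foreign driver could
     * tuck into an extended attribute or a sidecar metadata file. */
    static size_t pack_attrs(const struct rms_attrs *a, uint8_t buf[8])
    {
        memcpy(buf, a, sizeof *a);      /* eight bytes of fixed-width fields */
        return sizeof *a;
    }

    int main(void)
    {
        struct rms_attrs rel = { ORG_REL, RFM_FIX, 512, 100 };
        uint8_t blob[8];
        printf("packed %zu bytes of record metadata\n", pack_attrs(&rel, blob));
        return 0;
    }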
> Maybe BACKUP would then get fixed here too, assuming
> BACKUP would not be replaced by an improved and faster design? For
> related details here, see the user- and API-visible portions of the
> OpenVMS work underlying the undefined file attributes support added for
> the ODS-3 and ODS-4 ISO-9660:1988 file system support.
>
> Making metadata dependencies optional in the higher-level software and
> in the apps would be very useful, particularly when adopting a plug-in
> scheme for file system support. This applies to kernel support, as well
> as to support for a FUSE-like file system in user space. Basically, this
> means that some volumes might be "RMS immune" or "RMS incompatible", in
> practical terms. You get a stream of bytes or a stream of sectors, both
> being sequential organizations; or you can use your own record
> structures on that file system; or you can use an integrated or add-on
> database package for your storage.
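As an aside, carrying your own record structure on top of a plain byte-stream
volume is not much code when the records are fixed length. A sketch using
nothing but portable stdio (the file name and record size are arbitrary
choices for the example):

    /* Sketch: app-managed fixed-length records over a plain byte stream,
     * the sort of thing an "RMS immune" volume leaves you to do yourself. */
    #include <stdio.h>
    #include <string.h>

    #define REC_SIZE 64     /* fixed record length chosen by the application */

    static int put_record(FILE *f, long recno, const char *text)
    {
        char rec[REC_SIZE] = {0};
        snprintf(rec, sizeof rec, "%s", text);
        if (fseek(f, recno * REC_SIZE, SEEK_SET) != 0) return -1;
        return fwrite(rec, REC_SIZE, 1, f) == 1 ? 0 : -1;
    }

    static int get_record(FILE *f, long recno, char rec[REC_SIZE])
    {
        if (fseek(f, recno * REC_SIZE, SEEK_SET) != 0) return -1;
        return fread(rec, REC_SIZE, 1, f) == 1 ? 0 : -1;
    }

    int main(void)
    {
        char rec[REC_SIZE];
        FILE *f = fopen("records.dat", "w+b");
        if (!f) return 1;
        put_record(f, 3, "record three");   /* relative-file style access */
        put_record(f, 0, "record zero");
        if (get_record(f, 3, rec) == 0)
            printf("record 3: %s\n", rec);
        fclose(f);
        return 0;
    }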
>
> A host reading and writing the file system and its metadata must also
> cause other hosts sharing that storage to manage the local and
> now-incoherent cache contents appropriately: cache reloading, cache
> eviction, and/or wholesale cache flushing; there must be cache
> coordination. While SSD storage is much faster than Hard Disk Drive
> (HDD) storage, main memory is yet faster still, which means there'll
> still be interest in I/O caching in main memory. (A design difference
> in the existing OpenVMS caching underlies the Shared Stream I/O (SSIO)
> corruptions, too.
> http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf )
Way back in the day, when we were designing the DAS database product, we
looked at caching. When considering it on a single system, it wasn't so
bad. Then we looked at doing so in a VMS cluster, and after all the
screaming from fright and such was over, we declared "we ain't going there!"
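The usual trick for keeping such caches honest, and roughly what lock value
blocks end up being used for, is to pair the cache with a shared sequence
number: bump it on every write, and have each host compare its cached copy's
sequence against the shared one before trusting it. A toy single-process
simulation of two "hosts" sharing one block follows; it is illustrative only,
with no real clustering or I/O in it.

    /* Toy simulation: two "hosts" in one process share a block and a version
     * counter.  In a real cluster the counter would live in something like a
     * DLM lock value block, and the block itself on shared storage. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    static char     shared_block[64] = "old contents"; /* stands in for the disk */
    static uint64_t shared_version   = 1;              /* bumped on every write  */

    struct host_cache {
        char     data[64];
        uint64_t version;   /* version of shared_block when this copy was made */
    };

    static void cache_read(struct host_cache *c, const char *who)
    {
        if (c->version != shared_version) {            /* our copy is stale */
            memcpy(c->data, shared_block, sizeof c->data);
            c->version = shared_version;
            printf("%s: cache miss, reloaded (v%llu)\n", who,
                   (unsigned long long)c->version);
        } else {
            printf("%s: cache hit (v%llu)\n", who,
                   (unsigned long long)c->version);
        }
    }

    static void cache_write(struct host_cache *c, const char *who,
                            const char *text)
    {
        snprintf(shared_block, sizeof shared_block, "%s", text);
        shared_version++;                              /* everyone else is stale */
        memcpy(c->data, shared_block, sizeof c->data);
        c->version = shared_version;
        printf("%s: wrote \"%s\" (now v%llu)\n", who, text,
               (unsigned long long)shared_version);
    }

    int main(void)
    {
        struct host_cache a = {{0}, 0}, b = {{0}, 0};
        cache_read(&a, "host A");      /* miss: first read                    */
        cache_read(&b, "host B");      /* miss: first read                    */
        cache_write(&a, "host A", "new contents");
        cache_read(&b, "host B");      /* miss: A's write made B's copy stale */
        cache_read(&a, "host A");      /* hit                                 */
        return 0;
    }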
> Host-based Volume Shadowing (HBVS, OpenVMS software RAID-1) is also in
> play here, as HBVS has knowledge of and write access to specific parts
> of the ODS-2 and ODS-5 file system structures, as well as coordinating
> the copy and merge fences when those are active. All of this will
> require either integration with the existing file system, or reworking
> existing features, or replacing HBVS with ZFS or other features of the
> particular new file system. Or there is the expedient of disallowing
> these features with specific new file systems. HBVS sort of broke the
> classic I/O layering model.
>
> IIRC, the StorageWorks RAID (striping, RAID-0) product did not involve
> ODS-2 and ODS-5 metadata access, but that would have to be investigated
> and mitigated if it did. IIRC, StorageWorks RAID didn't break the
> classic I/O layering model, unlike ZFS.
>
> Around ZFS, there can either be changes within ZFS or a layer added atop
> ZFS. For example:
> https://github.com/ewwhite/zfs-ha/wiki
> https://blogs.oracle.com/solaris/cluster-file-system-with-zfs-introduction-and-configuration
>
>
> Some of the more general high-availability work that's been happening:
> https://www.clusterlabs.org/
>
> And as for app changes around adopting VAFS, VSI was claiming that most
> of the underlying VAFS changes would be transparent to apps. For apps to
> fully adopt VAFS capabilities, required changes will involve promoting
> longword sector address and file sector size values to quadwords in
> $qio, $io_perform and the RMS services data structures, and in adopting
> whatever replaces the XQP File ID (FID) or other bits of the file system
> that have been presented. This is pretty much any app that's directly
> accessing RMS data structures or performing virtual or logical $qio or
> $io_perform operations, and particularly when apps performing logical
> I/O are running with files residing on an HDD or SSD past two gibibytes
> in total capacity, and/or when apps using logical or virtual I/O are
> addressing files larger than two gibibytes. What other app changes may
> or will be required when adopting VAFS, we will learn as the VAFS
> release approaches.
Well, now you've really messed up my day, and perhaps week, month, year,
and life. I've got a database product that knows about longwords. It
doesn't know about quadwords. The implications are staggering.
Of course, you'll just say "SMOP" ...
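For what it's worth, the shape of the change being described looks roughly
like this from an app's point of view: a block-number or size field grows
from a longword to a quadword, and anything still holding it in 32 bits
silently wraps. The structures below are hypothetical, not the actual $qio,
$io_perform, or RMS definitions.

    /* Illustrative only: hypothetical request structures, not the real
     * $qio/$io_perform parameters or RMS data structures.  The point is the
     * longword -> quadword promotion, and what a 32-bit field silently does
     * to a block number that no longer fits. */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    struct io_request_v1 {      /* "longword era": 32-bit virtual block number */
        uint32_t vbn;
        uint32_t byte_count;
    };

    struct io_request_v2 {      /* "quadword era": 64-bit block number and size */
        uint64_t vbn;
        uint64_t byte_count;
    };

    int main(void)
    {
        uint64_t big_vbn = 5ULL * 1000 * 1000 * 1000;  /* a block number > 2^32 */

        struct io_request_v1 old_req = { (uint32_t)big_vbn, 512 };
        struct io_request_v2 new_req = { big_vbn, 512 };

        printf("wanted VBN         %" PRIu64 "\n", big_vbn);
        printf("32-bit field holds %" PRIu32 " (silently wrapped)\n", old_req.vbn);
        printf("64-bit field holds %" PRIu64 "\n", new_req.vbn);
        return 0;
    }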
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: davef at tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486