[Info-vax] New filesystem mentioned
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Tue May 14 11:19:21 EDT 2019
On 2019-05-14 12:21:12 +0000, Simon Clubley said:
> On 2019-05-14, John Reagan <xyzzy1959 at gmail.com> wrote:
>> You seem to keep forgetting about shared access to the allocation
>> bitmap, shared accesses to files, etc. Sure, any node can $QIO to any
>> block #, but the file system uses locks to synchronize across cluster
>> nodes. I searched the XQP listings for calls to $ENQ and found at
>> least 50 of them.
>
> Yes, but those are all filesystem driver issues. I don't see what is
> required in the on-disk structure itself to support VMS style
> clustering.
>
> As far as I can see, if VMS had a plug-in filesystem architecture, then
> there is no conceptual reason why you couldn't have a FAT32 filesystem
> (for example) on VMS that was fully cluster aware and which would
> continue to be fully operational after a cluster state transition.
>
> All it would need is for rules to be published in the filesystem driver
> writing documentation about how to make your filesystem driver cluster
> aware.
Sure. It's a Simple Matter Of Programming. A SMOP.
For shared access, OpenVMS Clustering adds file system requirements
around the use of heavier and slower locking, around metadata storage
and retrieval, and around additional coordination of the I/O and
metadata caching.
File systems and apps operating locally can use faster locking
primitives, while file systems and apps operating in an OpenVMS Cluster
configuration require retrofitting the inherently heavier, inherently
slower, cluster-aware Distributed Lock Manager (DLM) calls. (This
assumes the reference design for the file system is not using a
lock-free implementation, obviously.) In OpenVMS terms, this means
replacing some or possibly all of the existing host-local bitlock
operations or lock-free locks with DLM calls.
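To make the cost difference concrete, here is a toy sketch in Python, not OpenVMS code: a host-local lock is a memory interlock with zero cluster traffic, while a DLM-style lock on a named resource pays a message exchange with the node mastering that resource on every acquire and release. The resource name, the message counting, and the LockMaster class are all illustrative inventions, not the DLM protocol.

```python
import threading

class LocalLock:
    """Host-local lock: a memory interlock, no cluster messaging."""
    def __init__(self):
        self._lock = threading.Lock()
        self.messages_sent = 0          # never any cluster traffic
    def __enter__(self):
        self._lock.acquire()
        return self
    def __exit__(self, *exc):
        self._lock.release()

class LockMaster:
    """Simulated resource master node: serializes grants per resource."""
    def __init__(self):
        self._granted = set()
        self._cv = threading.Condition()
    def grant(self, resource):
        with self._cv:
            while resource in self._granted:
                self._cv.wait()
            self._granted.add(resource)
    def release(self, resource):
        with self._cv:
            self._granted.discard(resource)
            self._cv.notify_all()

class SimulatedDLMLock:
    """Toy stand-in for a DLM lock on a named resource. Every
    acquire and release is modeled as a message exchange with the
    mastering node, which is why cluster-aware locking is
    inherently heavier than a local bitlock."""
    def __init__(self, resource_name, master):
        self.resource = resource_name
        self.master = master
        self.messages_sent = 0
    def __enter__(self):
        self.messages_sent += 1         # $ENQ-style request to master
        self.master.grant(self.resource)
        return self
    def __exit__(self, *exc):
        self.messages_sent += 1         # $DEQ-style release to master
        self.master.release(self.resource)
```

A single protected critical section through the simulated DLM lock costs two messages; the local lock costs none, which is the retrofit penalty in miniature.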
The Record Management System (RMS) metadata details have to be stored
somewhere. Or reconstituted somehow. Metadata such as that associated
with sequential, relative and indexed files will have to be persisted
in some file system data structure. This gets interesting when
adopting some file systems such as the Microsoft FAT or ExFAT file
systems, ISO-9660:2013, ISO-9660:2017, UDF, or other more portable or
more transportable file systems. The other systems would have to
either ignore or support the RMS metadata, or RMS and RMS-using apps
would have to be modified to remove or rework the related system- and
app-level metadata requirements. Maybe BACKUP would then get fixed
here too, assuming BACKUP would not be replaced by an improved and
faster design? For related details here, see the user- and API-visible
portions of the OpenVMS work underlying the undefined file attributes
support added for the ODS-3 and ODS-4 ISO-9660:1988 file system support.
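One conceivable (and entirely hypothetical) way to persist RMS-style attributes on a file system with no native slot for them is a sidecar alongside each data file; a volume whose files lack the sidecar falls back to stream-of-bytes defaults. The `.rms` suffix, the attribute names, and the JSON encoding below are my inventions for illustration, not anything VSI has proposed.

```python
import json
import pathlib

def sidecar(path):
    """Path of the hypothetical '.rms' sidecar for a data file."""
    return pathlib.Path(str(path) + ".rms")

def write_with_sidecar(path, data, attrs):
    """Store file data plus RMS-style metadata (organization, record
    format, maximum record size) in a sidecar, since FAT/ExFAT have
    no native home for these attributes."""
    pathlib.Path(path).write_bytes(data)
    sidecar(path).write_text(json.dumps(attrs))

def read_attrs(path):
    """Recover the metadata; fall back to stream-of-bytes defaults
    when the sidecar is missing (an 'RMS immune' file)."""
    side = sidecar(path)
    if side.exists():
        return json.loads(side.read_text())
    return {"org": "SEQ", "rfm": "STMLF", "mrs": 0}
```

The fallback path is the interesting part: it is where "the other systems would have to ignore or support the RMS metadata" becomes a concrete design decision.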
Making metadata dependencies optional in the higher-level software and
in the apps would be very useful, particularly when adopting a plug-in
scheme for file system support; that applies to kernel support, as
well as to support for a FUSE-like file system in user space.
Basically, this means that some volumes might be "RMS immune" or "RMS
incompatible", in practical terms. You get a stream of bytes or a
stream of sectors, both of which are sequential organizations, or you
get to use your own record structures on this file system, or you can
use an integrated or add-on database package for your storage.
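"Use your own record structures" on such a volume means the app itself carves records out of the sequential bytes. A minimal sketch, assuming fixed-length records (the helper name and the short-fragment policy are mine):

```python
import io

def records(stream, record_size):
    """Carve fixed-length records out of a plain byte stream, the
    structure an app would impose itself on an 'RMS immune' volume
    that offers only sequential bytes."""
    while True:
        chunk = stream.read(record_size)
        if len(chunk) < record_size:
            break               # ignore a short trailing fragment
        yield chunk

# Three full 4-byte records, plus a 2-byte fragment that is dropped.
stream = io.BytesIO(b"AAAABBBBCCCCDD")
recs = list(records(stream, 4))
```

Variable-length records would additionally need app-defined length prefixes or delimiters, which is exactly the metadata RMS normally carries for you.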
A host reading and writing the file system and its metadata must also
cause other hosts sharing that storage to manage their local,
now-incoherent cache contents appropriately: cache reloading, cache
eviction, and/or wholesale cache flushing. There must be cache
coordination. While SSD storage is much faster than Hard Disk Drive
(HDD) storage, main memory is faster still, which means there'll
still be interest in I/O caching in main memory. (A design difference
in the existing OpenVMS caching underlies the Shared Stream I/O (SSIO)
corruptions, too:
http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf )
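The coordination requirement reduces to something like write-invalidate coherence: before a write completes, every other host's stale cached copy of that block must be evicted. A toy single-process sketch (the class names and the dict-as-disk model are illustrative only, not the OpenVMS cache design):

```python
class CachingHost:
    """Toy write-invalidate coherence: a write on one host evicts
    the stale copy from every other host's block cache before the
    write is considered complete. Purely illustrative."""
    def __init__(self, name, cluster):
        self.name = name
        self.cache = {}          # block number (LBN) -> cached data
        self.cluster = cluster   # list of all hosts sharing the disk
        cluster.append(self)
    def read(self, disk, lbn):
        if lbn not in self.cache:        # cache miss: go to storage
            self.cache[lbn] = disk.get(lbn, b"")
        return self.cache[lbn]
    def write(self, disk, lbn, data):
        for host in self.cluster:        # invalidate peer caches first
            if host is not self:
                host.cache.pop(lbn, None)
        disk[lbn] = data
        self.cache[lbn] = data
```

In a real cluster the invalidation step is itself messaging (or DLM lock-value-block traffic), which is part of why shared write caching is so much harder than local caching.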
Host-based Volume Shadowing (HBVS, OpenVMS software RAID-1) is also in
play here, as HBVS has knowledge of and write access to specific parts
of the ODS-2 and ODS-5 file system structures, as well as coordinating
the copy and merge fences when those are active. All of this will
either require integration with the existing file system, or reworking
existing features, or replacing HBVS with ZFS or other features of the
particular new file system. Or the expedient of disallowing these
features with specific new file systems. HBVS sort-of broke the
classic I/O layering model.
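For readers unfamiliar with HBVS, the shape of the problem: a RAID-1 shadow set fans every write out to all members, and a member joining the set needs a catch-up copy, with a fence marking how far that copy has progressed. This sketch is a deliberately naive model of those two behaviors, not the HBVS implementation; it omits the fences, the merge case, and the metadata access the post describes.

```python
class ShadowSet:
    """Toy host-based RAID-1 (shadow set): writes fan out to every
    member; a read can be satisfied by any member. Illustrative only."""
    def __init__(self, members):
        self.members = members            # each member: dict lbn -> data

    def write(self, lbn, data):
        for member in self.members:       # every member gets the write
            member[lbn] = data

    def read(self, lbn):
        return self.members[0].get(lbn)   # any up-to-date member works

    def add_member(self, new_member):
        """Join a new member after a full catch-up copy; real HBVS
        tracks a copy fence so writes behind it go to all members."""
        for lbn, data in self.members[0].items():
            new_member[lbn] = data
        self.members.append(new_member)
```

What the model cannot show is precisely Hoffman's point: real HBVS reaches into ODS-2/ODS-5 structures rather than staying a pure block-layer shim, which is how it "sort-of broke the classic I/O layering model."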
IIRC, the StorageWorks RAID (striping, RAID-0) product did not involve
ODS-2 and ODS-5 metadata access, but that would have to be investigated
and mitigated if it did. IIRC, StorageWorks RAID didn't break the
classic I/O layering model, unlike ZFS.
Around ZFS, there can either be changes within ZFS or a layer added
atop ZFS. For example:
https://github.com/ewwhite/zfs-ha/wiki
https://blogs.oracle.com/solaris/cluster-file-system-with-zfs-introduction-and-configuration
Some of the more general high-availability work that's been happening:
https://www.clusterlabs.org/
And as for app changes around adopting VAFS, VSI was claiming that most
of the underlying VAFS changes would be transparent to apps. For apps
to fully adopt VAFS capabilities, required changes will involve
promoting longword sector address and file sector size values to
quadwords in $qio, $io_perform and the RMS services data structures,
and adopting whatever replaces the XQP File ID (FID) or other bits
of the file system that have been presented. That covers pretty much
any app that's directly accessing RMS data structures or performing
virtual or logical $qio or $io_perform operations, and particularly
apps performing logical I/O against an HDD or SSD past two gibibytes
in total capacity, and/or apps using logical or virtual I/O to address
files larger than two gibibytes. What other app changes may or will be
required when adopting VAFS, we will learn as the VAFS release
approaches.
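Assuming the two-gibibyte figures trace back to signed longword byte offsets and sizes (my reading, not stated in the post), the arithmetic behind the promotion is simple: one byte past 2 GiB - 1 no longer fits a signed 32-bit field, while a signed quadword removes the ceiling for any foreseeable device.

```python
LONGWORD_MAX = 2**31 - 1    # signed 32-bit ceiling: 2 GiB - 1 bytes
QUADWORD_MAX = 2**63 - 1    # signed 64-bit ceiling
TWO_GIB = 2 * 1024**3

def fits(value, ceiling):
    """Can this byte offset or size be carried in the given field?"""
    return value <= ceiling

# The classic two-gibibyte wall: an offset of exactly 2 GiB overflows
# a signed longword, hence promoting the fields to quadwords.
```

Apps that stored such values in their own longword variables hit the same wall, which is why the change is not fully transparent even with updated system services.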
--
Pure Personal Opinion | HoffmanLabs LLC