[Info-vax] OpenVMS async I/O, fast vs. slow

Jake Hamby (Solid State Jake) jake.hamby at gmail.com
Tue Nov 7 21:52:01 EST 2023


On Tuesday, November 7, 2023 at 6:17:28 PM UTC-8, Craig A. Berry wrote:
> On 11/7/23 6:04 PM, Arne Vajhøj wrote: 
> > On 11/7/2023 5:44 PM, Craig A. Berry wrote: 
> >> On 11/6/23 6:18 PM, Arne Vajhøj wrote: 
> >>> On 11/6/2023 6:31 AM, Johnny Billquist wrote: 
> >> 
> >>>> Read ahead is something that the system can easily do both for 
> >>>> normal I/O and memory mapped I/O. It's just a question of 
> >>>> speculative reads assuming some pattern by the program. Most 
> >>>> commonly that you are reading files sequentially from start to finish. 
> >>> 
> >>> $QIO(w) and $IO_PERFORM(W) could. 
> >>> 
> >>> But at least for $QIO(W) then I would be very surprised if it did. It is 
> >>> from before VIOC/XFC so originally it did not have anywhere to 
> >>> store read ahead data. VMS engineering could have changed 
> >>> behavior when VIOC/XFC was introduced. But I doubt it. 
> >> 
> >> Are you saying that you think merely using $QIO bypasses XFC?  If so, 
> >> how do you think SHOW MEM/CACHE can give you "Total QIOs"?  And note 
> >> this from the performance management manual page 72: 
> >> "I/O service can be optimized at the application level by using RMS 
> >> global buffers to share caches among processes. This method offers the 
> >> benefit of satisfying an I/O request without issuing a QIO; whereas 
> >> system memory software cache (that is, XFC) is checked to satisfy QIOs." 
> > 
> > I am sure that $QIO(W) hits XFC. Else there would not be 
> > much point in XFC. 
> > 
> > The question is whether $QIO(W) get more data read into XFC 
> > than asked for by the user. It could but I doubt it.
> Consider what I quoted from the fine manual: "system memory software 
> cache (that is, XFC) is checked to satisfy QIOs." What would it mean to 
> read from cache to "satisfy" a QIO other than there may be more data 
> already in the cache than has been explicitly asked for by the user? 
> 
> If you're talking about snooping to predict what might be read next and 
> cache it, it looks like that's an attribute of the volume and of XFC 
> itself and doesn't have anything to do with the I/O API in use: 
> 
> https://wiki.vmssoftware.com/VCC_READAHEAD

This thread has been very educational. I wasn't thinking through the implications of a virtual block data cache that's integrated with the file system. The VSI wiki page for the V8.4-2L3 release does a better job of explaining the benefits of XFC than anything else I've found:

https://wiki.vmssoftware.com/V8.4-2L3

"The Extended File Cache (XFC) is a virtual block data cache provided with OpenVMS for Integrity servers. Similar to the Virtual I/O Cache, the XFC is a clusterwide, file system data cache. Both file system data caches are compatible and coexist in the OpenVMS Cluster. The XFC improves I/O performance with the following features that are not available with the virtual I/O cache:

Read-ahead caching
Automatic resizing of the cache
Larger maximum cache size
No limit on the number of closed files that can be cached
Control over the maximum size of I/O that can be cached
Control over whether cache memory is static or dynamic
XFC caching attributes of volume can be dynamically modified eliminating the need to dismount the volume."

Where it gets confusing is that there's definitely a per-file aspect to caching: you can set a file's or directory's caching attribute to write-through or no caching with "SET FILE /CACHING_ATTRIBUTE", and the attribute is inherited from the parent directory or from a previous version of an existing file. It looks like the "SET FILE" help page covers all of the interesting attributes that you can also set programmatically.
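
For example, from DCL (a minimal sketch; the keyword spellings are the ones I believe are documented under HELP SET FILE /CACHING_ATTRIBUTE):

$ SET FILE /CACHING_ATTRIBUTE=WRITETHROUGH LOGFILE.DAT
$ SET FILE /CACHING_ATTRIBUTE=NO_CACHING SCRATCH.DIR    ! new files in [.SCRATCH] inherit it
$ DIRECTORY /FULL LOGFILE.DAT                           ! full listing shows the caching attribute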

At this point, I don't think I'd trust anyone's explanation of when to use or avoid the RMS, $QIO, and $IO_PERFORM APIs without benchmarking them, and the same goes for optimal buffer sizes. The interesting thing about libuv as an abstraction layer is that you could write a really fast file-copy routine that copies files to sockets, other files, pipes, and so on, and every program using the library would benefit automatically. Windows and Linux both added transmit-file-to-network calls in the early days of Web servers, so libuv exposes a function for that operation (sketched below).
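
To make that concrete, here's roughly what such a routine looks like through the libuv API (a minimal synchronous sketch assuming POSIX-style descriptors; copy_fd and the error handling are mine, not libuv's):

/* Send the contents of an already-open input file to an output descriptor
 * (another file, a pipe, or a socket).  uv_fs_sendfile() uses
 * TransmitFile()/sendfile() where the platform has them and falls back to
 * a read/write loop elsewhere. */
#include <uv.h>

static int copy_fd(uv_file in_fd, uv_file out_fd)
{
    uv_loop_t *loop = uv_default_loop();
    uv_fs_t req;

    /* How much is there to send?  (NULL callback => synchronous call.) */
    if (uv_fs_fstat(loop, &req, in_fd, NULL) < 0) {
        uv_fs_req_cleanup(&req);
        return -1;
    }
    int64_t remaining = (int64_t)req.statbuf.st_size;
    uv_fs_req_cleanup(&req);

    int64_t offset = 0;
    while (remaining > 0) {
        int rc = uv_fs_sendfile(loop, &req, out_fd, in_fd,
                                offset, (size_t)remaining, NULL);
        ssize_t sent = req.result;      /* bytes actually transferred */
        uv_fs_req_cleanup(&req);
        if (rc < 0 || sent <= 0)
            return -1;
        offset    += sent;
        remaining -= sent;
    }
    return 0;
}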

One Windows feature that libuv exposes but doesn't implement for UNIX is the ability to open files with the "UV_FS_O_SEQUENTIAL" and "UV_FS_O_RANDOM" flags, which map to the Win32 "FILE_FLAG_SEQUENTIAL_SCAN" and "FILE_FLAG_RANDOM_ACCESS" flags. I've read about filesystems trying to detect different usage patterns and do intelligent read-ahead accordingly, but not about programs telling the OS up front how they intend to access a file.
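
In libuv terms the hint is just an extra open flag (a sketch; open_for_scan is an illustrative name, not a libuv call):

/* On Windows UV_FS_O_SEQUENTIAL becomes FILE_FLAG_SEQUENTIAL_SCAN (and
 * UV_FS_O_RANDOM becomes FILE_FLAG_RANDOM_ACCESS); on other platforms the
 * hint is currently ignored, so it is purely advisory. */
#include <uv.h>

uv_file open_for_scan(const char *path)
{
    uv_fs_t req;
    int fd = uv_fs_open(uv_default_loop(), &req, path,
                        UV_FS_O_RDONLY | UV_FS_O_SEQUENTIAL, 0, NULL);
    uv_fs_req_cleanup(&req);
    return fd;    /* negative value is a libuv error code */
}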

When those flags were added, hard drives were far slower than today's SSDs and PCs had far less RAM, with none to spare, so it must have been even more important for the OS to know when and when not to read ahead. Today, the bottleneck people are fighting is the overhead of virtualizing fake SCSI disks and fake network cards. Virtio helps somewhat, but if what IBM has been doing with KVM on POWER9/10 is any indication, the future lies in optimized direct paths from the host's PCIe cards into the guest OS. IBM's mainframes have been doing that for a while with their proprietary I/O stack, which still emulates the disk geometry of 1990-era IBM 3390 drives, but they have very fast networking and IPC between VMs.


