[Info-vax] Re: Spiralog, RMS Journaling (was Re: FREESPADRIFT)

Fri Jun 24 14:27:28 EDT 2016

In article <nkjrcr$4ra$1 at dont-email.me>, Stephen Hoffman <seaohveh at hoffmanlabs.invalid> writes:
>On 2016-06-24 16:54:35 +0000,   VAXman-  @SendSpamHere.ORG said:
>
>> In article <yE7BjmIxxkSS at eisner.encompasserve.org>, 
>> koehler at eisner.nospam.decuserve.org (Bob Koehler) writes:
>>> In article <nkhf9i$7s3$1 at Iltempo.Update.UU.SE>, Johnny Billquist 
>>> <bqt at softjar.se> writes:
>>>> 
>>>> Uh? Say what? Everything in TCP/IP is just a stream of bytes. There are 
>>>> no blocks, nothing is sent in any multiple of blocks. (And besides, 
>>>> text files in Unix do not have CR and LF in them. They just have LF. 
>>>> Which is why I was complaining about Unix ftp implementations, which 
>>>> often lie about file size, and sometimes cheat when transferring in 
>>>> text mode. These protocols were not designed by Unix people...)
>>> 
>>> So how does UNIX solve it?  By lying about it?  Does that work anyhow? 
>>> If so, then why can't VMS lie about it?  Or do both UNIX and VMS have 
>>> to read the file twice to get it right?
>> 
>> Maybe you'll get an answer but I'd suggest you don't hold your breath.
>
>Unix returns the file size in bytes.
>
>As for TCP presenting a stream and not datagrams, more than a few 
>neophyte developers have been derailed by that detail.  There's no 
>one-to-one mapping of write I/O size to read I/O size with TCP.  One 
>TCP write can produce one read, or potentially as many single byte read 
>I/O requests as bytes were written.
>
>For file transfers, the app developer chooses how much data to toss 
>over the connection.  That might be records from a file or records 
>synthesized by the network server for a network protocol, or whatever 
>hunk of data the developer thought was appropriate.  Particularly with 
>64-bit addressing and a flat address space, it wouldn't surprise me to 
>see a few just read and write the whole file.
>
>For those on systems that don't have to use socket I/O, they'll call 
>the file transfer framework or whatever the local analog; libssh 
>underneath ssh has a callable interface.  Though AFAICT, there's no 
>libssh available with the HPE ssh bits.  There is a libcurl port around 
>for OpenVMS.  OpenVMS itself never sprouted a local callable copy akin 
>to macOS and copyfile(3) -- beyond callable convert which can usually 
>get you there, and probably callable backup, or probably the FTSV/FTSO 
>spool layered product bits for those that have access to that -- though 
>there was some work on providing that.
>
>Having a simple call that gets you the user file size would be handy, 
>at least for stream files and analogous.   Getting the user data size 
>of a NoSQL, or metadata-enriched RMS file formats, or a relational 
>database file, is rather less useful, so that'd best be the size of the 
>whole wad that needs to be transferred.  Whether that's blocks or not 
>matters little.
>
>But then this whole block size stuff will get even more interesting 
>if/when VSI adds support for native access to the two and four kibibyte 
>sector sizes that are now available.  EFI sees those differences as do 
>a few other giblets, but most users haven't had to deal with that yet.

You don't need to explain that to me.

I've been trying to get Johnny to realize that there are numerous ways to
represent records in a VMS file.  In *ix files (text files) there's a <LF>
at the end of a string of bytes.  VMS can support that and, if he were to
use that for his files, he'd be able to get the sought after byte size.  I
don't see why there's such an inability to comprehend it.  Variable-length
files have a 2-byte length count prefixing each record -- akin to having a
total file byte count for his purposes.  However, that length is figured
into the total file byte count and that's NOT appropriate for a protocol
that's sending <byte><byte><byte><byte>...<byte><byte><LF>.  Selecting a
file format that reflects the data will get him his byte count assuming a
<byte><byte><byte><byte>...<byte><byte><LF> transfer protocol.  Ignoring
that will only cause him to acquire the proper and true file size, but it
will be an incorrect size for his protocol transfers.
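
As a rough illustration of the difference described above, the sketch below
compares the bytes a set of records would occupy with a 2-byte length prefix
per record against the bytes sent as <data><LF> per record.  The function
names are hypothetical, and the model is simplified: it assumes each stored
record is padded to an even length and ignores block-spanning rules and any
other on-disk details.

```python
def var_file_bytes(records):
    """Bytes used when each record carries a 2-byte length prefix.

    Assumption: a record with an odd data length gets one pad byte so the
    next length prefix starts on an even offset.
    """
    total = 0
    for rec in records:
        total += 2 + len(rec) + (len(rec) % 2)
    return total


def stream_lf_bytes(records):
    """Bytes sent when each record goes out as its data followed by one LF."""
    return sum(len(rec) + 1 for rec in records)


if __name__ == "__main__":
    sample = [b"hello", b"", b"VMS"]
    print("prefixed-record size:", var_file_bytes(sample))
    print("stream-LF size:      ", stream_lf_bytes(sample))
```

The two totals differ for the same data, which is why a size taken from one
representation is wrong for a transfer done in the other.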

I haven't looked at the code for c$stat() on VMS, which can return a file
byte count, to see if there's logic in that code to return a file size
biased toward a *ix stream-LF file.  I'd wager it's the
(<end_of_file_block> - 1) * 512 + <end_of_file_byte> computation I've been
discussing, because anything more would be just too much to handle if the
file were binary.  c$stat() shouldn't be making assumptions about the file
content.
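
The end-of-file computation mentioned above can be written out as a small
sketch.  The parameter names simply mirror the placeholders used in the text
and are not real API names; the sketch assumes 512-byte blocks and a 1-based
end-of-file block number, and says nothing about what c$stat() actually does.

```python
BLOCK_SIZE = 512  # bytes per disk block, as assumed in the formula above


def eof_byte_size(end_of_file_block, end_of_file_byte):
    """Return (end_of_file_block - 1) * 512 + end_of_file_byte.

    end_of_file_block: 1-based number of the block holding end-of-file.
    end_of_file_byte:  offset of the first unused byte within that block.
    """
    if end_of_file_block < 1:
        raise ValueError("end_of_file_block is 1-based")
    if not 0 <= end_of_file_byte <= BLOCK_SIZE:
        raise ValueError("end_of_file_byte must be within one block")
    return (end_of_file_block - 1) * BLOCK_SIZE + end_of_file_byte


if __name__ == "__main__":
    # Two full blocks plus 100 bytes used in the third block.
    print(eof_byte_size(3, 100))
```

This counts every byte up to end-of-file, including any per-record overhead,
without looking at the file's content.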

Anyway, this whole thread has gone on all too long.  Sometimes, no matter
how bright the light, the blind still refuse to see it.

-- 
VAXman- A Bored Certified VMS Kernel Mode Hacker    VAXman(at)TMESIS(dot)ORG

I speak to machines with the voice of humanity.