[Info-vax] Re: Spiralog, RMS Journaling (was Re: FREESPADRIFT)

Mon Jun 27 09:33:09 EDT 2016

In article <nkr6sn$put$1 at Iltempo.Update.UU.SE>, Johnny Billquist <bqt at softjar.se> writes:
>On 2016-06-24 20:27, VAXman- at SendSpamHere.ORG wrote:
>> In article <nkjrcr$4ra$1 at dont-email.me>, Stephen Hoffman <seaohveh at hoffmanlabs.invalid> writes:
>>> On 2016-06-24 16:54:35 +0000,   VAXman-  @SendSpamHere.ORG said:
>>>
>>>> In article <yE7BjmIxxkSS at eisner.encompasserve.org>,
>>>> koehler at eisner.nospam.decuserve.org (Bob Koehler) writes:
>>>>> In article <nkhf9i$7s3$1 at Iltempo.Update.UU.SE>, Johnny Billquist
>>>>> <bqt at softjar.se> writes:
>>>>>>
>>>>>> Uh? Say what? Everything in TCP/IP is just a stream of bytes. There are
>>>>>>  no blocks, nothing is sent in any multiple of blocks. (And besides,
>>>>>> text files in Unix do not have CR and LF in them. They just have LF.
>>>>>> Which is why I was complaining about Unix ftp implementations, which
>>>>>> often lie about file size, and sometimes cheat when transferring in
>>>>>> text mode. These protocols were not designed by  Unix people...)
>>>>>
>>>>> So how does UNIX solve it?  By lying about it?  Does that work anyhow?
>>>>>  If so, then why can't VMS lie about it?  Or do both UNIX and VMS have
>>>>> to read the file twice to get it right?
>>>>
>>>> Maybe you'll get an answer but I'd suggest you don't hold your breath.
>>>
>>> Unix returns the file size in bytes.
>>>
>>> As for TCP presenting a stream and not datagrams, more than a few
>>> neophyte developers have been derailed by that detail.  There's no
>>> one-to-one mapping of write I/O size to read I/O size with TCP.  One
>>> TCP write can produce one read, or potentially as many single byte read
>>> I/O requests as bytes were written.
>>>
>>> For file transfers, the app developer chooses how much data to toss
>>> over the connection.  That might be records from a file or records
>>> synthesized by the network server for a network protocol, or whatever
>>> hunk of data the developer thought was appropriate.  Particularly with
>>> 64-bit addressing and a flat address space, it wouldn't surprise me to
>>> see a few just read and write the whole file.
>>>
>>> For those on systems that don't have to use socket I/O, they'll call
>>> the file transfer framework or whatever the local analog; libssh
>>> underneath ssh has a callable interface.  Though AFAICT, there's no
>>> libssh available with the HPE ssh bits.  There is a libcurl port around
>>> for OpenVMS.  OpenVMS itself never sprouted a local callable copy akin
>>> to macOS and copyfile(3) -- beyond callable convert which can usually
>>> get you there, and probably callable backup, or probably the FTSV/FTSO
>>> spool layered product bits for those that have access to that -- though
>>> there was some work on providing that.
>>>
>>> Having a simple call that gets you the user file size would be handy,
>>> at least for stream files and analogous.   Getting the user data size
>>> of a NoSQL, or metadata-enriched RMS file formats, or a relational
>>> database file, is rather less useful, so that'd best be the size of the
>>> whole wad that needs to be transferred.  Whether that's blocks or not
>>> matters little.
>>>
>>> But then this whole block size stuff will get even more interesting
>>> if/when VSI adds support for native access to the two and four kibibyte
>>> sector sizes that are now available.  EFI sees those differences as do
>>> a few other giblets, but most users haven't had to deal with that yet.
>>
>> You don't need to explain that to me.
>>
>> I've been trying to get Johnny to realize that there are numerous ways to
>> represent records in a VMS file.
>
>And you don't have to explain that to me. I know way more of the 
>internals of these things than I should have to. While not specifically 
>RMS-32, I've modified RMS-11, ODS-1 and FCS enough to last me a 
>lifetime, and it's really no different than RMS-32.

... and I know way more of the RMS internals than two lifetimes -- so what?

>And just because there are numerous ways to store a file does not mean 
>that I should support only one of them.
>
>>  In *ix files (text files) there's a <LF>
>> at the end of a string of bytes.  VMS can support that and, if he were to
>> use that for his files, he'd be able to get the sought after byte size.  I
>> don't see why there's such an inability to comprehend it.
>
>What I can't comprehend is your inability to understand the problem, or 
>that having one more piece of metadata actually could help for a rather 
>common case. We've been running this thread way longer than I ever 
>thought was needed.
>
>Who cares that VMS can store files in a compatible way with Unix. That 
>is not the answer. You still have various files, in various formats on 
>VMS. How it looks under Unix has no bearing.

But it does because that's the size of the data you want to present to it!

>>  Variable length
>> files have a 2-byte length count prefixing each record -- akin to having a
>> total file byte count for his purposes.  However, that length is figured
>> into the total file byte count and that's NOT appropriate for a protocol
>> that's sending <byte><byte><byte><byte>...<byte><byte><LF>.  Selecting a
>> file format that reflects the data will get him his byte count assuming a
>> <byte><byte><byte><byte>...<byte><byte><LF> transfer protocol.  Ignoring
>> that will only cause him to acquire the proper and true file size, but it
>> will be an incorrect size for his protocol transfers.
>
>You are making the assumption that the files will be created 
>specifically for the need and purpose, which is a broken assumption. A 
>web server is expected to serve content that already exists. And if that 
>is created by a text editor (not uncommon), it will normally be in the 
>standard format for a text file, which on VMS would be variable sized 
>sequential records with implied CRLF.

EDT and TPU will edit that file just fine if it's RFM=STM or RFM=STMLF. ;)

So, you have a file that was created with a VMS editor.  Let's, for argument
here, say that's VFC.  You want to send that file in HTTP with a byte count
that properly accounts for <record's data>+<LF> for each record; yet, you do
NOT know how each record has been stored.  That's impossible!  If you can do
that, please let me know how as I might like to apply that logic to winning
the Powerball.  Short of reading the file, in the case of a VFC, to know the
size of EACH record, there's no way to accurately define the file size in a
format that you expect for it to be transferred.  You can't be that obtuse!
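
A minimal sketch of that point, under loudly simplified assumptions: each
record is modelled as a 2-byte little-endian length word, the data, and a pad
byte when the length is odd.  Real RMS details (the VFC fixed control area,
block spanning, end-of-file handling) are deliberately ignored, and the helper
names build_variable_image and transfer_size are hypothetical.  It only shows
that the <record's data>+<LF> total comes from visiting every length prefix.

```python
import struct

def build_variable_image(records):
    """Build a simplified image: 2-byte LE count, data, pad to an even length."""
    out = bytearray()
    for rec in records:
        out += struct.pack("<H", len(rec))
        out += rec
        if len(rec) % 2:
            out += b"\x00"  # pad byte so the next count is word-aligned
    return bytes(out)

def transfer_size(image, terminator=b"\n"):
    """Walk every length prefix; sum data bytes plus one terminator per record."""
    total = 0
    pos = 0
    while pos + 2 <= len(image):
        (length,) = struct.unpack_from("<H", image, pos)
        pos += 2 + length + (length % 2)  # skip count, data, and optional pad
        total += length + len(terminator)
    return total

records = [b"hello", b"", b"VMS"]
image = build_variable_image(records)
stored = len(image)          # bytes as stored, including counts and padding
sent = transfer_size(image)  # bytes as sent with an LF after each record
print(stored, sent)
```

The stored size and the sent size differ, and the sent size also changes if
the terminator is <CR><LF> rather than <LF>.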

>This is the most common type of files you will be serving. Going on 
>rambling about how it will be easy to figure out the length of a 
>stream-LF file could not be more irrelevant.
>
>However, I'm glad you at least acknowledge that getting the plain 
>content size of a sequential record file in VMS is not possible without 
>actually reading through the file. Because this is the problem, and this 
>is what you need to do today.

I *never* acknowledged that!  The file's content size is available without
having to read through the file; however, there are several record formats
available to save that file; thus, the size will be different.  Now, if YOU
believe that a record is <byte><byte><byte>...<byte><byte><LF>, I might NOT
and their record formats affirm that.

What you want is the byte count size that's based upon what or how you will
normalize the data in its transfer.  That would be misleading to those of us
looking at the file as it's stored on the media.  There is VMS CONVERT 
which can modify the record format; each target format can add or subtract
to and from the size of the input.  Which is correct?  You haven't YET said
which one.
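
To illustrate the "which is correct?" question, here is a rough sketch that
computes the byte count of the same three records under three layouts.  The
models are assumptions made for illustration, not a description of RMS
on-disk structures: "variable" is a 2-byte count plus data padded to an even
length, "stream_lf" is data plus <LF>, and "stream_crlf" is data plus
<CR><LF>.  The function names are hypothetical.

```python
def size_variable(records):
    """2-byte count + data + pad byte when the data length is odd."""
    return sum(2 + len(r) + (len(r) % 2) for r in records)

def size_stream_lf(records):
    """Data followed by a single LF terminator."""
    return sum(len(r) + 1 for r in records)

def size_stream_crlf(records):
    """Data followed by a CR LF terminator pair."""
    return sum(len(r) + 2 for r in records)

records = [b"hello", b"", b"VMS"]
sizes = {
    "variable": size_variable(records),
    "stream_lf": size_stream_lf(records),
    "stream_crlf": size_stream_crlf(records),
}
for name, size in sizes.items():
    print(f"{name:12s} {size} bytes")
```

Same data, three different sizes; which one counts as "the" file size depends
on the representation the consumer expects.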

-- 
VAXman- A Bored Certified VMS Kernel Mode Hacker    VAXman(at)TMESIS(dot)ORG

I speak to machines with the voice of humanity.