[Info-vax] Re: Spiralog, RMS Journaling (was Re: FREESPADRIFT)

Mon Jun 27 10:50:20 EDT 2016

On 2016-06-27 16:14, Johnny Billquist wrote:
> On 2016-06-27 15:33, VAXman- at SendSpamHere.ORG wrote:
>> In article <nkr6sn$put$1 at Iltempo.Update.UU.SE>, Johnny Billquist
>> <bqt at softjar.se> writes:
>>> On 2016-06-24 20:27, VAXman- at SendSpamHere.ORG wrote:
>>>> I've been trying to get Johnny to realize that there are numerous
>>>> ways to represent records in a VMS file.
>>>
>>> And you don't have to explain that to me. I know way more of the
>>> internals of these things than I should have to. While not specifically
>>> RMS-32, I've modified RMS-11, ODS-1 and FCS enough to last me a
>>> lifetime, and it's really no different than RMS-32.
>>
>> ... and I know way more of the RMS internals than two lifetimes -- so
>> what?
>
> Right. So let's stop this pissing game.
> Just as I know that you know these things, stop trying to assume that I
> don't.
>
>>>>  In *ix files (text files) there's a <LF> at the end of a string of
>>>> bytes.  VMS can support that and, if he were to use that for his
>>>> files, he'd be able to get the sought after byte size.  I don't see
>>>> why there's such an inability to comprehend it.
>>>
>>> What I can't comprehend is your inability to understand the problem, or
>>> that having one more piece of metadata actually could help for a rather
>>> common case. We've been running this thread way longer than I ever
>>> thought was needed.
>>>
>>> Who cares that VMS can store files in a way compatible with Unix? That
>>> is not the answer. You still have various files, in various formats on
>>> VMS. How it looks under Unix has no bearing.
>>
>> But it does because that's the size of the data you want to present to
>> it!
>
> What is "it" here? I've been pointing out in a number of posts now that
> sometimes having the file size in bytes is useful. And the examples I've
> used have been some internet protocols.
>
> So all that's been in my argument is: VMS, all kinds of files. And
> internet protocols, as specified by some RFCs. No Unix within sight.
>
>>> You are making the assumption that the files will be created
>>> specifically for the need and purpose, which is a broken assumption. A
>>> web server is expected to serve content that already exists. And if that
>>> is created by a text editor (not uncommon), it will normally be in the
>>> standard format for a text file, which on VMS would be variable sized
>>> sequential records with implied CRLF.
>>
>> EDT and TPU will edit that file just fine if it's RFM=STM or
>> RFM=STMLF. ;)
>
> Just because it can does not mean that all your files will be.
>
>> So, you have a file that was created with a VMS editor.  Let's, for
>> argument here, say that's VFC.  You want to send that file in HTTP
>> with a byte count that properly accounts for <record's data>+<LF> for
>> each record; yet, you do NOT know how each record has been stored.
>> That's impossible!  If you can do that, please let me know how as I
>> might like to apply that logic to winning the Powerball.  Short of
>> reading the file, in the case of a VFC, to know the size of EACH
>> record, there's no way to accurately define the file size in a format
>> that you expect for it to be transferred.  You can't be that obtuse!
>
> Nitpick: The correct format is <record's data>+CR+LF.
>
> And... uh... If the file is in VFC, then you have the information on how
> each record is stored. Which part is it that you miss here? We know the
> storage format on disk. We would like to know the number of bytes this
> would be represented as, talking about just the file content, without
> any metadata.
>
> If we were to imagine how a system would implement this, the obvious
> answer is that as each record is written, the byte size added to the
> file would be the length of the record, plus two bytes for the implied
> CR+LF.
> This is not rocket science.
>
> And if you have a file with variable length records without the implied
> CR+LF, then the size added would become just the length of each record
> written. The actual space needed on disk is obviously different than
> this number, for a bunch of reasons and cases. But that's irrelevant. If
> you were to read the file, this is the number of bytes you would get
> (assuming you add your CR+LFs as implied, at the end of each record, if
> the file attributes said so).

An alternative solution, which I think would actually be even more
useful, would be to keep track of just the number of bytes actually
written, ignoring the implied CRLF per record.
However, in addition to counting the bytes in each $PUT, you also keep a
counter for the number of records in the file.
If someone then wants the size with a CRLF added for each record
(if, say, the file has implied CRLF as an attribute), you take the
number of bytes actually written and add 2*<number of records> to it.
Easy, more generic, and if someone would like to know the number of
records for some other use, that information would also be available.
Pretty cheap, and it gives you more things you can do based on the metadata.
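
To make the arithmetic concrete, here is a minimal sketch in Python. This
is not RMS code: the FileCounters class and its put/stream_size methods are
made-up names for illustration, and it simply assumes the two counters
would be kept with the file's metadata and updated on every record write.

```python
from dataclasses import dataclass

CRLF_LEN = 2  # bytes added per record when the file has implied CR+LF


@dataclass
class FileCounters:
    """Hypothetical per-file metadata: raw bytes written and record count."""
    data_bytes: int = 0
    records: int = 0

    def put(self, record: bytes) -> None:
        # Called once per record written (stands in for a $PUT).
        self.data_bytes += len(record)
        self.records += 1

    def stream_size(self, implied_crlf: bool) -> int:
        # Size as seen by a reader that appends CR+LF to every record,
        # or just the raw byte count if the file has no implied CR+LF.
        if implied_crlf:
            return self.data_bytes + CRLF_LEN * self.records
        return self.data_bytes


counters = FileCounters()
for line in (b"hello", b"", b"VMS"):
    counters.put(line)

print(counters.data_bytes, counters.records)     # 8 3
print(counters.stream_size(implied_crlf=True))   # 14
print(counters.stream_size(implied_crlf=False))  # 8
```

With three records of 5, 0 and 3 bytes, the counters hold 8 bytes and 3
records, so the size with implied CRLF comes out as 8 + 2*3 = 14, the same
number you would get by adding length+2 on every write.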

Now, would this really be that bad?

	Johnny