[Info-vax] RMS record metadata, was: Re: Re; Spiralog, RMS Journaling (was Re: FREESPADRIFT)

Sun Jun 19 10:40:34 EDT 2016

On 2016-06-19, David Froble <davef at tsoft-inc.com> wrote:
> Simon Clubley wrote:
>> 
>> IMHO, it depends on the context. If you are doing block level reads,
>> then you count the size of the on-disk metadata in the file size.
>> 
>> If you are doing record level reads, then you do not include the
>> metadata IMHO but you _do_ add in any additional terminator bytes to
>> the length (which might not actually be stored on disk).
>> 
>> You need the former if you are doing an image copy. You need the latter
>> if you want to tell a webserver at which point it should resume the
>> download.
>
> Hmmm ...  Never considered that.  Then again, for most HW, you're probably going 
> to read and write whole blocks anyway.  If I was solving the problem, which I'm 
> not, I do think that I'd re-start on block boundaries.
>

You've missed the point, David. HTTP restart mechanisms only know the
byte offset from which to resume sending the server's copy of the file;
they know nothing about how the client has stored what it received.

On Windows and Unix, the client byte offset is the same as the server
byte offset so the client just asks for the size of the partial file
as it currently exists on the client and sends that.
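
As a rough sketch of the byte stream case (the function name is mine and
purely illustrative), the client only needs the size of the partial file
to build an HTTP Range request header:

```python
import os
import tempfile

def resume_range_header(partial_path: str) -> str:
    """Build a Range header value for resuming a download.

    Assumes the partial file is a plain byte stream, so its size on the
    client equals the server byte offset reached so far.
    """
    size = os.path.getsize(partial_path)
    # An open-ended range: "send everything from this offset onwards".
    return f"bytes={size}-"

if __name__ == "__main__":
    # Simulate a partial download of 5 bytes.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"hello")
        path = f.name
    print(resume_range_header(path))
    os.remove(path)
```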

On VMS, if that byte stream has been converted to variable length records
during the download, then the client needs to read the _whole_ of the
partial file to determine how much actual file data there is in the
partial file and it needs to add back in the size of any record
terminators which were not written to disk.

Only after you have done all that can you tell the server which file
position it needs to start resending the file from.
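
To make that concrete, here is a minimal sketch of the calculation. It
works on a simplified model of the on-disk layout (a 2-byte little-endian
count, then the record data, padded to an even byte boundary) and assumes
every record was created from exactly one terminated line of the original
byte stream, with the last record complete. The function names and the
terminator_len parameter are hypothetical; real code would go through RMS
and would also have to cope with records that do not span blocks.

```python
import struct

def resume_offset(raw: bytes, terminator_len: int = 1) -> int:
    """Return the server byte offset implied by a partial record file.

    raw is the used portion of the file in the simplified layout:
    2-byte little-endian count, data, optional pad byte to an even
    boundary. terminator_len is the size of the terminator that was
    stripped from each line during conversion (1 for LF, 2 for CRLF).
    """
    offset = 0
    pos = 0
    while pos + 2 <= len(raw):
        (length,) = struct.unpack_from("<H", raw, pos)
        pos += 2
        if pos + length > len(raw):
            raise ValueError("truncated record at end of partial file")
        # Count the data plus the terminator that is not stored on disk.
        offset += length + terminator_len
        # Skip the data and the pad byte that follows odd-length records.
        pos += length + (length & 1)
    return offset

def make_records(lines) -> bytes:
    """Test helper: encode lines in the same simplified layout."""
    out = bytearray()
    for line in lines:
        out += struct.pack("<H", len(line)) + line
        if len(line) & 1:
            out += b"\x00"
    return bytes(out)

if __name__ == "__main__":
    # Original stream was b"abc\n\nhello!\n" (12 bytes).
    raw = make_records([b"abc", b"", b"hello!"])
    print(len(raw), resume_offset(raw))
```

Note that the size of the file on disk and the offset to send to the
server differ, and the only way to get the latter is to walk every
record.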

>> Also, IMHO I think the default sequential record type for today's
>> world should be stream (and hence the terminator is also included in
>> the file data.) I do not think it should be variable length records.
>
> I'm still thinking about that one.  Perhaps I'm blinded by past expectations.
>

There are some specialist uses I can think of but I don't see any
advantage for normal sequential files such as program source code
and text files to be variable length records instead of stream
records.

One specialist use I can think of is Fortran carriage control. It's
been a very long time since I've written Fortran code, so I don't
know the answer to this question. How is Fortran carriage control
implemented on a stream only filesystem such as on Unix or Windows?

Are the carriage control characters actually encoded within the file
data itself?

Can anyone think of any other uses in today's world for variable
length records (and VFC records) for disk files which can't be handled
by a stream record format?

There's an argument for binary sequential data because the length
bytes are not part of the data stream, and hence cannot be confused
with it. OTOH, the application generally knows the structure of that
data so it can be stored as stream or fixed length records. Storing it
that way is actually _required_ if you are creating a binary sequential
file for interchange with another system (for example an image file).
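
A small illustration of the "cannot be confused with it" point, using a
generic count-prefixed framing rather than the exact RMS layout (all of
the names here are made up for the example). Binary data containing the
terminator byte breaks a terminator-delimited stream but survives
count-prefixed records:

```python
import struct

def pack_counted(records) -> bytes:
    """Frame each record with a 2-byte little-endian length count."""
    return b"".join(struct.pack("<H", len(r)) + r for r in records)

def unpack_counted(blob: bytes):
    """Recover records from count-prefixed framing."""
    records = []
    pos = 0
    while pos < len(blob):
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        records.append(blob[pos:pos + length])
        pos += length
    return records

def pack_stream(records, terminator: bytes = b"\n") -> bytes:
    """Frame each record by appending a terminator."""
    return b"".join(r + terminator for r in records)

def unpack_stream(blob: bytes, terminator: bytes = b"\n"):
    """Split on the terminator; the final piece after it is empty."""
    return blob.split(terminator)[:-1]

if __name__ == "__main__":
    # The first record happens to contain the terminator byte.
    records = [b"\x01\n\x02", b"ok"]
    print(unpack_counted(pack_counted(records)) == records)
    print(unpack_stream(pack_stream(records)) == records)
```

The stream version only works if the application already knows the
structure of the data and does its own framing, which is the case I
describe above.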

>> The problem is that today's protocols are simply not designed to
>> handle the case of metadata buried within the file contents
>> themselves; the people who design this stuff have probably never
>> even encountered that case.
>
> Perhaps we need different designers ???
>
>:-)
>

It's a byte stream world out there, David, not a record orientated one,
and VMS needs to be able to deal with that efficiently.

Simon.

-- 
Simon Clubley, clubley at remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world