[Info-vax] Re: Spiralog, RMS Journaling (was Re: FREESPADRIFT)
Johnny Billquist
bqt at softjar.se
Mon Jun 27 09:08:21 EDT 2016
On 2016-06-24 00:18, VAXman- at SendSpamHere.ORG wrote:
> In article <nkhf9i$7s3$1 at Iltempo.Update.UU.SE>, Johnny Billquist <bqt at softjar.se> writes:
>> On 2016-06-23 21:34, VAXman- at SendSpamHere.ORG wrote:
>>> In article <nkhd0k$2us$2 at Iltempo.Update.UU.SE>, Johnny Billquist <bqt at softjar.se> writes:
>>>> There are plenty of ways to design protocols.
>>>> Doing the size after the file will not allow you to have any
>>>> understanding of how much space should be reserved for the file, nor get
>>>> any idea of how far you are from completion.
>>>> But that should not stop you.
>>>
>>> So I can get a silly progression bar graphic or, like on linux, file transfer
>>> time left is 8 mins... 7 mins.. 9 mins... ... 10 secs... 5 secs... 14 secs...
>>> 2 sec., nearly complete... transfer complete.
>>
>> I was merely pointing out why protocols have been designed the way that
>> they pass the size before the data. You might not enjoy the results, and
>> you are free to design your own protocols. It don't change the existing
>> protocols, and it will not make other people change how they design
>> protocols, since some of them really think there is a benefit in this.
>>
>>> Anyway, there are web servers for VMS and several other TCP/IP protocols for
>>> VMS. You may need to ask those who have successfully implemented those how to
>>> approach your issue if you don't want to use the RFM=STM approach. I really
>>> don't see why you think that's wrong. *ix text files are streams with <CR>s
>>> and <LF>s, and binary files are, IIRC, sent in multiples of a block of some
>>> size (512).
>>
>> Uh? Say what? Everything in TCP/IP is just a stream of bytes. There are
>> no blocks, nothing is sent in any multiple of blocks.
>> (And besides, text files in Unix do not have CR and LF in them. They
>> just have LF. Which is why I was complaining about Unix ftp
>> implementations, which often lie about file size, and sometimes cheat
>> when transferring in text mode. These protocols were not designed by
>> Unix people...)
>
> Perhaps I'm approaching this from the wrong direction with you. Tell me what the files you're
> serving look like. $DIRECTORY/FULL or better, $ANALYZE/RMS/FDL.
Maybe you are.
And this might also be a good time to make something else clear (again).
This thread was/is about having information about file size in bytes.
People questioned why it would ever be needed, and I gave examples of
where some protocols use it. I have had to deal with this specific
problem myself, not under VMS but under RSX.
Now, if you are going to try and argue that it's a different story in
RSX I would have another long argument with you.
Essentially, since I've written a full TCP/IP for RSX, along with both
ftp client, ftp server, and web server, I have had to explicitly deal
with this problem in all of those cases, and it concerns all types of files.
In ftp, I expect to be able to read any file and transfer it. (Well,
some files like indexed files, do not make much sense to try and
transfer either in text or binary mode, but anyway...). So, transferring
a binary file from Unix to RSX and back, I want the file to look the
same in the end. Same if I do a RSX->Unix->RSX. With the exception that
file attributes and other metadata are hard to preserve. That can be
dealt with, but is a different story. But this means I cannot just store
data from a Unix system as fixed 512 byte blocks, since that would lose
the information about the file's size as it came from Unix.
It also means that just storing it all as stream files, while possible,
is not ideal, since stream files imply line endings, and Unix binary
files might not be text lines at all to start with. Just because
there is an LF in there does not mean that it constitutes a record to
which I should apply any logic.
My choice here was to store binary files as variable length records with
no file attributes. So a transfer back will give the same data as I
received. All good, except, of course, that I do not know the size when
I'm going to transfer the file.
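To illustrate why the size is not known up front, here is a minimal
sketch in Python. It assumes a simplified on-disk layout in which each
variable length record is a 2-byte little-endian length followed by the
data, padded to an even offset. The names pack_records and payload_size
are hypothetical and only model the idea; this is not the actual RSX
code.

```python
import struct

def pack_records(records):
    """Encode records as <2-byte LE length><data>, padded to an even offset.

    Assumed, simplified model of a variable length record file.
    """
    out = bytearray()
    for rec in records:
        out += struct.pack("<H", len(rec)) + rec
        if len(rec) % 2:
            out += b"\x00"  # pad byte, not part of the payload
    return bytes(out)

def payload_size(raw):
    """Return the number of data bytes by walking every record header."""
    total = 0
    pos = 0
    while pos < len(raw):
        (length,) = struct.unpack_from("<H", raw, pos)
        total += length
        pos += 2 + length + (length % 2)  # skip header, data, and pad
    return total

raw = pack_records([b"abc", b"", b"hello!", b"\n"])
print(len(raw), payload_size(raw))
```

The space used on disk includes length words and pad bytes, so the
number of bytes that will go out on the wire can only be found by
visiting every record.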
Text files face similar issues. I receive text files from a Unix system,
and store them natively on the RSX system, so that I can treat them like
any other text files. Transferring them back once more does the conversion
to the network format for text, and then Unix can handle the text
correctly in whatever way suits the Unix side. All good, except
I do not have the size information in RSX (Unix also does not have the
correct size information here, and most Unix implementations of ftp I
checked do lie about the size in this situation).
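The size mismatch for text can be sketched as follows, assuming the
network text format ends every line with CR LF (as ftp ASCII mode does)
while a Unix file stores only LF. The function names are hypothetical,
and lines are given without any terminator, the way a record-oriented
file would hold them.

```python
def to_network(lines):
    """Render lines in network text format: each line ends with CR LF."""
    return b"".join(line + b"\r\n" for line in lines)

def wire_size(lines):
    """Bytes actually sent in text mode: data plus 2 bytes per line."""
    return sum(len(line) + 2 for line in lines)

def unix_size(lines):
    """Bytes a Unix system reports for the same text: data plus 1 per line."""
    return sum(len(line) + 1 for line in lines)

lines = [b"hello", b"", b"world"]
print(unix_size(lines), wire_size(lines), len(to_network(lines)))
```

The two figures differ by one byte per line, so neither side can report
the text mode transfer size from its stored file size alone.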
For http, the story is similar. I want my http server to be able to
serve all kinds of files that I have in RSX, be that sequential variable
length record files, stream files, or fixed length records. In this
application, the correct size of the transfer is essential, or else the
protocol breaks. So I check the file attributes for fixed length record
files, and can do the quick calculation of size from the file metadata
for those, but have to read through the file in all other cases, to see
what the file size is. These include various binary files that I have
fetched over time from the internet and have on my RSX system, such as
JPEGs and PDF files, which obviously are binary data, and actually have
sizes for which fixed size records are not appropriate.
(Actually, I've discovered that for PDF files, it works
to add junk at the end of the file with no detrimental effect, so I've
padded those out to a full block, and use fixed record lengths, making
PDF files much more efficient to deal with.)
But it also includes all the text files that contain the HTML. I hope
you know that HTML files are plain text.
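The two paths for working out the http transfer size could look roughly
like this. It is only a sketch: FileInfo, its fields, and the terminator
parameter are hypothetical stand-ins for the real file metadata and
record access, and the assumption that a terminator is added per record
for text is mine.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FileInfo:
    record_format: str      # "fixed" or "variable" (hypothetical labels)
    record_size: int = 0    # bytes per record, meaningful for "fixed" only
    record_count: int = 0   # number of records, taken from metadata
    records: List[bytes] = field(default_factory=list)  # stands in for reading the file

def content_length(info, terminator=b""):
    """Return the number of bytes the server would send for this file."""
    if info.record_format == "fixed":
        # Quick path: metadata alone is enough.
        return info.record_size * info.record_count
    # Slow path: every record has to be visited.
    return sum(len(rec) + len(terminator) for rec in info.records)

pdf = FileInfo("fixed", record_size=512, record_count=3)
html = FileInfo("variable", records=[b"<html>", b"</html>"])
print(content_length(pdf), content_length(html, b"\r\n"))
```

Only the fixed length case avoids touching the file data; everything
else costs a full pass over the file before the first byte of the
response body can be sent.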
So in essence, any web server, or ftp code, in RSX (and VMS) *has* to
read through some files to get the correct size. And unfortunately, this
applies specifically to text files, which are normally stored natively
in a format that does not make it possible to just calculate the size
from the available metadata.
So the fact that some file formats make it easy to figure out the
length does not help, since other formats do not, and I have to deal
with all of them. Even worse, the most common ones are those for which
the file size cannot be computed easily.
To sum this up then: I have already looked long at the problem, and the
fact is that it would sometimes help a great deal to have the file size
in bytes for files. Adding this in RSX is not going to happen (although
I can definitely see how it could be done). In VMS it might be that it
could happen, and there are definitely applications that would benefit
from it. So stop trying to convince me that it's not needed, because I
have already had to solve the problems caused by it not existing, and
know the pain.
Johnny