[Info-vax] Re; Spiralog, RMS Journaling (was Re: FREESPADRIFT)

Mon Jun 27 09:08:21 EDT 2016

On 2016-06-24 00:18, VAXman- at SendSpamHere.ORG wrote:
> In article <nkhf9i$7s3$1 at Iltempo.Update.UU.SE>, Johnny Billquist <bqt at softjar.se> writes:
>> On 2016-06-23 21:34, VAXman- at SendSpamHere.ORG wrote:
>>> In article <nkhd0k$2us$2 at Iltempo.Update.UU.SE>, Johnny Billquist <bqt at softjar.se> writes:
>>>> There are plenty of ways to design protocols.
>>>> Doing the size after the file will not allow you to have any
>>>> understanding of how much space should be reserved for the file, nor get
>>>> any idea of how far you are from completion.
>>>> But that should not stop you.
>>>
>>> So I can get a silly progression bar graphic or, like on linux, file transfer
>>> time left is 8 mins... 7 mins.. 9 mins... ... 10 secs... 5 secs... 14 secs...
>>> 2 sec., nearly complete...  transfer complete.
>>
>> I was merely pointing out why protocols have been designed the way that
>> they pass the size before the data. You might not enjoy the results, and
>> you are free to design your own protocols. It don't change the existing
>> protocols, and it will not make other people change how they design
>> protocols, since some of them really think there is a benefit in this.
>>
>>> Anyway, there are web servers for VMS and several other TCP/IP protocols for
>>> VMS.  You may need to ask those who have successfully implemented those how to
>>> approach your issue if you don't want to use the RFM=STM approach.  I really
>>> don't see why you think that's wrong.  *ix text files are streams with <CR>s
>>> and <LF>s, and binary files are, IIRC, sent in multiples of a block of some
>>> size (512).
>>
>> Uh? Say what? Everything in TCP/IP is just a stream of bytes. There are
>> no blocks, nothing is sent in any multiple of blocks.
>> (And besides, text files in Unix do not have CR and LF in them. They
>> just have LF. Which is why I was complaining about Unix ftp
>> implementations, which often lies about file size, and sometimes cheat
>> when transferring in text mode. These protocols were not designed by
>> Unix people...)
>
> Perhaps, I'm approaching this from the wrong direction with you. Tell me what your files you're
> serving look like.  $DIRECTORY/FULL  or better, $ANALYZE/RMS/FDL.

Maybe you are.

And this might also be a good time to make something else clear (again).
This thread was/is about having information about file size in bytes.
People questioned why it would ever be needed, and I gave examples of
when it is used in some protocols. And I have had to deal with this
specific problem, not under VMS, but under RSX.

Now, if you are going to try and argue that it's a different story in 
RSX I would have another long argument with you.

Essentially, since I've written a full TCP/IP for RSX, along with an
ftp client, an ftp server, and a web server, I have had to explicitly
deal with this problem in all of those cases, and it concerns all types
of files.

In ftp, I expect to be able to read any file and transfer it. (Well,
some files, like indexed files, do not make much sense to try and
transfer in either text or binary mode, but anyway...) So, transferring
a binary file from Unix to RSX and back, I want the file to look the
same in the end. Same if I do RSX->Unix->RSX, with the exception that
file attributes and other metadata are hard to preserve. That can be
dealt with, but is a different story. But this means I cannot just store
data from a Unix system as fixed 512-byte blocks, since I would then
have lost the size of the file as it came from Unix.
It also means that just storing it all as stream files, while possible,
is not ideal, since stream files imply line endings, and Unix binary
files might not be text lines at all to start with. Just because
there is an LF in there does not mean that it constitutes a record to
which I should apply any logic.
My choice here was to store binary files as variable-length records with
no file attributes. So a transfer back will give the same data as I
received. All good, except, of course, that I do not know the size in
bytes when I'm going to transfer the file.
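
To illustrate why the byte count is not available up front for such a
file, here is a minimal sketch in Python (not the RSX code). It assumes
a simplified on-disk layout: each record is a 2-byte little-endian
length, the data, and one pad byte when the length is odd. The 512-byte
chunking and all function names are hypothetical choices for the
example.

```python
import struct

def split_stream(data, chunk=512):
    """Chop an incoming byte stream into records (chunk size is an assumption)."""
    return [data[i:i + chunk] for i in range(0, len(data), chunk)]

def pack_records(records):
    """Store each record as: 2-byte length, data, pad byte if length is odd."""
    out = bytearray()
    for rec in records:
        out += struct.pack("<H", len(rec))
        out += rec
        if len(rec) % 2:
            out += b"\x00"  # keep the next record header word-aligned
    return bytes(out)

def unpack_records(stored):
    """Yield the data of every record by walking the length headers."""
    pos = 0
    while pos < len(stored):
        (length,) = struct.unpack_from("<H", stored, pos)
        yield stored[pos + 2:pos + 2 + length]
        pos += 2 + length + (length % 2)

def payload_size(stored):
    """The transfer size is only known after visiting every record."""
    return sum(len(rec) for rec in unpack_records(stored))
```

In this model the stored size (headers and padding included) differs
from the number of bytes that go back out on the wire, so the only way
to learn the latter is to read through the whole file.
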
Text files face similar issues. I receive text files from a Unix system,
and store them natively on the RSX system, so that I can treat them like
any other text files. Transferring them back once more does the
conversion to the network format for text, and then Unix can deal with
it in whatever way handles text correctly on the Unix side. All good,
except I do not have the size information in RSX (Unix does not have the
correct size information here either, and most Unix implementations of
ftp I checked do lie about the size in this situation).
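
A small sketch in Python of the text case, assuming the records are
held without any terminators and that the network format for text ends
every line with CR LF (as FTP's ASCII type does). The function names
are made up for the example.

```python
def network_text(records):
    """Render records in network text form: every line ends with CR LF."""
    return b"".join(rec + b"\r\n" for rec in records)

def unix_text(records):
    """Render the same records the way Unix keeps them on disk: LF only."""
    return b"".join(rec + b"\n" for rec in records)

def network_size(records):
    """Bytes on the wire in text mode; requires visiting every record."""
    return sum(len(rec) + 2 for rec in records)

records = [b"hello", b"", b"world!"]
print(network_size(records), len(unix_text(records)))
```

The on-disk size on the Unix side is one byte per line short of what is
actually sent in text mode, so reporting the on-disk size gives the
wrong answer, and on the record-oriented side there is no stored byte
count at all.
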

For http, the story is similar. I want my http server to be able to
serve all kinds of files that I have in RSX, be that sequential
variable-length record files, stream files, or fixed-length record
files. In this application, the correct size of the transfer is
essential, or else the protocol breaks. So I check the file attributes
for fixed-length record files, and can do the quick calculation of size
from the file metadata for those, but have to read through the file in
all other cases to see what the file size is. These include various
binary files that I have fetched over time from the internet and have on
my RSX system, such as JPEGs and PDF files, which obviously are binary
data, and actually have sizes for which fixed-size records are not
appropriate. (Actually, I've discovered that for PDF files, it works
to add junk at the end of the file with no detrimental effect, so I've
padded those out to a full block, and use fixed record lengths, making
PDF files much more efficient to deal with.)
But it also includes all the text files that contain the HTML. I hope
you know that HTML files are plain text.
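
The two paths can be sketched roughly like this in Python. This is an
illustration under assumptions, not the actual server code: FileInfo
and its fields are hypothetical stand-ins for the file metadata, the
end-of-file position is taken to be kept as a 1-based block number plus
a first-free-byte offset, and the fixed record length is taken to be
even so that no per-record padding is involved.

```python
from dataclasses import dataclass

BLOCK_SIZE = 512

@dataclass
class FileInfo:
    record_format: str    # "FIX" for fixed-length records, else something else
    eof_block: int        # 1-based number of the block holding end of file
    first_free_byte: int  # offset of the first unused byte in that block

def size_from_metadata(info):
    """Quick path: the byte count follows directly from the EOF position."""
    return (info.eof_block - 1) * BLOCK_SIZE + info.first_free_byte

def content_length(info, scan):
    """Use metadata for fixed-length files; otherwise scan the whole file.

    `scan` is a caller-supplied function that reads through the file
    and returns the number of bytes that will be sent.
    """
    if info.record_format == "FIX":
        return size_from_metadata(info)
    return scan()

def pad_to_block(data, fill=b"\x00"):
    """Pad data to a whole number of blocks (the fill byte is arbitrary)."""
    return data + fill * (-len(data) % BLOCK_SIZE)
```

The padding helper mirrors the PDF trick above: once a file is a whole
number of blocks, it can be stored with fixed-length records and takes
the quick path.
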

So in essence, any web server or ftp code in RSX (and VMS) *has* to
read through some files to get the correct size. And unfortunately, this
applies specifically to text files, which natively are normally in a
format that makes it impossible to just calculate the size from the
available metadata.

So, the fact that some file formats make it easy to figure out the
length does not help, since some other formats still do not, and I have
to deal with all of them. And even worse, the most common ones are the
ones for which file size cannot be computed easily.

To sum this up then: I have already looked long at the problem, and the
fact is that you would sometimes be much helped by having the file size
in bytes for files. Adding this in RSX is not going to happen (although
I can definitely see how it could be done). In VMS it might be that it
could happen, and there are definitely applications that would benefit
from it. So stop trying to convince me that it's not needed, because I
have already had to solve the problems caused by it not existing, and
know the pain.

	Johnny