[Info-vax] wrong file format
Arne Vajhøj
arne at vajhoej.dk
Thu Dec 31 16:13:21 EST 2020
On 12/31/2020 3:58 PM, Dirk Munk wrote:
> Bill Gunshannon wrote:
>> On 12/31/20 6:07 AM, Dirk Munk wrote:
>>> Bill Gunshannon wrote:
>>>> On 12/30/20 7:59 AM, Dirk Munk wrote:
>>>>> The problem is that in Unix and Windows land there is no difference
>>>>> between the metadata of a file, and the actual contents of a file.
>>>>> The metadata should define the file and the records in the file,
>>>>> that should be completely separate from the actual data contents of
>>>>> the file.
>>>>
>>>> Can't speak for Windows, but Unix has no meta-data. Unix has only one
>>>> file type, a stream of bytes. Everything else is application layer.
>>>
>>> Which means you don't have a clue about the contents of a file, until
>>> you know the internals of the application.
>>
>> Well, that isn't exactly true. Certain file types do have clues.
>> And, at least under Unix, there is an application that will do a
>> very good job of identifying what the file is. It is even possible
>> to add your own hints if they exist and if you so desire.
>
> Nice, but suppose you have a Cobol compiler on Unix, then it will have
> to set up its own file system with all the files Cobol supports, like
> indexed files. What will that application do with those files? RMS will
> tell you the structure of the file, you don't have to guess it.
RMS will always have the information about the record format.
For index-sequential files RMS will have information about the keys, but
it will not have information about the non-key part (which can actually
be different for different records).
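
To illustrate the key versus non-key distinction, here is a minimal
sketch in Python. It is not RMS and does not use any RMS API; the key
position, key length, record contents and function names are all made up
for the example. The point is only that a key definition (offset and
length) is enough to index the records, while the rest of each record is
opaque to the index and can be laid out differently from record to record.

```python
import struct

# Hypothetical key definition: the key occupies bytes 0..3 of every record.
KEY_OFFSET, KEY_LENGTH = 0, 4

def key_of(record: bytes) -> bytes:
    """Extract the key using only the key definition (offset and length)."""
    return record[KEY_OFFSET:KEY_OFFSET + KEY_LENGTH]

def payload_of(record: bytes) -> bytes:
    """Return the non-key part; the index knows nothing about its layout."""
    return record[KEY_OFFSET + KEY_LENGTH:]

# Two records with the same key layout but different non-key layouts.
rec_a = b"0001" + struct.pack("<i", 42)    # payload: one 32-bit integer
rec_b = b"0002" + "Dirk".encode("ascii")   # payload: four characters of text

# The index can be built from the key definition alone.
index = {key_of(r): r for r in (rec_a, rec_b)}

# Interpreting the payload requires knowledge held by the application.
value_a = struct.unpack("<i", payload_of(index[b"0001"]))[0]
text_b = payload_of(index[b"0002"]).decode("ascii")
```

Both payloads are four bytes long, and nothing in the index says which
one is an integer and which one is text.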
>>>>> Suppose I have a file with binary data, and one byte has the binary
>>>>> (ascii) value of <lf>, then Unix will use it as a record separator,
>>>>> even if it is in the middle of the actual data of that record.
>>>>
>>>> Unix has no records. If you cat the file it will line break at the
>>>> <lf>.
>>>> If you od -c the file it will identify the <lf> as just that.
>>>>
>>>
>>> Wonderful. However, it is clear that in many applications the notion
>>> of a data record is present, and that the <lf> is used as record
>>> separator, even if Unix formally doesn't have records.
>>
>> Again, that is more of a C'ism than a Unix'ism. If I write an
>> application that uses ^M instead of ^J it will work just fine.
>> And there is no reason why I couldn't have ^J as a valid,
>> non-record-terminating character in the file.
>
> Sure you can. But the standard (used for instance by FTP ASCII
> transfers) is <lf>.
The *nix standard is definitely LF.
But most network protocols including FTP use CR LF.
From the FTP RFC (RFC 959):
<quote>
In accordance with the NVT standard, the <CRLF> sequence
should be used where necessary to denote the end of a line
of text.
</quote>
<quote>
If this division is necessary,
the FTP implementation should use the end-of-line sequence,
<CRLF> for ASCII, or <NL> for EBCDIC text files, as the
delimiter.
</quote>
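
A minimal sketch of what that means for a *nix system doing an ASCII
transfer, again in Python. The function names are made up, this is not
code from any FTP implementation, and it ignores lone CR characters. It
also shows why binary data that happens to contain line-end bytes must
not be sent in ASCII mode.

```python
def to_network(text: bytes) -> bytes:
    """Convert local LF line ends to the CRLF used on the wire."""
    # Normalise any existing CRLF first so it does not become CR CR LF.
    return text.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

def from_network(data: bytes) -> bytes:
    """Convert CRLF from the wire back to local LF line ends."""
    return data.replace(b"\r\n", b"\n")

# Text survives the round trip unchanged.
local = b"line one\nline two\n"
wire = to_network(local)
restored = from_network(wire)

# Binary data containing the bytes CR LF does not: one byte is lost.
binary = b"\x00\x01\r\n\x02"
damaged = from_network(to_network(binary))
```

Here restored equals local, but damaged is one byte shorter than binary,
because the conversion cannot tell a line end from data that looks like one.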
Arne