[Info-vax] Portable OpenVMS binary data format?
Louis Krupp
lkrupp at nospam.pssw.com.invalid
Tue Aug 7 02:36:23 EDT 2018
On Mon, 6 Aug 2018 14:48:17 -0700 (PDT), John E <eiler13 at gmail.com>
wrote:
>> Would it be easier to use RECORDTYPE=STREAM (instead of STREAM_LF) and
>> not have to read the extra character with gfortran?
>>
>> This has probably been mentioned, but compiling your OpenVMS FORTRAN
>> programs with /FLOAT=IEEE_FLOAT could be helpful.
>
>Thanks, Louis. No idea but I'll put it on the list of things to try!
My free advice:
1. Start small. Write a short array to a file, ftp it to a Linux
system, and see if you can read it and get the same numbers you
started with. Try compiling with and without /FLOAT=IEEE_FLOAT on
OpenVMS and see if it makes a difference.
2. Keep it simple. My understanding of the various stream record types
comes from some experience and from reading this:
https://software.intel.com/en-us/node/678405
As far as I can tell, STREAM_LF appends a linefeed (0x0a) to each
record, STREAM_CR appends a carriage return (0x0d), STREAM_CRLF
appends a carriage return and a linefeed, and STREAM appends nothing
at all. If you were writing text, it would make sense to append a
linefeed or a carriage return or both, but since you're writing binary
data, I don't believe there's any point in appending anything at all.
As an analogy, let's say you were writing lines of letters and numbers
to a file. If the program that writes the file and the program that
reads the file both know that all of the lines are going to be exactly
five characters long, then you can write these lines:
12345
abcde
99999
zyxwv
and just run them together in the file with no separation:
12345abcde99999zyxwv
and the program reading the file will read everything correctly.
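Here is a minimal sketch of that fixed-length idea. Python stands in for the Fortran programs, and the names RECORD_LEN, pack_fixed and unpack_fixed are made up for illustration:

```python
RECORD_LEN = 5  # the writer and the reader must agree on this number


def pack_fixed(lines):
    """Run fixed-length lines together with no separator."""
    for line in lines:
        if len(line) != RECORD_LEN:
            raise ValueError("every line must be exactly %d characters" % RECORD_LEN)
    return "".join(lines)


def unpack_fixed(data):
    """Cut the data back into lines of RECORD_LEN characters."""
    if len(data) % RECORD_LEN != 0:
        raise ValueError("data length is not a multiple of the record length")
    return [data[i:i + RECORD_LEN] for i in range(0, len(data), RECORD_LEN)]


lines = ["12345", "abcde", "99999", "zyxwv"]
packed = pack_fixed(lines)
print(packed)                # 12345abcde99999zyxwv
print(unpack_fixed(packed))  # the original four lines
```

Nothing in the data marks where a line ends. The agreed length is all the reader has to go on.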
If the lines you're writing might have different lengths, or if you
just want the file to make a little more sense to the casual observer,
then you might want to append a character that isn't a letter or a
number to each line when you write the file. You could pick a comma,
for example:
12345,abcde,99999,zyxwv,
and as long as the program reading the file knew that each line ended
with a comma, you'd get the right stuff back.
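The comma version can be sketched the same way. Again this is Python rather than Fortran, and join_terminated and split_terminated are hypothetical names:

```python
def join_terminated(lines, terminator=","):
    """Append the terminator to every line, including the last one."""
    return "".join(line + terminator for line in lines)


def split_terminated(data, terminator=","):
    """Recover the lines, assuming no line contains the terminator itself."""
    parts = data.split(terminator)
    if parts[-1] != "":
        raise ValueError("data does not end with the terminator")
    return parts[:-1]  # drop the empty piece after the final terminator


lines = ["12345", "abc", "9", "zyxwv"]  # lengths no longer need to match
joined = join_terminated(lines)
print(joined)                    # 12345,abc,9,zyxwv,
print(split_terminated(joined))  # the original four lines
```

This only works because a comma never appears inside a line.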
STREAM_LF, STREAM_CR and STREAM_CRLF do something similar with lines
of text. The assumption is that text characters are letters, numbers,
spaces and certain "special" characters like commas and semicolons and
so on. These can all be classified as "printable" characters, and
lines of printable characters can be separated only by characters that
*aren't* printable, like linefeeds and carriage returns. These last
are most commonly represented by their hexadecimal codes, namely 0a
and 0d.
(*Every* text character has a hex code: An uppercase 'A' is hex 41, a
lowercase 'a' is hex 61, a comma is hex 2c, and a space is hex 20.)
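If you want to check these codes yourself, a few lines of Python will print them (the variable name codes is just for illustration):

```python
# Map each character to its two-digit hexadecimal ASCII code.
codes = {ch: format(ord(ch), "02x") for ch in ["A", "a", ",", " ", "\n", "\r"]}

for ch, code in codes.items():
    print(repr(ch), "is hex", code)
```

The last two entries are the linefeed (0a) and the carriage return (0d).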
Carriage returns and linefeeds are called those for historical
reasons, which I'll spare you. The hex representations apply to what's
known as the American Standard Code for Information Interchange, or
ASCII. There's another representation used by certain IBM and Unisys
systems, but I'll spare you those details, too.
What about binary data? When you're writing binary data, you're not
writing characters, printable or otherwise, you're writing arbitrary
bit patterns. If you're writing the integer 65 as one 8-bit byte,
you're going to get hex 41 (i.e., 65 in base 16), which happens to
look like an uppercase 'A'. If you're writing the number 10, you're
going to get hex 0a, which is a linefeed. Basically, there is
no way to tell what's data and what's not, so there's no way to
separate lines, and if you use STREAM_LF or STREAM_CR or STREAM_CRLF,
you're just going to confuse the next person who looks at your
program.
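To make that concrete, here is a small Python sketch. The record layout (one 32-bit little-endian integer followed by a linefeed) is an assumption made for illustration, not a dump of a real OpenVMS file:

```python
import struct

values = [65, 10, 266]  # 266 is hex 010a, so it also contains a 0a byte

# Build three "records", each one integer followed by a linefeed terminator.
data = b"".join(struct.pack("<i", v) + b"\n" for v in values)

# A reader that trusts the terminator splits on every 0a byte it sees.
pieces = data.split(b"\n")[:-1]  # drop the empty piece after the final linefeed

print(len(values), "records written")
print(len(pieces), "pieces found by splitting on 0a")
```

The reader finds five pieces where three records were written, because two of the 0a bytes are data rather than terminators.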
Keep it simple. Just use STREAM, and make sure the program that writes
the file and the program that reads the file agree on how much data is
being written and read.
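A minimal round-trip sketch of that agreement, with Python standing in for both the writing and the reading program. It assumes both sides have agreed on four little-endian IEEE 754 doubles with nothing appended. COUNT, LAYOUT, write_values, read_values and the file name are all hypothetical:

```python
import os
import struct
import tempfile

COUNT = 4                  # how many values both programs agree on
LAYOUT = "<%dd" % COUNT    # little-endian, COUNT 8-byte IEEE doubles, no padding


def write_values(path, values):
    """Write the raw bytes and nothing else: no terminators, no length prefix."""
    with open(path, "wb") as f:
        f.write(struct.pack(LAYOUT, *values))


def read_values(path):
    """Read the bytes back, checking the size against the agreed layout."""
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) != struct.calcsize(LAYOUT):
        raise ValueError("file size does not match the agreed layout")
    return list(struct.unpack(LAYOUT, raw))


values = [1.0, -2.5, 10.0, 65.0]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "short_array.bin")
    write_values(path, values)
    size = os.path.getsize(path)
    round_trip = read_values(path)

print(size, "bytes written")
print(round_trip == values)
```

The size check is the reader's half of the agreement: if the file isn't exactly the expected number of bytes, something was added or lost along the way.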
(You mentioned getting the impression that unformatted files are
essentially memory maps. They're not, but they have their own problems
with portability, and the instinct that tells you to avoid them is right
on.)
Louis