[Info-vax] Streaming a File on OpenVMS with Caché
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Thu Jan 15 00:21:01 EST 2015
On 2015-01-15 02:50:25 +0000, Mack Altman III said:
> The %Stream class within Cache opens the file into memory, which can be
> read and evaluated to later be written to disk.
So you're seeking to write your own record parser. Have fun with that.
Have a look at the file command on Unix — file magic — for what you're
headed toward here. Not an easy problem, on any platform. You're
writing a generic parser, after all. C applications using mmap() to
memory-map the file and then process its contents run into some very
similar difficulties.
> The issue we experience is even with the line
> terminator set to CR,LF the line is broken when the variable length is
> reached. VMS is enforcing a line termination; although, there isn't one
> within the file.
VMS isn't enforcing anything unusual here. VMS has done the same
thing here for the last 35 years, give or take. In short, VMS is
doing what it was told to do.
But as for this particular case — apparently with random file formats
containing some sort of configuration data uploaded from random folks —
there's comparatively little hope for success. On any OS platform.
What would this design do, if handed a PDF file or any of the various
Microsoft Word formats, for instance?
> I consider myself a pretty savvy "Googler", which is why after finding
> nothing about enforcing a file type or modifying it without err I
> reached out to the group.
Might want to look around at how some applications are coded to upload
configuration or crash data to a server, rather than expecting a wide
variety of experienced and inexperienced users to all follow the same
sequence for transferring files.
After all, it's exceedingly unlikely that anyone can get even software
engineers with years of experience with {insert operating system} to
all upload the data entirely consistently. Throw this problem at a mix
of experienced and inexperienced users, and you'll be looking right at
the mess you're already looking at.
When I find myself in a hole, the first goal is to stop digging.
Computers are much better at this sort of drudge work after all, and
can format, secure, and upload the data appropriately.
Using ftp is often a problem on modern networks as it's about the worst
design possible for traversing client-side and server-side firewalls,
so I'd probably look to run the client's configuration upload over
HTTPS (TCP port 443) here, as that port is generally open from most
networks, and the traffic is also encrypted. Have the Caché code
operating on that port, and have that code accept the transfer from
the client system directly. No files need be involved, and the only
record structures are those you've created between the client logger
and the server.
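
As a rough illustration of "the only record structures are those you've
created", here is a minimal sketch in Python of a self-describing
envelope that a client logger and a server could agree on. Everything
in it is an assumption made for illustration: the field names, the
version number, and the pack_report / unpack_report helpers are
hypothetical, and nothing here is Caché or VMS API. The point is only
that the payload carries its own length and checksum, so neither side
depends on file attributes or line terminators.

```python
import base64
import hashlib
import json
from typing import Tuple

ENVELOPE_VERSION = 1  # hypothetical version number for this sketch


def pack_report(kind: str, payload: bytes) -> bytes:
    """Wrap raw report bytes in a self-describing JSON envelope."""
    envelope = {
        "version": ENVELOPE_VERSION,
        "kind": kind,  # e.g. "config" or "crash"; the names are assumptions
        "length": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "data": base64.b64encode(payload).decode("ascii"),
    }
    return json.dumps(envelope).encode("utf-8")


def unpack_report(body: bytes) -> Tuple[str, bytes]:
    """Validate an envelope; return (kind, payload) or raise ValueError."""
    envelope = json.loads(body.decode("utf-8"))
    if envelope.get("version") != ENVELOPE_VERSION:
        raise ValueError("unsupported envelope version")
    payload = base64.b64decode(envelope["data"])
    if len(payload) != envelope["length"]:
        raise ValueError("length mismatch")
    if hashlib.sha256(payload).hexdigest() != envelope["sha256"]:
        raise ValueError("checksum mismatch")
    return envelope["kind"], payload
```

The payload bytes come back exactly as they went in, embedded CR LF
pairs included, because the envelope never treats them as records.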
When an OS X application crashes, a dialog is displayed; the crash
data is shown in the dialog and is optionally uploaded to Apple.
Other mechanisms are possible, including
fully-automatic crash uploads, usually via some sort of an opt-in at
installation or configuration.
> The main problem I experience with both OpenVMS and Cache are they are
> very small in regards to the size of the community of developers out
> there.
So package up a reproducer, and report it to InterSystems. VMS isn't
doing anything it hasn't done for eons, which implies there's either a
documented or undocumented limit in Caché, or there's a bug in Caché,
or there's a bug in the local code. Given how heavily used these
particular code-paths are within VMS and within a very large number of
third-party applications, bugs in these VMS code-paths are unlikely.
Possible, but exceedingly unlikely.
Or try a different application design, and preferably one that does not
require parsing an arbitrary and user-generated file format.
Preferably something that avoids the need to accept a gazillion
different file formats, transferred by ftp or sftp.
Those mmap-based applications in C hit these same sorts of bugs, of
course. Trying to roll your own file parser? It can be really, really
fast for the simple byte-blob that's common with a simple Unix
sequential file, but far more entertaining when presented with — as is
also happening here — arbitrary file formats. Fun times.
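
For the simple byte-blob case only, a minimal sketch in Python of what
rolling your own line splitter involves: the data arrives in chunks
whose boundaries mean nothing (standing in here for buffer or record
boundaries), and lines are delimited purely by the CR LF bytes in the
data. The split_lines name and the chunking are assumptions made for
illustration; this does not use the Caché %Stream class or RMS, and it
only helps if the terminators really are present in the data.

```python
from typing import Iterable, Iterator


def split_lines(chunks: Iterable[bytes],
                terminator: bytes = b"\r\n") -> Iterator[bytes]:
    """Yield terminator-delimited lines from arbitrary byte chunks."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Emit every complete line that is currently buffered.
        while True:
            index = pending.find(terminator)
            if index < 0:
                break
            yield pending[:index]
            pending = pending[index + len(terminator):]
    # Whatever remains had no trailing terminator; emit it as a last line.
    if pending:
        yield pending
```

A line that straddles two chunks, or a terminator split between them,
is reassembled before it is emitted. Once the input stops being a plain
byte stream with known terminators, this approach stops being enough.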
> As far as we go with the FTP process is telling them to use ASCII/BIN.
> Again, they're non-IT people so beyond that and they are start to get
> confused.
So why persist with a design that involves the command line, FTP or
sftp file transfers, contending with whatever got uploaded, and the
rest?
Wouldn't the users' difficulty and confusion here imply that the design
might have misjudged the users' abilities and needs? They're paying
to have these problems solved for them after all, and they're not
looking to make more work and more complexity for themselves.
If a problem is intractable — as is the case here — then determine if
the problem can be removed. This problem can be removed, too.
Provide an automatic upload mechanism, either during a crash, or in
response to one of the users here pressing a button or entering a
command; automate the bug logging. This mechanism might be built into
the application that you're supporting (still not sure what you're up
to here), or it might be an application or client that the end user has
to install and invoke. In terms that are close to what you're asking
for here, this is the ftp client and the file enforcement and the file
transfer and the data formatting, all rolled together. This gets ftp
and this whole file format mess completely out of the picture, lets you
implement upload security (as there can be sensitive data in crashes),
and (for instance) gives you a usually-accessible upload path via HTTPS
on port 443 into a dedicated server, whether using REST or some other
scheme. (I'd probably look to use libwww and POST the configuration or
the crash data.)
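
A minimal sketch of the client side of such an upload, written in
Python with the standard urllib.request module rather than libwww. The
endpoint URL, the X-Report-Kind header, and the function names are all
hypothetical, chosen only for illustration; a real client would also
need the opt-in handling and whatever authentication the server
expects.

```python
import urllib.request

# Hypothetical endpoint; substitute the real upload server.
UPLOAD_URL = "https://uploads.example.invalid/reports"


def build_upload_request(payload: bytes, kind: str,
                         url: str = UPLOAD_URL) -> urllib.request.Request:
    """Build an HTTPS POST carrying the report bytes unmodified."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            "X-Report-Kind": kind,  # hypothetical header, e.g. "crash"
        },
    )


def upload(payload: bytes, kind: str, url: str = UPLOAD_URL,
           timeout: float = 30.0) -> int:
    """Send the report and return the HTTP status code."""
    request = build_upload_request(payload, kind, url)
    # urlopen verifies the server certificate by default for https URLs.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The user presses a button (or the crash handler fires), the client
calls upload(), and nobody has to choose between ASCII and BIN.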
As an added bonus to this automated approach (as I've found in many of
the cases where I've implemented these uploads), you'll get more and
better crash data if you go to fully automatic (opt-in) crash and
configuration uploads. Prior to these automatic uploads, I just wasn't
hearing about a number of the crashes that were around.
Then there's the fact that ftp itself is a festering pile of
insecurity, exposed credentials, and firewall incompatibilities.
...Or keep digging into why Caché is inserting the record
terminators, if they're not already present in the file. Because VMS
isn't doing anything here that it wasn't told to do. Package up a
reproducer, and pass it along to InterSystems for a look.
--
Pure Personal Opinion | HoffmanLabs LLC