[Info-vax] Streaming a File on OpenVMS with Caché

Stephen Hoffman seaohveh at hoffmanlabs.invalid
Thu Jan 15 00:21:01 EST 2015


On 2015-01-15 02:50:25 +0000, Mack Altman III said:

> The %Stream class within Cache opens the file into memory, which can be 
> read and evaluated to later be written to disk.

So you're seeking to write your own record parser.  Have fun with that.
Have a look at the file command on Unix, and at file magic, for what
you're headed toward here.  Not an easy problem, on any platform.  You're
writing a generic parser, after all.   C applications using mmap() to 
memory-map the file and then process its contents run into some very 
similar difficulties.

> The issue we experience is even with the line
> terminator set to CR,LF the line is broken when the variable length is 
> reached. VMS is enforcing a line termination; although, there isn't one 
> within the file.

VMS isn't enforcing anything unusual here.    VMS has done the same 
thing here for the last 35 years, give or take.    In short, VMS is 
doing what it was told to do.

But as for this particular case — apparently with random file formats 
containing some sort of configuration data uploaded from random folks — 
there's comparatively little hope for success.  On any OS platform.  
What would this design do, if handed a PDF file or any of the various 
Microsoft Word formats, for instance?

> I consider myself a pretty savvy "Googler", which is why after finding 
> nothing about enforcing a file type or modifying it without err I 
> reached out to the group.

Might want to look around at how some applications are coded to upload 
configuration or crash data to a server, rather than expecting a wide 
variety of experienced and inexperienced users to all use the same 
sequence for transferring files.

It's exceedingly unlikely that anyone can get even advanced software 
engineers with years of experience with {insert operating system} to 
all upload the data entirely consistently, after all.  Throw this 
problem at a mix of experienced and inexperienced users, and you'll be 
looking right at the mess you're already looking at.

When I find myself in a hole, the first goal is to stop digging.

Computers are much better at this sort of drudge work after all, and 
can format, secure, and upload the data appropriately.  Using ftp is 
often a problem on modern networks, as it's about the worst design 
possible for traversing client-side and server-side firewalls, so I'd 
probably look to run the client's configuration upload over HTTPS on 
TCP port 443 here, as that port is generally open from most networks, 
and the traffic is also encrypted.  Have the Caché code listening on 
that port, and have that code accept the transfer from the client 
system directly.  No files need be involved, and the only record 
structures are those you've created between the client logger and the 
server.
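
As a rough sketch of what the client side of that might look like, here 
is a small Python example that packages configuration data as JSON and 
builds an HTTPS POST for it.  Python is used purely for illustration; 
the URL, the "site" and "settings" field names, and both function names 
are made up for the example, and nothing here is specific to Caché.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute whatever the server actually exposes.
UPLOAD_URL = "https://uploads.example.com/config"


def build_upload_request(site_id: str, settings: dict) -> urllib.request.Request:
    """Package configuration data as a JSON POST over HTTPS (TCP 443)."""
    payload = json.dumps(
        {"site": site_id, "settings": settings}, sort_keys=True
    ).encode("utf-8")
    return urllib.request.Request(
        UPLOAD_URL,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


def upload(request: urllib.request.Request, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The client decides the format, so the server only ever sees the one 
structure it was written to expect.  Calling 
upload(build_upload_request("site-42", {"terminator": "CRLF"})) would 
perform the transfer, given a real server at that URL.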

With OS X application crashes, a dialog is displayed, and the crash 
data is shown in the dialog and is optionally uploaded to Apple.  
Other mechanisms are possible, including fully-automatic crash 
uploads, usually via some sort of an opt-in at installation or 
configuration.


> The main problem I experience with both OpenVMS and Cache are they are 
> very small in regards to the size of the community of developers out 
> there.

So package up a reproducer, and report it to InterSystems.   VMS isn't 
doing anything it hasn't done for eons, which implies there's either a 
documented or undocumented limit in Caché, or there's a bug in Caché, 
or there's a bug in the local code.   Given how heavily used these 
particular code-paths are within VMS and within a very large number of 
third-party applications, bugs in these VMS code-paths are unlikely.  
Possible, but exceedingly unlikely.

Or try a different application design, and preferably one that does not 
require parsing an arbitrary and user-generated file format.   
Preferably something that avoids the need to accept a gazillion 
different file formats, transferred by ftp or sftp.

Those mmap-based applications in C hit these same sorts of bugs, of 
course.  Trying to roll your own file parser?  It can be really, really 
fast for the simple byte-blob that's common with a simple Unix 
sequential file, but far more entertaining when presented with 
arbitrary file formats, as is also happening here.  Fun times.
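
To make the byte-blob point concrete, here is a minimal sketch using 
Python's mmap module as a stand-in for mmap() in C.  The function name 
and the choice of CR,LF as the terminator are assumptions for the 
example.  The mapping knows nothing about records; the code just hunts 
for terminators, and anything without one comes back as a single 
record of whatever length.

```python
import mmap
import os


def iter_records(path: str, terminator: bytes = b"\r\n"):
    """Yield byte records from a memory-mapped file, split on `terminator`.

    The map is only a byte-blob: no record format is known or enforced,
    so data without a terminator is returned as one long record.
    """
    if os.path.getsize(path) == 0:
        return  # an empty file cannot be memory-mapped
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            start = 0
            while start < len(view):
                end = view.find(terminator, start)
                if end == -1:  # no further terminator: the rest is one record
                    yield view[start:]
                    return
                yield view[start:end]
                start = end + len(terminator)
```

Any limit on record length, any other terminator, and any file that 
isn't text at all are left for the caller to deal with, which is where 
the entertainment starts.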

> As far as we go with the FTP process is telling them to use ASCII/BIN. 
> Again, they're non-IT people so beyond that and they are start to get 
> confused.

So why persist with a design that involves the command line, FTP or 
sftp file transfers, contending with whatever got uploaded, and the 
rest?

Wouldn't the users' difficulty and confusion here imply that the design 
might have misjudged the users' abilities and needs?   They're paying 
to have these problems solved for them, after all, and they're not 
looking to make more work and more complexity for themselves.

If a problem is intractable — as is the case here — then determine if 
the problem can be removed.  This problem can be removed, too.

Provide an automatic upload mechanism, either during a crash, or in 
response to one of the users here pressing a button or entering a 
command; automate the bug logging.  This mechanism might be built into 
the application that you're supporting (still not sure what you're up 
to here), or it might be an application or client that the end user has 
to install and invoke.   In terms that are close to what you're asking 
for here, this is the ftp client and the file enforcement and the file 
transfer and the data formatting, all rolled together.   This gets ftp 
and this whole file format mess completely out of the picture, lets 
you implement upload security (as there can be sensitive data in 
crashes), and (for instance) gives you a usually-accessible upload 
path via HTTPS on port 443 into a dedicated server, whether using REST 
or some other scheme.   (I'd probably look to use libwww and POST the 
configuration or the crash data.)
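
For the receiving end, a minimal sketch of a handler that accepts such 
a POST and rejects anything that isn't in the one expected format.  
This uses Python's standard http.server as a stand-in for whatever 
Caché would actually provide; the size cap, the required "site" field, 
and the names are assumptions for the example, and TLS is left out 
(it would be added with the ssl module or terminated in front of this).

```python
import json
from http.server import BaseHTTPRequestHandler

MAX_BODY = 1_000_000  # hypothetical cap on upload size, in bytes


def parse_upload(content_type: str, body: bytes) -> dict:
    """Validate one upload; raise ValueError on anything unexpected."""
    if content_type.split(";")[0].strip().lower() != "application/json":
        raise ValueError("unsupported content type")
    if len(body) > MAX_BODY:
        raise ValueError("upload too large")
    report = json.loads(body.decode("utf-8"))  # bad input raises ValueError
    if not isinstance(report, dict) or "site" not in report:
        raise ValueError("missing 'site' field")
    return report


class UploadHandler(BaseHTTPRequestHandler):
    """Accept a POSTed report; no files and no file formats involved."""

    def do_POST(self):
        try:
            length = int(self.headers.get("Content-Length", "0"))
            # Read at most one byte past the cap so oversize is detectable.
            body = self.rfile.read(max(0, min(length, MAX_BODY + 1)))
            report = parse_upload(self.headers.get("Content-Type", ""), body)
        except ValueError as err:
            self.send_error(400, "Rejected upload", str(err))
            return
        # A real server would store or queue `report` here.
        self.send_response(204)
        self.end_headers()
```

The server accepts exactly one structure and says no to everything 
else, rather than trying to guess what somebody uploaded.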

As an added bonus to this automated approach, as I've found in many of 
the cases where I've implemented these uploads, you'll get more and 
better crash data if you go to fully automatic (opt-in) crash and 
configuration uploads.  Prior to these automatic uploads, I just wasn't 
hearing about a number of the crashes that were around.

Then there's that ftp itself is a festering pile of insecurity, exposed 
credentials, and firewall incompatibilities.

...Or keep digging around with why Caché is inserting the record 
terminators, if they're not already present in the file.  Because VMS 
isn't doing anything here that it wasn't told to do.  Package up a 
reproducer, and pass it along to InterSystems for a look.




-- 
Pure Personal Opinion | HoffmanLabs LLC



