[Info-vax] Still no DIR/SORT_BY_TIME
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Sun Aug 16 15:46:00 EDT 2015
On 2015-08-16 18:50:51 +0000, AEF said:
> On Saturday, August 15, 2015 at 11:13:32 AM UTC-4, Stephen Hoffman wrote:
>> Are files even the appropriate container for that?
>
> So you're saying files could be indexed by what would be their time stamp?
I'm wondering whether your use of files, the file system, and
user-selected labels as the organizational structure for this data is
appropriate, and not some other sort of data storage — database blobs,
or data that's referenced from a database. The file system is a
generic database, but not one that scales all that well, particularly
not on OpenVMS.
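A rough sketch of the sort of thing I mean, with hypothetical table and
column names, and SQLite and Python purely for illustration: the payloads
(or references out to bulk storage) plus the per-file routing metadata
live in a database, and a listing sorted by time becomes an ordinary
indexed query rather than a directory scan.

import sqlite3

conn = sqlite3.connect("transfers.db")   # hypothetical database name
conn.execute("""
    CREATE TABLE IF NOT EXISTS transfers (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        received_at TEXT NOT NULL,      -- ISO 8601 timestamp
        customer    TEXT,
        encrypted   INTEGER DEFAULT 0,
        destination TEXT,               -- where the scripts should send it
        archive_to  TEXT,               -- where to archive/retain it
        payload     BLOB                -- or a reference to external storage
    )
""")
conn.execute(
    "CREATE INDEX IF NOT EXISTS transfers_by_time ON transfers (received_at)")

# The DIR/SORT_BY_TIME equivalent: newest twenty entries, straight off the index.
for name, received_at in conn.execute(
        "SELECT name, received_at FROM transfers"
        " ORDER BY received_at DESC LIMIT 20"):
    print(received_at, name)
conn.close()

Whether the payload goes in the row or out in bulk storage depends on
the sizes involved, of course.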
>
>>> I'm migrating a file-transfer system. And there's a real crap-load of files!
>>
>> If it's a typical path, what you're porting was a simple design that was
>> well suited to its intended environment, and has since been scaled way
>> past sanity and efficacy, and it's undoubtedly become entwined
>> throughout.
>
> We are not porting anything. On the VMS side everything is done with
> numerous batch jobs, command procedures, and a flat database. The new
> system was built on the Unix side from scratch at least 2 years ago.
> New jobs are always set up on this new system. It's like buying a new
> car. You have something new that is functionally the same, but nothing
> has been ported.
>
> The VMS side is command procedure driven. The new system on the Unix
> side is table driven.
>
> There's a whole lot to both systems that you don't know. It's taken me
> weeks to learn. I can't put that all down here and I shouldn't. You
> don't know enough about the system to be so specific. Trust me. What we
> need to do is more complicated than you think.
I don't doubt it. But I am very well aware of how tightly
"enterprise" applications can and often do twist their own knickers —
how an installed base, legacy compatibility, and incremental
development interact — and how much large-scale application systems
can and do accrete code. In a manner of speaking, this same
pattern is how OpenVMS itself got where it is now, too. In short, a
simple design that's grown well past what was originally intended, and
where it's very difficult to make changes.
> The crapload of files is what is being sent about. Customers, including
> internal departments, send us files we need. We send them files they
> need. It is these that add up to a huge number. And different customers
> insist on different protocols. Some files are encrypted. Some need
> special processing for a variety of reasons. Each filename is in the
> database with a lot of metadata that tells the scripts what to do with
> the files and where to send them and where to archive them and where to
> retain them and what not.
I've done large-scale source control — technically not the same as your
directory server for files, but with some very strong parallels. For
something like what you're describing, I might well look for ideas from
git and other distributed systems, too. But — as referenced below —
there's probably not a whole lot of time to look at the overall design.
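To give the flavor of that, here's a small sketch with made-up paths: a
git-style object store keys each payload by a hash of its contents, so
identical files are stored once and integrity checks come for free,
while the customer-facing names and routing metadata stay elsewhere
(in the database, say).

import hashlib
import os

STORE = "objects"    # hypothetical object-store root, laid out git-style

def put(data: bytes) -> str:
    """Store a payload under its SHA-256 digest; return the digest as its key."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(STORE, digest[:2], digest[2:])
    if not os.path.exists(path):     # identical content is only ever stored once
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
    return digest

def get(digest: str) -> bytes:
    """Fetch a payload back by its content hash."""
    with open(os.path.join(STORE, digest[:2], digest[2:]), "rb") as f:
        return f.read()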
> But right now we have a hard deadline. We can redesign or improve it
> otherhow later. There may well be a better way to do this. But we
> don't have time for that now.
Accumulated technical leverage (technical debt) is how you got into
this situation, but you know that. Sooner or later that doesn't work
out, but it's common for the management to have been changed by then.
That overhaul usually arrives when the design runs out of some critical
resource and you can't get bigger {servers, network pipes, whatever},
and there's no good work-around.
--
Pure Personal Opinion | HoffmanLabs LLC