[Info-vax] OpenVMS - DCL - Data entry filtering

Mon Mar 30 14:02:31 EDT 2015

On 2015-03-30 17:06:57 +0000, JF Mezei said:

> Here is a question which has always bothered me. Database systems have 
> very opaque storage and indexing. (to me at least).

<https://github.com/rxin/db-readings>

Databases can be opaque, yes.  RMS can be opaque.  Databases can have 
metadata stores.  RMS — unless you wrangle something akin to CDD into 
the discussion, and which is not required — does not have a metadata 
store.

If you want to learn a new database, I'd start with PostgreSQL.  
<http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL>   Or have a look at 
Oracle Rdb, if you're on OpenVMS and are in a position to use that 
package.

Go look and go learn.   Find a local SQL database and use one of the 
GUI access tools.  With Rdb, scrounge up an old copy of InstantSQL.  
With MySQL on OS X, something like Sequel Pro.   With PostgreSQL, maybe 
PSequel or such.   SQLite on OS X has DB Browser for SQLite, and 
probably others.  Go explore.  Key among these is data access, and how 
you can abstract the fields.

> Heck, I don't even know which actual files store the data, index and what not.

There are many books on this topic.

> But if one were to compare RMS with the low level database storage in 
> file(s), would the indexing and buckets etc of RMS end up competitive 
> with how databases organise the data and do record searches/lookups ?

A NoSQL database — such as RMS — works fine, as far as it goes.  Use 
it.   But with RMS, this is the difference between dealing with 
character fields on emulated punched cards — what RMS provides you — 
and using a rather more modern data storage mechanism that deals with 
some of the dreck for you, and that can — can — make upgrades easier 
than what happens with, for instance, the data record fields within 
SYSUAF records, which makes format changes... more difficult.

For what I work with, one of the biggest benefits of using SQL or 
similar over RMS is that you don't have to futz around with the 
metadata and the fields within records within the storage — the 
database largely deals with that for you.  If all you've ever done is 
fields within records, this probably doesn't seem like much.  But just 
the ability to access the data and query the data with SQL — without 
having to write custom tools and/or to create and manage intermediate 
data files — is very handy.

> MySQL seems to have a default file organisation that appears sequential 
> to me. ( so I use a different one which has indexed records where you 
> specify key fields, mostly because this is what I am familiar with)

There are differences among the various databases.  Trade-offs, too.

MySQL has different back-end storage engines.  
<https://en.wikipedia.org/wiki/Comparison_of_MySQL_database_engines>

SQLite stores everything in one file — data and metadata and indexes 
and the rest — and the database file itself is easily transportable 
across different operating systems and different platform endian-ness 
is correctly managed, and the metadata for the database itself is 
accessible via SQL.

> Seems to me tha a sequeintial file organsiation can't be all that 
> efficient when doing database lookups by a name, or record number etc.  
> It is fine when most of your lookups are wildcard searches. But not so 
> much when doing simple loomup by a key such as employee number.

So go use fixed-size fields within fixed-sized punched card records?    
Which optimizations are appropriate for a particular application can 
and will vary, both initially and quite possibly over the life of the 
data.   Conversely, getting rid of the field-level-management dreck 
makes application maintenance and application updates somewhat easier.  
Sometimes the application is optimized for raw performance, so the data 
gets loaded into cache.  Sometimes, there's a performance middle ground 
here and SQLite in-memory database might be used.  There are always 
trade-offs.

-- 
Pure Personal Opinion | HoffmanLabs LLC