[Info-vax] Databases versus RMS
Jan-Erik Soderholm
jan-erik.soderholm at telia.com
Thu Apr 19 12:29:55 EDT 2012
Paul Sture wrote 2012-04-19 16:42:
> On Wed, 18 Apr 2012 08:32:55 -0500, Bob Koehler wrote:
>
>> In article<4f8dbbf4$0$24585$c3e8da3$e408f015 at news.astraweb.com>, JF
>> Mezei<jfmezei.spamnot at vaxination.ca> writes:
>>>
>>> Are database engines so fragile that a power failure will truly wreak
>>> havok on a database requiring time consuming work and debugging ?
>>
>> From time to time, things that should work, like databases, or
>> filesystems, don't. Generally vendors will fix things real quick. I
>> once spec'd IBM AIX in part because I assumed that IBM would provide
>> a bug free filesystem. Boy was I suprized when I had to manually run
>> fsck after every power outage. I was not suprized when the next
>> minor version release fixed it.
>
> fsck on Linux will also run every n restarts. On my first Linux system a
> dozen years ago, n was something like 6. Since it was a dual boot system
> I could easily get there several times a week. It took so long with the
> hardware I was using that it was time to put the kettle on. And possibly
> make a sandwich too.
>
>> "database" could also be a collection of data, without an actual
>> DBMS.
>>
>> But it's easy to buy a DBMS that does work, and not uses it
>> correctly.
>
> I had a clear demonstration of that with some trial personal accounting
> software for Macs, which was so slow as to be unusable. I know I wasn't
> running it on the latest and greatest hardware, but that same machine was
> quite capable of running industrial strength commercial accounting
> packages.
>
> It turned out that this software was using PostgreSQL, and provided dumps
> in SQL format (plus marks for that). OTOH inspection of the SQL revealed
> countless indexes which only ever contained one of 2 or 3 values.
>
> Sorry, but I learn that indexes with only 2 values such as "M" and "F"
> were bad news back in RMS days. I'm pretty sure the RMS documentation
> included that as an example of things not to do.
I sure RMS doesnt handle that in a good way.
Rdb handles this quite nicely today using "sorted ranked" indexes.
SQL> select count(*) from <table>;
2957357
1 row selected
SQL> show index <table>_INDEX_4;
Indexes on table <table>:
<table>_INDEX_4 with column <f1>
and column <f2>
Duplicates are allowed
Type is Sorted Ranked
Duplicates are Compressed Bitmaps
Key suffix compression is DISABLED
Node size 430
SQL>
Both <f1> and <f2> are CHAR(1) fields.
SQL> set trans read only;
SQL> select <f1>, <f2>, count(*) from <table> group by <f1>, <f2>
<f1> <f2>
0 9928
1 5
0 0 561157
0 1 101
A 0 192 <<==
A 1 2329898
P 0 55871
P 1 132
S 0 73
9 rows selected
SQL>
Each 15 min there is a job looking for rows with the "A"/"0" value pair.
Runs in below 2 sec (666 MHz DS20e) with this index.
And the total size of the index is (since it uses "compressed bitmap")
small, less then *one byte* total per indexed row in the table.
Aprox 2.4 MB index space for 2.9 million indexed rows.
Yes, probably a bad design to start with, but that happend
back inthe 90's... :-)
Jan-Erik.
More information about the Info-vax
mailing list