[Info-vax] Indexed file read question

Hein RMS van den Heuvel heinvandenheuvel at gmail.com
Wed Nov 18 17:45:23 EST 2020


On Wednesday, November 18, 2020 at 10:23:15 AM UTC-5, abrsvc wrote:
> I was asked the following question and am not 100% sure of the answer, so... 
> 
> Situation: Indexed file with 2 keys, the first key no duplicates, second key no duplicates. For clarity, the first key is a record number and the second key is one of 4 values: A,B,L or blank. 

This was later adjusted by the OP - there are duplicates on the secondary key.

> 
> A read is posted using the primary key. The next read uses the secondary key say of value A. Does the second read select the matching record within the context of the primary key or just the first record encountered with a matching secondary key? 
> 
> I believe that the record context remains. Am I correct?

Nope. Incorrect. A keyed record lookup establishes a fresh context.
The 'first' record with that alternate key will be returned every time, no matter which record was found by primary key.
Any 'next' read - a sequential get - will return the next record along that last-used index. (The DCL demo below shows exactly this.)

If you want the 'next' duplicate value for an alternate index to be read in order of the primary (or another) key, then, as per Hoff, one could add a 'segment' to the key bytes, essentially de-duplicating it.
The writer clarified that this is not needed, as he just wants to confirm what is happening today.
If this change is ever desired, then please know that such a change can often, but not always, be made to the file without changing the programs accessing it.
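
For the record, a sketch of that segmented alternate key, in the same inline-FDL style as the demo below (an untested sketch; positions and lengths assumed to match that demo: 4 key bytes at position 0, the 1-byte code at position 5):

$ create tmp2.idx/fdl="file; org ind; key 0; seg0_l 4; key 1; dupl no; seg0_p 5; seg0_l 1; seg1_p 0; seg1_l 4"

Key 1 then becomes the code byte followed by the primary key bytes, unique by construction, and 'next' reads along index 1 walk the former duplicates in primary-key order.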

>> Dave Frobble trivia....

Except for Prolog-1 files, the primary key bytes are actually isolated into the first bytes of the record, but yeah, it travels with the data.

For alternate keys with duplicates there is indeed an array of pointers for each unique key value.
The pointers are added to the end, thus the record which was added first will be returned first, but a CONVERT can and will change that.
The pointers are called RRVs (Record Retrieval Vectors), which essentially equate to RFAs: VBN + record ID number, plus a flag byte.
It is a simple list, constrained by the bucket size.
When a bucket fills up with pointers (7 bytes each, so a good 1000 per 8 KB for 16-block buckets), a whole new bucket is added to the list, starting out with the (duplicated) key value and its own array.
I've worked with files having millions of duplicates and thus thousands of continuation buckets.
For such files, a new duplicate insert will require thousands of (cached) reads before a single write can happen... each time. Ouch.
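
If you want to see that structure for a real file, ANALYZE/RMS_FILE can report on it - the statistics report should show duplicate counts per key, and interactive mode lets you walk the index and data buckets by hand (exact report contents vary by version):

$ analyze/rms_file/statistics tmp.idx
$ analyze/rms_file/interactive tmp.idx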

>> Arne's Pascal RMS example.

Nicely done, Arne.
Myself, I prefer to use DCL to demonstrate just about any RMS problem.
For example here:

$ create tmp.idx/fdl="file; org ind; key 0; seg0_l 4; key 1; dupl yes; seg0_p 5; seg0_l 1"
$ convert/merge tt: tmp.idx
noot A
vuur B
 aap A
mies B
 Exit
$ type tmp.idx
 aap A
mies B
noot A
vuur B
$
$ open/read tmp tmp.idx
$ read/ind=0/key=" aap" tmp record
$ show symb record
  RECORD = " aap A"
$ read/ind=1/key="A" tmp record
$ show symb record
  RECORD = "noot A"
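
To see the fresh context in action, keep reading without /KEY - a plain READ does a sequential get along the last-used index (index 1 here), so it should return the remaining 'A' duplicate and then move on to the 'B' records. Expected output, assuming the duplicate (insertion) order above:

$ read tmp record    ! no /KEY: sequential get along index 1
$ show symb record
  RECORD = " aap A"
$ read tmp record
$ show symb record
  RECORD = "vuur B"
$ close tmp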

Enjoy!
Hein.
