[Info-vax] Looking for some text search ideas

David Froble davef at tsoft-inc.com
Fri Sep 26 21:34:37 EDT 2014


Craig A. Berry wrote:
> On 9/26/14, 12:27 PM, David Froble wrote:
> 
>> Our applications are not using a RDBMS.
> 
>> A request has come up to be able to find any data which contains some
>> specific text.  An example might be any product description that
>> contains the text "gasket".  Using keys won't help, because the key
>> might be "head gasket".
> 
> I assume from the talk of keys that these are RMS indexed files? Does
> either the target of a search or the unit to be returned when you find
> something ever span record boundaries, e.g.:
> 
> XYZ001This is a news-
> XYZ002worthy message.
> 
> If I search for "newsworthy" should I consider that group of records a
> match and return both records? Should I be able to match a word broken
> across record boundaries? Are all the searches "word" searches with
> clearly defined delimiters and known character sets? Or if I search for
> "sage" should I match "message"?

Nothing so complex.

> I see from a subsequent post that you are just doing INSTR on arrays of
> strings. If that works for you, that's fine. A good regular expression
> engine would run circles around INSTR in both functionality and
> performance. A full text search engine would too, and if the data are
> simple, you could build your own with only moderate trouble that indexed
> words (or characters if you wish) and saved either unique key values or
> RFAs to get from the search string back to the containing record(s).
> 
> 

Not RMS, but similar.  The product file is just records with 50-60 data 
fields.  Primary key is Mfg code + Part #.  Part description is not 
keyed.  No good reason to do so.  Briggs may call a part "Gasket, head" 
while Kohler may call a similar part "Head gasket".  Not worth trying 
for any type of keying.  Thus my conjecture that a brute force pass 
through all the descriptions is about all that can be done.
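
A brute-force pass of the kind described above might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the in-memory list stands in for the real product file, and the field names (mfg_code, part_no, description) and sample codes are hypothetical. The comparison is case-insensitive so that "Gasket, head" and "Head gasket" are both found.

```python
# Hypothetical stand-in for the product file. The real records carry
# 50-60 fields and are read from a file, not held in a list like this.
PRODUCTS = [
    {"mfg_code": "BRG", "part_no": "1001", "description": "Gasket, head"},
    {"mfg_code": "KOH", "part_no": "2002", "description": "Head gasket"},
    {"mfg_code": "KOH", "part_no": "2003", "description": "Spark plug"},
]


def find_by_text(records, text):
    """Return every record whose description contains `text`, ignoring case."""
    needle = text.casefold()
    # One linear pass over all records: no key or index is used.
    return [r for r in records if needle in r["description"].casefold()]


matches = find_by_text(PRODUCTS, "gasket")
for record in matches:
    print(record["mfg_code"], record["part_no"], record["description"])
```

The cost grows linearly with the number of records, which is usually acceptable for an occasional lookup on a file of modest size.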

But now you mention a "regular expression engine".  Never heard of such.
Guess I need to look up the term to see what it's about.  Maybe time
for this old dog to learn something new.
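
For readers in the same position, here is a minimal sketch of what a regular expression search adds over a plain substring test, using Python's standard `re` module. The sample descriptions and the helper name `search_descriptions` are made up for illustration. The pattern uses `\b` (a word boundary), so a search for "sage" does not match "message", which is the distinction raised in the quoted question above.

```python
import re

# Hypothetical sample descriptions, not taken from any real product file.
DESCRIPTIONS = [
    "Gasket, head",
    "Head gasket",
    "Head gaskets (set of 2)",
    "Basket, wire",
]


def search_descriptions(descriptions, pattern):
    """Return the descriptions that match `pattern`, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [d for d in descriptions if rx.search(d)]


# "gasket" or "gaskets" as a whole word; "Basket" is not a match.
whole_word = search_descriptions(DESCRIPTIONS, r"\bgaskets?\b")

# A word-boundary search for "sage" does not match inside "message".
sage_in_message = re.search(r"\bsage\b", "worthy message.") is not None

print(whole_word)
print(sage_in_message)
```

A pattern search is still a pass over every description; what changes is how precisely the match can be stated.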


