[Info-vax] Looking for some text search ideas
Hein RMS van den Heuvel
heinvandenheuvel at gmail.com
Sat Sep 27 11:42:59 EDT 2014
On Friday, September 26, 2014 6:49:22 PM UTC-4, David Froble wrote:
> Hein RMS van den Heuvel wrote:
> > On Friday, September 26, 2014 1:27:04 PM UTC-4, David Froble wrote:
>
> >>> Does anyone know of a more effective method
> > than a sequential pass through the data of searching a list of data
> > looking for text matches?
>
> > Yes. Use 2 (or 10) passes each processing 1/2 (or 1/10) of the data.
>
> I don't see how this would help ??
Well, I managed to leave out the word parallel.
Multiple threads in a single process,
or multiple concurrent requests to multiple available parallel servers.
Admittedly, I was thinking more about very large files, where the read time would be the defining factor, but the intro suggested the data could just be sucked into memory.
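To make the "N passes, each over 1/N of the data" idea concrete, here is a rough sketch in Python. It assumes the data is already in memory as a list of text lines; the names search_chunk and parallel_search are made up for illustration. Note that in CPython, threads will not actually speed up a CPU-bound substring scan, so a process pool or separate servers would be needed for a real gain. The sketch only shows the partitioning.

```python
from concurrent.futures import ThreadPoolExecutor

def search_chunk(chunk, needle, offset):
    """Return (line_number, line) pairs for lines in one chunk that contain needle."""
    return [(offset + i, line)
            for i, line in enumerate(chunk, start=1)
            if needle in line]

def parallel_search(lines, needle, workers=2):
    """Split lines into roughly `workers` equal chunks and search them concurrently."""
    if not lines:
        return []
    size = -(-len(lines) // workers)  # ceiling division: lines per chunk
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(search_chunk, lines[start:start + size], needle, start)
            for start in range(0, len(lines), size)
        ]
        hits = []
        for future in futures:  # collect in chunk order, so hits stay sorted
            hits.extend(future.result())
    return hits
```

Each worker gets its chunk plus the offset of that chunk, so the reported line numbers refer to the full data set rather than to the chunk.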
> No file access. Data will be loaded into memory once, and then searched upon request.
May we assume you have protection against changes factored in?
You indicate a request to a process, right? That's fine in general, and perfect for a cluster (or set of servers). It also gives the best options to protect against changes, notably serializing while a change/reload is done.
On a single node, you might as well read the data into shared memory and use a subroutine in the user's process to do the search. Of course you'd have a common API anyway, so whether it becomes a network/service request or an inline call stays transparent to the calling program.
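As a rough illustration of serializing searches against a reload behind one common API, here is a minimal Python sketch. The class name SearchService and its methods are invented for the example, and the single lock is a simplifying assumption: it also serializes searches against each other, where a real server would more likely use a reader/writer scheme.

```python
import threading

class SearchService:
    """Hypothetical in-memory search API; callers never touch the data directly."""

    def __init__(self, lines):
        self._lock = threading.Lock()
        self._lines = list(lines)

    def reload(self, lines):
        """Swap in a fresh copy of the data while no search is running."""
        fresh = list(lines)  # build the new copy outside the lock
        with self._lock:
            self._lines = fresh

    def search(self, needle):
        """Return all lines containing needle, from a consistent data set."""
        with self._lock:
            return [line for line in self._lines if needle in line]
```

Because the caller only sees search(), the same interface could later be backed by a network request instead of an inline call without changing the calling program.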
> That was just an example that I seemed to remember from the last time i used SQL.
That's how I read it. But I asked just in case actual SQL access was desirable. Thanks for confirming.
Hein> > For recent OpenVMS versions you can use SEARCH/KEY=(POS=n,SIZ=n) for
John Reagan... yeah, once you have played with regex, even the enhanced SEARCH options become rather restrictive.
I find myself bypassing SEARCH more and more in favor of AWK or PERL.
If it were not for the /WINDOW and /LOG options, I would never use it again.
Instead of SEARCH I now mostly use:
$ perl -ne "print qq($.\t$_) if /xxxxx/" file
Or, when on Windows (most of the time :-( ), I use the 'search' script which comes with the ActiveState Perl install.
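For anyone without Perl at hand, the one-liner above prints each matching line prefixed by its line number and a tab. A rough Python equivalent over lines already in memory might look like this (grep_numbered is a made-up name, and Python's re syntax is not identical to Perl's for every pattern):

```python
import re

def grep_numbered(pattern, lines):
    """Return 'line_number<TAB>line' for every line matching the regex pattern."""
    regex = re.compile(pattern)
    return [f"{number}\t{line}"
            for number, line in enumerate(lines, start=1)  # 1-based, like Perl's $.
            if regex.search(line)]
```

For example, grep_numbered("x+", ["abc", "axxb", "x"]) returns the second and third lines with their numbers.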
Cheers,
Hein.