[Info-vax] improve performance of /EXCLUDE

Stephen Hoffman seaohveh at hoffmanlabs.invalid
Tue Jan 27 17:06:47 EST 2015


On 2015-01-27 21:53:51 +0000, mcleanjoh at gmail.com said:

> Isn't this mainly a factor of the respective file systems?  Linux/Unix 
> is just a stream of bytes on disk whereas RMS is designed to provide a 
> variety of commercial useful file structures (e.g. Indexed).

If you want to do a file-level search using traditional Unix or VMS 
tools, then yes, you're going to be bound by the throughput of the 
underlying file system and the associated storage.  The fast searches 
work differently: the index is calculated and cached ahead of time, 
which makes the queries vastly faster.  This means you might use a 
different tool (mdfind rather than find, for instance), but dinking 
around with a traditional file-based search is not something most 
folks are interested in once they've used a cached, faster search 
tool.
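
As a rough illustration of the difference, here is a minimal Python 
sketch of a precomputed word index next to a scan-everything search.  
This is not how mdfind or any particular tool is implemented; the 
function names and the sample data are made up for the example, and a 
real indexer would also persist the index and update it as files 
change.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index

def lookup(index, word):
    """Answer a query from the prebuilt index; no document is reread."""
    return sorted(index.get(word.lower(), set()))

def scan(docs, word):
    """Traditional approach: reread every document for every query."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return sorted(name for name, text in docs.items() if pattern.search(text))

# Hypothetical sample data standing in for file contents.
DOCS = {
    "login.com": "$ SET DEFAULT SYS$LOGIN",
    "notes.txt": "Backup with /EXCLUDE took hours",
    "todo.txt": "Ask about backup throughput",
}
INDEX = build_index(DOCS)  # paid once, ahead of time

if __name__ == "__main__":
    print(lookup(INDEX, "backup"))
    print(scan(DOCS, "backup"))
```

Both calls find the same documents.  The scan cost grows with the 
total amount of data on every query, while the lookup is a single 
dictionary access once the index has been built.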

> It's expecting a lot to believe that searching compressed data in an 
> indexed file will be as fast as a simple pattern match in a stream of 
> bytes.  Also a Linux/Unix search would have fewer overheads when 
> determining the start and end of the record than VMS which has to use 
> the specific record type and formatting information to try to figure 
> out where the record starts and ends.

Again, I'd encourage having a look at how a more modern search tool 
works.  The ht://Dig search tool was ported and available on VMS for a 
while, and was decently speedy.  As is typical with these search tools, 
there's a metadata importer for each of the various formats, as this 
greatly eases the effort of adding new file formats into the search 
index.  Got some weird new format that VSI has never heard of, or want 
tailored searching of a format that would otherwise be handled by a 
generic plug-in?  Create a plug-in for that particular format.
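
The plug-in arrangement described above can be sketched in Python as 
a table of per-format importers with a generic fallback.  This is an 
assumption-laden illustration and not the ht://Dig or Spotlight API; 
the registry, the decorator, and the importer names are all 
hypothetical.

```python
import os

# Hypothetical registry: file extension -> function returning indexable text.
IMPORTERS = {}

def importer(*extensions):
    """Register the decorated function as the importer for these extensions."""
    def register(func):
        for ext in extensions:
            IMPORTERS[ext.lower()] = func
        return func
    return register

def generic_importer(data):
    """Fallback for unknown formats: treat as text, drop undecodable bytes."""
    return data.decode("utf-8", errors="ignore")

@importer(".csv")
def csv_importer(data):
    """Tailored importer: index the cell values, not the delimiters."""
    return data.decode("utf-8").replace(",", " ")

def extract_text(filename, data):
    """Pick the importer by extension and return the text to be indexed."""
    ext = os.path.splitext(filename)[1].lower()
    return IMPORTERS.get(ext, generic_importer)(data)

if __name__ == "__main__":
    print(extract_text("sales.csv", b"widgets,42"))
    print(extract_text("blob.bin", b"hi\xff"))
```

Supporting a new format then means writing one more decorated 
function; the indexer itself does not change.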


-- 
Pure Personal Opinion | HoffmanLabs LLC



