[Info-vax] Looking for some text search ideas
Johnny Billquist
bqt at softjar.se
Sat Sep 27 06:44:22 EDT 2014
On 2014-09-27 03:17, David Froble wrote:
> johnwallace4 at yahoo.co.uk wrote:
>
>> Me: Find a way to split the workload across multiple concurrent
>> threads of execution, which then means it can be spread across
>> multiple processors, thereby reducing the time to completion. 4
>> processors => roughly a quarter of the single-processor elapsed time,
>> though the actual effort required will increase slightly... maybe.
>
> I'm not sure I have a clue on how to do such a thing.
>
> Regardless, a small fraction of a second to perform the task, before any
> optimization, is more than adequate.
>
> Frankly, I just don't see any way to optimize a search through data. The
> brute force of just checking all of it sequentially seems the only
> option. But what do I know? Could be, someone has come up with
> something smart that could do a quicker job. And so I throw the
> question out to some smart people.
>
> :-)
Hmm. People are showing a lack of CS education here. :-)
If you want to search for a substring within a large gob of memory, then
yes, someone has come up with something smarter: a brute-force check of
every character in sequence is definitely not the best we have.
This was figured out a long time ago.
Essentially, you can do much better. Not that it matters for really
small gobs of memory, but if you're searching through a few megabytes,
then your search can be sped up by orders of magnitude if you are more
clever.
I don't have time to write it all down right now, but the basic idea is
that you scan forward through the text while matching the pattern
backwards from its last character. On a mismatch, you can work out how
far it is safe to jump ahead, which most of the time is more than just
one character.
If people don't know how to do this, I can write it down in more detail
later.
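In the meantime, here is a rough sketch of the idea in C. It is
essentially the Boyer-Moore-Horspool variant of that approach; the
function name and interface are only illustrative, not from any
particular library.

#include <stddef.h>

/* Scan the text forward, but compare the pattern right-to-left at each
 * position.  On a mismatch, the skip table says how far we can safely
 * jump, which is usually more than one character. */
const char *find(const char *text, size_t tlen,
                 const char *pat, size_t plen)
{
    size_t skip[256];
    size_t i;

    if (plen == 0 || plen > tlen)
        return NULL;

    /* Characters that do not occur in the pattern allow a jump of the
     * full pattern length. */
    for (i = 0; i < 256; i++)
        skip[i] = plen;
    /* Characters in the pattern (except its last one) allow a shorter
     * jump, based on their rightmost occurrence. */
    for (i = 0; i < plen - 1; i++)
        skip[(unsigned char)pat[i]] = plen - 1 - i;

    i = 0;
    while (i <= tlen - plen) {
        size_t j = plen;                 /* match backwards from the end */
        while (j > 0 && text[i + j - 1] == pat[j - 1])
            j--;
        if (j == 0)
            return text + i;             /* full match found */
        i += skip[(unsigned char)text[i + plen - 1]];
    }
    return NULL;                         /* no match */
}

The skip table is what buys the speed: any text character that does not
occur in the pattern lets you jump a whole pattern length at once, so
over a few megabytes you end up examining only a fraction of the bytes.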
Johnny
--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt at softjar.se          ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol