[Info-vax] Looking for some text search ideas
Johnny Billquist
bqt at softjar.se
Sat Sep 27 15:47:35 EDT 2014
On 2014-09-27 13:15, Jan-Erik Soderholm wrote:
> Johnny Billquist wrote 2014-09-27 12:44:
>> On 2014-09-27 03:17, David Froble wrote:
>>> johnwallace4 at yahoo.co.uk wrote:
>>>
>>>> Me: Find a way to split the workload across multiple concurrent
>>>> threads of execution, which then means it can be spread across
>>>> multiple processors, thereby reducing the time to completion. 4
>>>> processors => roughly a quarter of the single-processor elapsed time,
>>>> though the actual effort required will increase slightly... maybe.
>>>
>>> I'm not sure I have a clue on how to do such a thing.
>>>
>>> Regardless, a small fraction of a second to perform the task, before any
>>> optimization, is more than adequate.
>>>
>>> Frankly, I just don't see any way to optimize a search through data. The
>>> brute force of just checking all of it sequentially seems the only
>>> option. But what do I know? Could be, someone has come up with
>>> something smart that could do a quicker job. And so I throw the
>>> question out to some smart people.
>>>
>>> :-)
>>
>> Hmm. People show a lack in CS education here. :-)
>> If you want to search for a substring within a large gob of memory, then
>> yes, a brute force checking all characters sequentially is definitely not
>> the best we've come up with.
>> This was figured out a long time ago.
>>
>> Essentially, you can do much better. Not that it matters for really small
>> gobs of memory, but if you're searching through a few megabytes, then
>> your
>> search can be sped up by orders of magnitude if you are more clever.
>>
>> I don't have time to write it down right now, but the basic idea is that
>> you search forward, but match backwards. And if you don't match, you can
>> figure out how much you can jump forward, which most of the time is more
>> than just one character.
>>
>> If people don't know how to do this, I can write it down in more
>> detail later.
>>
>> Johnny
>>
>
> Yes, but it would be normal to expect that "clever" search algoritms
> probably are already builtin in library text matching functions
> in whateer tool/language you are using.
I would hope so, but I wouldn't bet my life on it.
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt at softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
More information about the Info-vax
mailing list