[Info-vax] String Manipulation
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Wed Oct 13 10:32:56 EDT 2021
On 2021-10-13 13:16:40 +0000, Bob Gezelter said:
> On Wednesday, October 13, 2021 at 6:54:58 AM UTC-4, HCorte wrote:
>> I have a string named MILLCON that contains "," as a delimiter, to be
>> split into two substrings. I have two approaches:
>> 1º using routine ELEMENT from STR$
>> STR$ELEMENT(MESSCON,0,",",MILLCON)
>> STR$ELEMENT(IP_ADDRESS,1,",",MILLCON)
>> 2º using routine INDEX and LEN from LIB$
>> POS_AUX = LIB$INDEX(MILLCON,",")
>> MESSCON = MILLCON(1:POS_AUX-1)
>> IP_ADDRESS = MILLCON(POS_AUX+1:LIB$LEN(MILLCON)-(POS_AUX+1))
>> Is either approach good, or is one better, and if so why?
>
>
> Caution is recommended. This approach is extremely brittle in the face
> of text errors.
>
> Both coding sequences presume that the input string is valid, without
> verifying that fact.
>
> More information concerning the source string would be helpful.
As Bob states... Wider views are better. Then narrow the view as
necessary. Starting out narrow can miss both potential problems and
potential solutions.
With solely what you've posted above, it really doesn't matter which
sequence is chosen, other than for code clarity and maintenance.
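For illustration only, here is a rough Python analogue of the two quoted sequences. It is Python rather than the original Fortran so that it can be run anywhere; the function names and sample strings are made up for this sketch, and the status-return and blank-padding conventions of STR$ELEMENT and LIB$INDEX are not modelled. As an aside, a Fortran substring is written (first:last) rather than (first:length), so the upper bound in the second quoted sequence probably wants to be the string length itself.

```python
def split_by_element(text, delimiter=","):
    """Analogue of the element-picking sequence: take elements 0 and 1."""
    parts = text.split(delimiter)
    first = parts[0]
    # A missing second element becomes an empty string in this sketch.
    second = parts[1] if len(parts) > 1 else ""
    return first, second


def split_by_index(text, delimiter=","):
    """Analogue of the index-and-slice sequence: find the delimiter, then cut."""
    pos = text.find(delimiter)  # str.find returns -1 when the delimiter is absent
    if pos < 0:
        return text, ""
    # Everything after the first delimiter, including any later delimiters.
    return text[:pos], text[pos + 1:]
```

On well-formed input such as "MSG01,10.0.0.1" the two return the same pair. They quietly diverge once the input has a second comma: the first drops the trailing text, the second keeps it in the second field. Neither complains, which is the brittleness Bob is describing.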
If you're doing enough of these that the performance of these
operations really matters, profile one or more of the solutions and see
which meets your requirements, and preferably profile the full code to
see where the wall-clock time is going.
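A minimal sketch of timing two candidates against each other, again in Python as a stand-in for whatever language the real code uses. The sample record, the function names, and the iteration count are all assumptions; on OpenVMS the equivalent measurement would use whatever timing or profiling tools are available there.

```python
import timeit

SAMPLE = "MSG01,10.0.0.1"  # made-up sample record


def by_split(text):
    # Split once on the first comma.
    return text.split(",", 1)


def by_find(text):
    # Locate the first comma, then slice on either side of it.
    pos = text.find(",")
    return [text[:pos], text[pos + 1:]]


def time_candidates(number=100_000):
    """Return elapsed seconds per candidate for `number` calls each."""
    return {
        func.__name__: timeit.timeit(lambda: func(SAMPLE), number=number)
        for func in (by_split, by_find)
    }
```

Calling time_candidates() gives a dictionary of elapsed times to compare. Micro-timings like this only say which candidate is faster in isolation; running the whole program under a profiler (for Python, `python -m cProfile`) is what shows whether string splitting is where the time goes at all.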
If you're doing enough of these, a parser is a better and more robust
approach. Options here can include lib$table_parse or—given the
commas—potentially language-specific wrappers around a libcsv port.
If this possibly involves CSV, as it might appear, be aware that CSV
looks deceptively simple given the myriad edge cases it comprises.
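Two of the more common edge cases, sketched with Python's standard csv module: a quoted field that contains the delimiter, and a doubled quote inside a quoted field. The sample lines are invented, and csv_fields assumes exactly one non-empty record per call.

```python
import csv
import io


def naive_fields(line):
    # What in-line splitting on "," effectively does.
    return line.split(",")


def csv_fields(line):
    # Let a CSV parser handle quoting; assumes one non-empty record.
    return next(csv.reader(io.StringIO(line)))
```

For the line `"Smith, John",10.0.0.1` the naive split yields three fields with stray quote characters, while the CSV reader yields the two intended fields. Embedded newlines, alternate delimiters, and character-set questions are further cases that this sketch does not touch.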
If there's even a whiff of a possibility of untrusted data arising
around here, an app-isolated parser is a much more robust approach.
Parsers are also recommended if there's a whiff of a possibility that
the arriving data formats will be extended or modified. In-line string
parsing in procedural languages is, as Bob mentions, decidedly brittle.
Parsers are a shade harder to write up front (for those that haven't
met lib$table_parse or other such), and vastly easier to maintain,
update, and secure. They are also less code.
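To give a flavour of the table-driven style, here is a small state-machine sketch in Python. It is not the lib$table_parse API and is not modelled on it. It also assumes, purely for illustration, that the first field is an alphanumeric name and the second is a dotted IPv4 address; the original post does not say what the fields contain.

```python
import ipaddress


def is_name_char(ch):
    return ch.isalnum() or ch == "_"


def is_addr_char(ch):
    return ch.isdigit() or ch == "."


def is_comma(ch):
    return ch == ","


# Transition table: state -> list of (test, target field, next state).
# The first transition whose test accepts the character is taken.
TABLE = {
    "NAME": [(is_name_char, "name", "NAME"), (is_comma, None, "ADDR")],
    "ADDR": [(is_addr_char, "addr", "ADDR")],
}


def parse_record(text):
    """Return (name, addr), or raise ValueError on anything unexpected."""
    fields = {"name": "", "addr": ""}
    state = "NAME"
    for offset, ch in enumerate(text):
        for test, target, next_state in TABLE[state]:
            if test(ch):
                if target is not None:
                    fields[target] += ch
                state = next_state
                break
        else:
            # No transition accepted this character in the current state.
            raise ValueError(f"unexpected {ch!r} at offset {offset}")
    if state != "ADDR" or not fields["name"]:
        raise ValueError("incomplete record")
    # Rejects malformed addresses by raising a ValueError subclass.
    ipaddress.IPv4Address(fields["addr"])
    return fields["name"], fields["addr"]
```

Extending the format then means adding rows to the table rather than threading more index arithmetic through the code, and anything the table does not explicitly allow is rejected rather than silently accepted.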
--
Pure Personal Opinion | HoffmanLabs LLC