[Info-vax] String Manipulation

Stephen Hoffman seaohveh at hoffmanlabs.invalid
Wed Oct 13 10:32:56 EDT 2021


On 2021-10-13 13:16:40 +0000, Bob Gezelter said:

> On Wednesday, October 13, 2021 at 6:54:58 AM UTC-4, HCorte wrote:
>> I have a string named MILLCON that contains "," as a delimiter. To 
>> split it into two substrings, I have two approaches:
>>  1º using routine ELEMENT from STR$
>>  STR$ELEMENT(MESSCON,0,",",MILLCON)
>>  STR$ELEMENT(IP_ADDRESS,1,",",MILLCON)
>>  2º using routine INDEX and LEN from LIB$
>>  POS_AUX = LIB$INDEX(MILLCON,",")
>>  MESSCON = MILLCON(1:POS_AUX-1)
>>  IP_ADDRESS = MILLCON(POS_AUX+1:LIB$LEN(MILLCON)-(POS_AUX+1))
>>  Is either approach good, or is one better, and if so, why?
> 
> 
> Caution is recommended. This approach is extremely brittle in the face 
> of text errors.
> 
> Both coding sequences presume that the input string is valid, without 
> verifying that fact.
> 
> More information concerning the source string would be helpful.

As Bob states... Wider views are better. Then narrow the view as 
necessary. Starting out narrow can miss both potential problems and 
potential solutions.

With solely what you've posted above, it really doesn't matter which 
sequence is chosen, other than for code clarity and maintenance.

If you're doing enough of these that the performance of these 
operations really matters, profile one or more of the solutions and see 
which meets your requirements, and preferably profile the full code to 
see where the wall-clock time is going.
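
As a rough illustration of that kind of comparison, here is a minimal 
sketch in Python rather than in the original language. The two 
functions are only loose analogues of the STR$ELEMENT and LIB$INDEX 
sequences quoted above; the function names and the sample record are 
hypothetical, and timings on one platform say little about the RTL 
calls on OpenVMS.

```python
import timeit


def split_by_element(record, delimiter=","):
    """Loose analogue of the STR$ELEMENT approach: select fields by index."""
    fields = record.split(delimiter)
    return fields[0], fields[1]


def split_by_index(record, delimiter=","):
    """Loose analogue of the LIB$INDEX approach: find the delimiter, then slice."""
    pos = record.index(delimiter)
    return record[:pos], record[pos + 1:]


if __name__ == "__main__":
    sample = "SOMEMESSAGE,192.0.2.10"  # hypothetical record, not from the post
    for fn in (split_by_element, split_by_index):
        seconds = timeit.timeit(lambda: fn(sample), number=100_000)
        print(f"{fn.__name__}: {seconds:.3f} s for 100,000 calls")
```

A micro-benchmark like this only ranks the two sequences against each 
other. Profiling the whole application is what shows whether either 
one matters at all.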

If you're doing enough of these, a parser is a better and more robust 
approach. Options here can include lib$table_parse or—given the 
commas—potentially language-specific wrappers around a libcsv port.

If this possibly involves CSV, as it might appear, be aware that CSV 
looks far simpler than it is, given the myriad edge cases it comprises.
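
One of those edge cases, sketched in Python with its standard csv 
module: a quoted field that itself contains the delimiter. The sample 
record is hypothetical, and whether the real data ever quotes fields 
this way is an assumption.

```python
import csv
import io

# Hypothetical record: the first field is quoted and contains a comma.
line = '"Link down, retrying",192.0.2.10'

# Splitting on every comma breaks the quoted field into two pieces.
naive = line.split(",")

# A CSV-aware reader honors the quotes and returns two fields.
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # three pieces, quote characters still attached
print(parsed)  # two fields, quotes removed
```

Embedded quotes, embedded newlines, and differing dialects are further 
cases that a plain delimiter search does not handle.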

If there's even a whiff of a possibility of untrusted data arising 
around here, an app-isolated parser is a much more robust approach.

Parsers are also recommended if there's a whiff of a possibility that 
the arriving data formats will be extended or modified. In-line string 
parsing in procedural languages is, as Bob mentions, decidedly brittle.
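
To make the brittleness point concrete, here is a minimal sketch in 
Python of a split that verifies its input rather than presuming it. 
This is not lib$table_parse and not a full parser. The names 
parse_record and RecordError are hypothetical, and the assumption that 
the second field is a numeric IP address is inferred only from the 
IP_ADDRESS variable name in the quoted code.

```python
import ipaddress


class RecordError(ValueError):
    """Raised when a record does not match the expected two-field layout."""


def parse_record(record):
    """Split 'message,ip-address' and validate both fields."""
    fields = record.split(",")
    if len(fields) != 2:
        raise RecordError(f"expected 2 fields, got {len(fields)}")

    message, address_text = (field.strip() for field in fields)
    if not message:
        raise RecordError("empty message field")

    try:
        # Assumption: the second field is an IPv4 or IPv6 address.
        address = ipaddress.ip_address(address_text)
    except ValueError as exc:
        raise RecordError(f"bad IP address {address_text!r}") from exc

    return message, address
```

A record with no comma, an extra comma, or a malformed address is 
rejected with an error here, where the in-line sequences would carry 
on with whatever substrings they happened to produce.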

Parsers are a shade harder to write up front (for those who haven't 
met lib$table_parse or other such), and vastly easier to maintain, 
update, and secure. They are also less code.





-- 
Pure Personal Opinion | HoffmanLabs LLC 



