[Info-vax] String Manipulation

Wed Oct 13 11:03:47 EDT 2021

On 10/13/2021 10:32 AM, Stephen Hoffman wrote:
> On 2021-10-13 13:16:40 +0000, Bob Gezelter said:
>
>> On Wednesday, October 13, 2021 at 6:54:58 AM UTC-4, HCorte wrote:
>>> I have a string named MILLCON that contains "," as a delimiter, to be
>>> split into two substrings. I have two approaches:
>>>  1º using routine ELEMENT from STR$
>>>  STR$ELEMENT(MESSCON,0,",",MILLCON)
>>>  STR$ELEMENT(IP_ADDRESS,1,",",MILLCON)
>>>  2º using routine INDEX and LEN from LIB$
>>>  POS_AUX = LIB$INDEX(MILLCON,",")
>>>  MESSCON = MILLCON(1:POS_AUX-1)
>>>  IP_ADDRESS = MILLCON(POS_AUX+1:LIB$LEN(MILLCON)-(POS_AUX+1))
>>>  Is either approach good, or is one better, and if so why?
>>
>>
>> Caution is recommended. This approach is extremely brittle in the face
>> of text errors.
>>
>> Both coding sequences presume that the input string is valid, without
>> verifying that fact.
>>
>> More information concerning the source string would be helpful.
>
> As Bob states... Wider views are better. Then narrow the view as
> necessary. Starting out narrow can miss both potential problems and
> potential solutions.
>
> With solely what you've posted above, it really doesn't matter which
> sequence is chosen, other than for code clarity and maintenance.
>
> If you're doing enough of these that the performance of these operations
> really matters, profile one or more of the solutions and see which meet
> your requirements, and preferably profile the full code to see where the
> wall-clock is going.
>
> If you're doing enough of these, a parser is a better and more robust
> approach. Options here can include lib$table_parse or—given the
> commas—potentially language-specific wrappers around a libcsv port.
>
> If this possibly involves CSV, as it appears it might, note that CSV looks
> far simpler than it is, given the myriad edge cases it comprises.
>
> If there's even a whiff of a possibility of untrusted data arising
> around here, an app-isolated parser is a much more robust approach.
>
> Parsers are also recommended if there's a whiff of a possibility that
> the arriving data formats will be extended or modified. In-line string
> parsing in procedural languages is, as Bob mentions, decidedly brittle.
>
> Parsers are a shade harder to write up front (for those that haven't met
> lib$table_parse or other such), and vastly easier to maintain and update
> and secure. And are less code.

Not getting out much, I didn't understand either example.  Here's what I've
used for like, almost forever ...

100     !************************************************************
         !                      Parse a String
         !************************************************************
         !
         SUB PARSE( STG$ , DELIM$ , FRONT$ , BACK$ )
         !
         !       STG$            - String to parse
         !       DELIM$          - Delimiter string
         !       FRONT$          - Segment of string preceding delimiter
         !       BACK$           - Segment of string following delimiter
         !
         !************************************************************
         !
         OPTION SIZE=( INTEGER WORD , REAL DOUBLE )

         Z% = INSTR( 1% , STG$ , DELIM$ )                !  Search for delimiter
         Z% = LEN(STG$)+1% UNLESS Z%                     !  Not found, whole string
         FRONT$ = LEFT( STG$ , Z%-1% )                   !  Preceding segment
         BACK$ = RIGHT( STG$ , Z%+LEN(DELIM$) )          !  Following segment
                                                         !
         SUBEND

Implemented in Basic, of course ...
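
For anyone reading along without a BASIC compiler, the same split can be
sketched in Python. This is only an illustration and not part of the routine
above: the function name parse and the sample strings are made up for the
example. It relies on Python's built-in str.partition, which, like the SUB,
gives back the whole string as the front and an empty back when the delimiter
is not found. One difference to be aware of: str.partition raises ValueError
for an empty delimiter, and the sketch does not handle that case.

```python
from typing import Tuple


def parse(stg: str, delim: str) -> Tuple[str, str]:
    """Split stg at the first occurrence of delim.

    Returns (front, back). If delim is not found, front is the whole
    string and back is empty, mirroring the BASIC SUB above.
    """
    # partition() returns (before, separator, after); the separator
    # itself is not needed here.
    front, _sep, back = stg.partition(delim)
    return front, back


if __name__ == "__main__":
    # Hypothetical sample data; the original post shows no real input.
    print(parse("SOME MESSAGE,10.0.0.1", ","))
    print(parse("NO DELIMITER HERE", ","))
```

Applied to the two-field case from the original question, parse(millcon, ",")
would return the message part and the address part. Neither this sketch nor
the SUB checks that either part is well-formed, so the earlier caution about
validating the input still applies.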

-- 
David Froble                       Tel: 724-529-0450
Dave Froble Enterprises, Inc.      E-Mail: davef at tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA  15486