[Info-vax] String Manipulation
Dave Froble
davef at tsoft-inc.com
Wed Oct 13 11:03:47 EDT 2021
On 10/13/2021 10:32 AM, Stephen Hoffman wrote:
> On 2021-10-13 13:16:40 +0000, Bob Gezelter said:
>
>> On Wednesday, October 13, 2021 at 6:54:58 AM UTC-4, HCorte wrote:
>>> I have a string named MILLCON that contains "," as a delimiter. To split it
>>> into two substrings, I have two approaches:
>>> 1º using routine ELEMENT from STR$
>>> STR$ELEMENT(MESSCON,0,",",MILLCON)
>>> STR$ELEMENT(IP_ADDRESS,1,",",MILLCON)
>>> 2º using routine INDEX and LEN from LIB$
>>> POS_AUX = LIB$INDEX(MILLCON,",")
>>> MESSCON = MILLCON(1:POS_AUX-1)
>>> IP_ADDRESS = MILLCON(POS_AUX+1:LIB$LEN(MILLCON)-(POS_AUX+1))
>>> Is either approach good, or is one better, and if so, why?
>>
>>
>> Caution is recommended. This approach is extremely brittle in the face
>> of text errors.
>>
>> Both coding sequences presume that the input string is valid, without
>> verifying that fact.
>>
>> More information concerning the source string would be helpful.
>
> As Bob states... Wider views are better. Then narrow the view as
> necessary. Starting out narrow can miss both potential problems and
> potential solutions.
>
> With solely what you've posted above, it really doesn't matter which
> sequence is chosen, other than for code clarity and maintenance.
>
> If you're doing enough of these that the performance of these operations
> really matters, profile one or more of the solutions and see which meet
> your requirements, and preferably profile the full code to see where the
> wall-clock is going.
>
> If you're doing enough of these, a parser is a better and more robust
> approach. Options here can include lib$table_parse or—given the
> commas—potentially language-specific wrappers around a libcsv port.
>
> If this involves CSV, as it appears it might, note that CSV looks
> deceptively simple given the myriad edge cases it comprises.
>
> If there's even a whiff of a possibility of untrusted data arising
> around here, an app-isolated parser is a much more robust approach.
>
> Parsers are also recommended if there's a whiff of a possibility that
> the arriving data formats will be extended or modified. In-line string
> parsing in procedural languages is, as Bob mentions, decidedly brittle.
>
> Parsers are a shade harder to write up front (for those that haven't met
> lib$table_parse or other such), and vastly easier to maintain and update
> and secure. And are less code.
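
To make the validation point above concrete, here is a minimal sketch in Python (not in either of the languages used in this thread). It assumes the record is exactly two comma-separated fields and that the second one is an IP address, which is only a guess from the variable name IP_ADDRESS. The function name split_millcon is hypothetical.

```python
import ipaddress


def split_millcon(millcon):
    """Split "message,ip" into (message, ip), rejecting malformed input.

    Assumptions (not confirmed by the original post): exactly two
    fields, a non-empty first field, and an IPv4/IPv6 address second.
    """
    fields = millcon.split(",")
    if len(fields) != 2:
        raise ValueError("expected exactly one ',' delimiter")
    messcon, ip_text = (f.strip() for f in fields)
    if not messcon:
        raise ValueError("empty message field")
    # ip_address() raises ValueError for anything that is not an address
    ip = ipaddress.ip_address(ip_text)
    return messcon, str(ip)


print(split_millcon("MSG01,192.0.2.10"))
```

Unlike either in-line sequence quoted above, this fails loudly on a missing delimiter, an extra delimiter, or a bad address instead of returning a garbage substring.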
Not getting out much, I didn't understand either example. Here's what I've
used for like, almost forever ...
100 !************************************************************
! Parse a String
!************************************************************
!
SUB PARSE( STG$ , DELIM$ , FRONT$ , BACK$ )
!
! STG$ - String to parse
! DELIM$ - Delimiter string
! FRONT$ - Segment of string preceding delimiter
! BACK$ - Segment of string following delimiter
!
!************************************************************
!
OPTION SIZE=( INTEGER WORD , REAL DOUBLE )
Z% = INSTR( 1% , STG$ , DELIM$ ) ! Search for delimiter
Z% = LEN(STG$)+1% UNLESS Z% ! Not found, whole string
FRONT$ = LEFT( STG$ , Z%-1% ) ! Preceding segment
BACK$ = RIGHT( STG$ , Z%+LEN(DELIM$) ) ! Following segment
!
SUBEND
Implemented in Basic, of course ...
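
For anyone who doesn't read Basic, the same logic as the PARSE routine above can be sketched in Python. This is only an illustration: Python indexes from 0 where INSTR counts from 1, and rejecting an empty delimiter is an added assumption, not something the Basic routine does.

```python
from typing import Tuple


def parse(stg: str, delim: str) -> Tuple[str, str]:
    """Split stg at the first occurrence of delim; return (front, back)."""
    if not delim:
        # Assumption: an empty delimiter is treated as a caller error.
        raise ValueError("delimiter must not be empty")
    z = stg.find(delim)      # 0-based position, -1 when not found
    if z < 0:
        z = len(stg)         # not found: front gets the whole string
    front = stg[:z]                  # segment preceding the delimiter
    back = stg[z + len(delim):]      # segment following the delimiter
    return front, back


print(parse("MSG01,192.0.2.10", ","))
print(parse("no delimiter here", ","))
```

Only the first delimiter is consumed, so calling it again on the back segment walks through a string with several fields, and multi-character delimiters work the same way.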
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: davef at tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486