[Info-vax] Calling $CREPRC in COBOL
    Arne Vajhøj 
    arne at vajhoej.dk
       
    Tue Jun 28 08:43:50 EDT 2022
    
    
  
On 6/27/2022 10:46 PM, Craig A. Berry wrote:
> On 6/27/22 9:00 PM, Arne Vajhøj wrote:
>> There are two models for Unicode support.
>>
>> A) UTF-8 internal and UTF-8 external
>>
>> That one is not so difficult to implement.
>>
>> Most existing libraries work.
>>
>> For anything ASCII everything works exactly as before.
>>
>> One needs some string functions that operate on character index
>> instead of byte index.
>>
>> But not so difficult.
>>
>> The problem is that a lot of string functionality becomes expensive
>> because every use of a character index becomes an iteration.
>>
>> B) UTF-16 internal and UTF-8 external
>>
>> That one requires a lot of work.
>>
>> Library support.
>>
>> Application changes.
>>
>> But it is efficient.
>>
>> Which is why C/C++, Java and .NET all chose that path.
> 
> I think they made those choices when UCS-2 was current and everyone
> thought a wider fixed-width encoding would be enough.
Yes.
>                                                       UTF-16 needs all
> of the same varying-width handling that UTF-8
In theory yes.
In practice it is common for applications to support only the BMP
(Basic Multilingual Plane), where every character is one 16-bit unit.
>                                                does but uses twice as
> much memory for the most common characters.
Most don't care.
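
To make the trade-offs concrete, here is a rough sketch in Python. It is
not from the thread: the function names and the sample string are made up
for illustration, and Python's own str type is only used to produce the
encoded bytes. It shows that finding character N in UTF-8 means scanning
from the start, that a non-BMP character takes two UTF-16 units, and what
the two encodings cost in bytes.

```python
def utf8_char_at(data: bytes, index: int) -> str:
    """Return character number `index` from UTF-8 bytes (hypothetical helper).

    There is no way to jump straight to character N: we have to walk the
    bytes and count lead bytes, so the cost grows with the position.
    """
    if index < 0:
        raise IndexError(index)
    count = -1      # index of the character we are currently inside
    start = None    # byte offset where the wanted character begins
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:          # not a continuation byte: new character
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:     # next character begins: wanted one ended
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return data[start:].decode("utf-8")  # wanted character was the last one


def utf16_units(text: str) -> int:
    """Number of 16-bit code units the text needs in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def is_bmp_only(text: str) -> bool:
    """True if every character fits in one UTF-16 unit (no surrogate pairs)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# Sample: ASCII, a 2-byte char, a 3-byte char and one non-BMP char (U+1F600).
sample = "na\u00efve \u20ac5 \U0001F600"
utf8 = sample.encode("utf-8")

print(len(sample))              # code points
print(len(utf8))                # UTF-8 bytes
print(utf16_units(sample))      # UTF-16 units, one more than the code points
print(is_bmp_only(sample))      # False because of U+1F600
print(utf8_char_at(utf8, 6))    # found by scanning, not by direct indexing

# Memory for plain ASCII text: UTF-16 takes twice the bytes of UTF-8.
print(len("hello".encode("utf-8")), len("hello".encode("utf-16-le")))
```

An application that assumes one UTF-16 unit per character gets O(1)
indexing and is correct for as long as is_bmp_only() holds for its data.
The moment a character outside the BMP shows up, the unit count and the
character count differ and the same scanning problem comes back.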
Arne