[Info-vax] VMS Basic strings class D vs class S
    Stephen Hoffman 
    seaohveh at hoffmanlabs.invalid
       
    Wed Mar  6 18:44:29 EST 2024
    
    
  
On 2024-02-28 14:38:03 +0000, Arne Vajhøj said:
> On 2/26/2024 4:17 PM, Stephen Hoffman wrote:
> 
>> I also wouldn't expect the RTLs to work with encodings other than ASCII 
>> and DEC MCS, either. And UTF-8 will fail in the expected places, and 
>> most searching and sorting tends not to be sensitive to the (written) 
>> language used within the text string.
> 
> I would assume that it works as long as the string is considered a 
> sequence of bytes not a sequence of characters.
The assumption that one byte is one character is embedded deeply in 
OpenVMS system and app code and APIs.
I would assume that such code will break in various ways when presented 
with UTF-8.
Anything assuming a correspondence between string length and displayed 
width is going to fail, for instance.
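
As a rough illustration of that point (a sketch, not anything taken from the OpenVMS RTLs), the Python below measures a few strings three ways: UTF-8 byte count, code point count, and an approximate column width. The helper names display_width and measure are made up for this example, and the width rule (wide and fullwidth characters take two columns, combining marks take none) is an assumption that real terminals only partly follow.

```python
import unicodedata

def display_width(text):
    """Approximate terminal columns (assumption: W/F = 2, combining = 0, else 1)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks overlay the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

def measure(text):
    """Return (UTF-8 byte count, code point count, approximate display width)."""
    return len(text.encode("utf-8")), len(text), display_width(text)

# The three numbers agree only for the plain ASCII sample.
for sample in ("Vajhoj", "Vajh\u00f8j", "\u65e5\u672c\u8a9e", "e\u0301"):
    print(repr(sample), measure(sample))
```

Code that sizes a buffer, pads a column, or truncates a field by byte count is implicitly assuming all three numbers are equal.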
That's before discussing sorting and searching and language 
differences, as was mentioned. And normalization.
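
A similar hedged sketch for searching, sorting, and normalization, again in Python rather than anything OpenVMS-specific. The names same_text and byte_sorted are hypothetical helpers; the sort shown is a plain byte-order sort, which is what byte-oriented code effectively does, not a language-aware collation.

```python
import unicodedata

precomposed = "caf\u00e9"   # e-acute as a single code point
decomposed = "cafe\u0301"   # plain e followed by a combining acute accent

def same_text(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def byte_sorted(words):
    """Sort by raw UTF-8 bytes, as byte-oriented code would."""
    return sorted(words, key=lambda w: w.encode("utf-8"))

# The two spellings render the same but differ as byte strings,
# so a byte-wise search for one does not find the other.
print(precomposed.encode("utf-8") == decomposed.encode("utf-8"))
print(decomposed.encode("utf-8").find(precomposed.encode("utf-8")))
print(same_text(precomposed, decomposed))

# Byte order places the accented word after "zebra".
print(byte_sorted(["zebra", "\u00e9cole", "eclair"]))
```

Getting a dictionary order that suits a particular language needs a collation library; normalization alone only fixes the equality and search cases.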
OpenVMS has (had) support for some of those differences with NCS and with 
ICU, though those APIs aren't (weren't) widely used by apps.
-- 
Pure Personal Opinion | HoffmanLabs LLC 
    
    