[Info-vax] 8-bit characters

Lawrence D’Oliveiro lawrencedo99 at gmail.com
Thu Nov 11 17:45:38 EST 2021


On Friday, November 12, 2021 at 7:53:21 AM UTC+13, Arne Vajhøj wrote:
> <quote> 
> Each Unicode code point is represented directly by a single 32-bit 
> code unit. Because of this, UTF-32 has a one-to-one relationship 
> between encoded character and code unit; it is a fixed-width character 
> encoding form. 
> </quote>

Beware of terminology! What a normal person might call a “character”, they call a “text element”, and each text element is represented by one or more of what they call an “encoded character”.

So they can call UTF-32/UCS-4 a “fixed-width” encoding only with reference to “encoded characters”, not to “characters” in the everyday sense.
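
To make the distinction concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the helper name utf32_code_units is made up for this example, and I picked "é" as the sample because it can be written either as one code point or as a base letter plus a combining accent. Both spellings look like one character to a reader, yet they take a different number of 32-bit code units.

```python
import unicodedata


def utf32_code_units(text: str) -> int:
    """Count the 32-bit code units in the UTF-32 encoding of text.

    Hypothetical helper for illustration only. "utf-32-le" is used
    because it does not prepend a byte order mark, so the byte length
    divided by 4 is exactly the number of code units.
    """
    return len(text.encode("utf-32-le")) // 4


# One user-visible character, spelled two ways.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(utf32_code_units(composed))
print(utf32_code_units(decomposed))

# Normalizing to NFC shows that the two spellings are the same text.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

So the width is fixed per code point, but the number of code points per character a reader sees is not.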


