[Info-vax] 8-bit characters
Lawrence D’Oliveiro
lawrencedo99 at gmail.com
Thu Nov 11 17:45:38 EST 2021
On Friday, November 12, 2021 at 7:53:21 AM UTC+13, Arne Vajhøj wrote:
> <quote>
> Each Unicode code point is represented directly by a single 32-bit
> code unit. Because of this, UTF-32 has a one-to-one relationship
> between encoded character and code unit; it is a fixed-width character
> encoding form.
> </quote>
Beware of terminology! What a normal person might call a “character”, the Unicode standard calls a “text element”. A text element is represented by one or more of what the standard calls an “encoded character”, i.e. a code point.
So they are able to call UTF-32/UCS-4 a “fixed-width” encoding only with reference to “encoded characters”, not actually to “characters”.
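To make the distinction concrete, here is a small Python sketch (mine, not from the quoted text; the helper name utf32_units is made up for illustration). It counts the 32-bit code units for the same visible “é” written two ways: as one precomposed code point, and as a base letter plus a combining accent.

```python
import unicodedata

def utf32_units(text: str) -> int:
    """Count 32-bit code units in the UTF-32 encoding of text (hypothetical helper)."""
    # "utf-32-le" emits no byte-order mark, so every 4 bytes is one code point.
    return len(text.encode("utf-32-le")) // 4

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT, two code points

print(utf32_units(precomposed))  # 1
print(utf32_units(decomposed))   # 2

# Both spellings are canonically equivalent: NFC normalization maps one to the other.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

Both strings display as the same single “character”, yet the second takes two code units in UTF-32, so the width is fixed per code point and not per text element.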