[Info-vax] Does OpenVMS Use Unicode?

Sun Jun 12 21:58:12 EDT 2016

Funny story about Unicode: initially it was going to be a 16-bit code, sufficient to cover all the world’s *current* writing systems. Then I guess its architects got ambitious, and decided to add in all the *historical* writing systems as well. So nowadays its code points officially run up to U+10FFFF, which takes 21 bits. But who knows how much more it might grow in future?

Meanwhile, back in the 1990s, Microsoft, Sun and Apple decided it would be forward-looking to build the new text coding into core parts of their platforms (Windows NT, Java and the Mac HFS-Plus filesystem, respectively). Having a fixed-length 16-bit code was much preferable to having to deal with variable-length multibyte character sets, right?

So when Unicode 2.0 (I think it was) came out, I imagine there was much wailing and gnashing of teeth among these companies. What was originally a fixed-length “UCS-2” encoding had to be redefined as variable-length “UTF-16”, where characters might be 2 bytes or 4 bytes. So much for never having to deal with variable-length character codes again...
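
To make the "2 bytes or 4 bytes" point concrete, here is a minimal Python sketch of how UTF-16 splits a code point above U+FFFF into a surrogate pair. The function name utf16_code_units is made up for this illustration; in real code you would just call str.encode("utf-16-be").

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units (16-bit integers) for one code point.

    Hypothetical helper for illustration only.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not characters")
    if code_point < 0x10000:
        # Fits in one 16-bit unit, exactly as in the old UCS-2.
        return [code_point]
    # Above U+FFFF: subtract 0x10000, leaving a 20-bit offset,
    # then split it into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 | (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 | (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


for cp in (0x41, 0x20AC, 0x1F600):
    units = utf16_code_units(cp)
    print(f"U+{cp:04X} -> {len(units) * 2} bytes:",
          " ".join(f"{u:04X}" for u in units))
```

Anything that assumed one 16-bit unit per character (string lengths, indexing, truncation) has to cope with the second case.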

Meanwhile, the Linux kernel folks pretty much ignored the issue. In a pathname, ASCII byte “/” is a path separator, and a NUL byte is the path terminator. All other byte codes are allowed in file/directory names, and are not treated specially. So retroactively decreeing that pathnames shall be interpreted as UTF-8 caused no hardship at all. (Not to the kernel, anyway...)
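
A rough sketch of that rule in Python, assuming only what is described above: a name is an arbitrary byte string that may not contain "/" or NUL. The function name is hypothetical, and this ignores other limits such as maximum name length. It also shows why UTF-8 fits in so easily: every byte of a multibyte UTF-8 sequence is 0x80 or above, so it can never collide with "/" (0x2F) or NUL (0x00).

```python
def is_valid_name_component(name: bytes) -> bool:
    """Check a single file/directory name against the two reserved bytes.

    Hypothetical helper for illustration; it does not consult any filesystem.
    """
    return len(name) > 0 and b"/" not in name and b"\x00" not in name


# A UTF-8 encoded name is fine: its non-ASCII bytes are all >= 0x80.
utf8_name = "r\u00e9sum\u00e9.txt".encode("utf-8")
print(is_valid_name_component(utf8_name))

# So is a byte string that is not valid UTF-8 at all.
print(is_valid_name_component(b"\xff\xfe"))

# Only the two reserved bytes are rejected.
print(is_valid_name_component(b"a/b"))
print(is_valid_name_component(b"a\x00b"))
```

The second case is the catch hinted at above: nothing at this level stops a name from being invalid UTF-8, so userland tools have to decide what to do with such names.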

So, did OpenVMS deal with any of this?