[Info-vax] character set translation for language accents
JF Mezei
jfmezei.spamnot at vaxination.ca
Thu Apr 16 18:21:28 EDT 2009
jcwoman1963 at hotmail.com wrote:
> When it sends
> text data through my interface, it's sending the accented characters.
>
> When the data comes into my program on VMS, the accented characters
> have been lost/removed.
If an "é" comes out as an "i", it means that somewhere along the line
someone performed some surgery on your bytes to have their high-order
bit removed: é is 0xE9 in ISO-LATIN1, and 0xE9 without its top bit is
0x69, which is "i". This was common for serial communications on 7-bit
links.
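To see the effect for yourself, here is a minimal Python sketch of that high-bit stripping. It assumes the sender really is using ISO-LATIN1, and the function name strip_high_bit is made up for the illustration:

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit link would."""
    return bytes(b & 0x7F for b in data)

original = "\u00e9".encode("latin-1")   # e-acute is the single byte 0xE9
damaged = strip_high_bit(original)      # 0xE9 & 0x7F == 0x69
print(damaged.decode("ascii"))          # i
```

Plain ASCII text passes through unchanged, which is why only the accented characters look wrong.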
If Windows sends its data in an old DOS character set (code page 437 or
850, for example), then the é on Windows doesn't have the same byte value
as the é in the international standard ISO-LATIN1, and you need a
conversion table.
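As a rough illustration of what such a conversion table does, here is a Python sketch using the codecs built into Python rather than anything VMS-specific. The function name dos_to_latin1 and the default of code page 850 are assumptions. You would have to find out which code page the sender actually uses:

```python
def dos_to_latin1(data: bytes, dos_codepage: str = "cp850") -> bytes:
    """Re-encode bytes from a DOS code page to ISO-8859-1."""
    text = data.decode(dos_codepage)
    # Characters with no Latin-1 equivalent (box drawing, for
    # example) are replaced by '?'.
    return text.encode("latin-1", errors="replace")

# In code pages 437 and 850, e-acute is 0x82; in ISO-LATIN1 it is 0xE9.
print(dos_to_latin1(b"caf\x82").hex())  # 636166e9
```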
If Windows sends it as UTF-8, then you will find that any character
above 127 will result in 2 or more bytes being sent (for accented Latin
letters, basically a lead byte followed by a byte that describes which
character to use).
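A small Python sketch of the UTF-8 case, again using Python's own codecs and not the VMS tools. The helper name utf8_to_latin1 is made up for the example:

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Decode UTF-8 and re-encode as ISO-8859-1, one byte per character."""
    # Raises UnicodeDecodeError if the input is not valid UTF-8, and
    # UnicodeEncodeError if a character does not exist in Latin-1.
    return data.decode("utf-8").encode("latin-1")

utf8 = "\u00e9".encode("utf-8")       # e-acute
print(utf8.hex())                     # c3a9 (two bytes on the wire)
print(utf8_to_latin1(utf8).hex())     # e9   (one byte in ISO-LATIN1)
```

If your program shows two odd characters where one accented letter should be, UTF-8 being read as ISO-LATIN1 is a likely suspect.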
How is the data transmitted?
There is an optional tool on VMS:

    help iconv
But to get half decent conversion tables, you need to install the
<mumble>I18N<mumble> kit that comes on the VMS installation CD/DVD/TK50.
You can do a DIR SYS$I18N_ICONV: to see which formats are supplied.
There are just a couple that come by default, but once you install the
optional kits, you get plenty.
The sources don't seem to be available, so while VMS comes with the
iconv compiler to create your own tables, it isn't obvious how to create
one. (I guess those would be available somewhere on the net.)