[Info-vax] AlphaVM-free emulator with all additional peripheral components
Stephen Hoffman
seaohveh at hoffmanlabs.invalid
Wed Aug 1 10:30:32 EDT 2012
On 2012-08-01 13:56:33 +0000, Bob Koehler said:
> In article <jv98d1$kg2$1 at dont-email.me>, Stephen Hoffman
> <seaohveh at hoffmanlabs.invalid> writes:
>> On 2012-07-31 19:20:41 +0000, Bob Koehler said:
>>
>>> That is 8 bit bytes on a serial line doesn't guarantee the hardware
>>> knows what to do with bytes above 255.
>>
>> The eight-bit setting doesn't guarantee which characters the (usually
>> terminal) device will present to the end-user when an application
>> transmits bytes in the eight-bit 128 to 255 range, either.
>
> Ooops. What I should have said. PDP-10 are the only systems I've
> worked with having 9 bit and larger bytes.
Octets did eventually become the common unit of character encoding, but
that took many years.
The UNIVAC 1100 series running EXEC-8 had 36-bit words, with 6-bit
FIELDATA and 9-bit ASCII character encodings. DEC had a few of its
own oddball character encodings, and some of the detritus of that era
(e.g., RAD50) still lurks in a few very dark corners of VMS.
As was mentioned earlier, there's simply no way to determine the
character encoding of an arbitrary string. Not without a tag or some
other prearranged identifier, or specific knowledge of where the string
came from. You can assume, and you can guess, and you can guess
incorrectly more often than your users might allow. This also gets
back to the MIME-format discussion, and the UTF-8 encoding discussions.
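To make the guessing problem concrete, here is a rough Python sketch (not
from the thread; the byte string, the candidate list and the try_decode
helper are all made up for illustration). It takes one fixed run of four
bytes and decodes it under several assumed encodings. Nothing in the bytes
themselves says which reading is the intended one.

```python
# Illustrative only: the sample bytes and candidate encodings are assumptions.
raw = bytes([0x63, 0x61, 0x66, 0xE9])  # ASCII "caf" followed by the byte 0xE9


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


candidates = ["latin-1", "cp1252", "cp437", "utf-8"]
results = {enc: try_decode(raw, enc) for enc in candidates}

for enc, text in results.items():
    # ascii() keeps the output printable on any terminal, whatever its encoding.
    print(f"{enc:8} -> {ascii(text)}")
```

Two of the candidates agree on an e-acute for the last byte, CP437 produces
a Greek capital theta, and UTF-8 rejects the string outright as truncated.
A failed decode can rule an encoding out, but a successful one proves
nothing: every possible byte string is valid ISO Latin-1.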
--
Pure Personal Opinion | HoffmanLabs LLC