[Info-vax] Open Source on OpenVMS - A Progress Report
VAXman- at SendSpamHere.ORG
Wed Oct 21 06:52:56 EDT 2009
In article <hbmhls$cib$1 at online.de>, helbig at astro.multiCLOTHESvax.de (Phillip Helbig---remove CLOTHES to reply) writes:
>In article
><7efe77bd-4241-4b67-8dbc-85bd5e0a554b at p9g2000vbl.googlegroups.com>,
>MetaEd <metaed at gmail.com> writes:
>
>> > There is absolutely nothing
>> > saying that this newsgroup or any other group should be
>> > only supporting characters that happen to be in the
>> > *English* alphabet or that it must be 7-bit plain ASCII.
>>
>> Actually, there is, RFC 1036. This controls the message format for all
>> messages posted to newsgroups. The message format must follow RFC 822
>> with some minor modifications. RFC 822 is limited to ASCII (7-bit
>> codes).
>
>Right.
>
>> Any message composed with characters that cannot be represented with
>> ASCII codes must be stripped of those characters or encoded somehow to
>> ASCII before transmission. The de facto standard for encoding text is
>> RFC 2045--2049 (MIME).
>>
>> As of this writing, Google Groups does not use MIME when all the
>> characters of the message can be represented with ASCII codes.
>>
>> Otherwise, if the message can be represented with Latin-1 (ISO-8859-1)
>> codes, Google Groups does so, and encodes with MIME using Quoted-
>> Printable. Because Latin-1 is an ASCII superset, and because Quoted-
>> Printable preserves most ASCII codes, this causes ASCII to be used to
>> encode the message for transmission wherever possible. Other
>> characters are encoded with a hex notation. Long lines are also
>> preserved using a line continuation code. So, despite being encoded,
>> these messages are pretty easy to comprehend using a newsreader that
>> lacks MIME support.
>
>I have an EDT macro which does the decoding (see below).
>
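For anyone who wants to see the Latin-1 plus Quoted-Printable step described above, here is a minimal Python sketch using the standard-library quopri module. The function names are made up for illustration, and nothing here claims to be how Google Groups actually does it.

```python
import quopri


def encode_latin1_qp(text: str) -> bytes:
    """Encode text as Latin-1 bytes, then as Quoted-Printable."""
    raw = text.encode("latin-1")  # raises UnicodeEncodeError outside Latin-1
    return quopri.encodestring(raw)


def decode_latin1_qp(encoded: bytes) -> str:
    """Reverse the two steps above."""
    return quopri.decodestring(encoded).decode("latin-1")


if __name__ == "__main__":
    short = "Zürich"
    long_line = "Grüße " + "x" * 120  # long enough to force soft line breaks
    for sample in (short, long_line):
        encoded = encode_latin1_qp(sample)
        print(encoded.decode("ascii"))  # ASCII stays readable, the rest is =XX
        assert decode_latin1_qp(encoded) == sample
```

The ASCII characters pass through untouched, which is why such messages remain mostly legible without MIME support.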
>> But if the message cannot be represented with Latin-1 codes, Google
>> Groups uses UTF-8 codes, and encodes with MIME using Base64. UTF-8 and
>> Base64 are too different from ASCII for such messages to be
>> comprehended easily using a newsreader that lacks MIME support.
>
>One can extract them and run B64DECODE.EXE on them. However, such
>messages USUALLY have no place in a newsgroup in the first place.
>
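To illustrate why a UTF-8 plus Base64 body is unreadable without decoding, here is a small Python sketch using the standard-library base64 module. The helper names are hypothetical; it stands in for whatever a tool like B64DECODE.EXE does and is not a description of that program.

```python
import base64


def encode_utf8_base64(text: str) -> bytes:
    """Encode text as UTF-8, then as MIME-style Base64 (wrapped lines)."""
    return base64.encodebytes(text.encode("utf-8"))


def decode_utf8_base64(encoded: bytes) -> str:
    """Recover the original text from a Base64 body."""
    return base64.decodebytes(encoded).decode("utf-8")


if __name__ == "__main__":
    original = "Ελληνικά and the euro sign €"
    body = encode_utf8_base64(original)
    print(body.decode("ascii"))  # nothing of the original is recognisable
    assert decode_utf8_base64(body) == original
```

Unlike Quoted-Printable, even the plain ASCII words in the original disappear into the encoded form.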
>> The attribution line which Google Groups creates in the body (for
>> example: "On Oct 20, 2:06 pm, MetaEd <met... at gmail.com> wrote")
>> contains a Latin-1 non-breaking space (code A0) between the minutes
>> and the "am" or "pm". This is a character which does not exist in
>> ASCII.
>>
>> As a courtesy to readers having no MIME support, posters can replace
>> the non-breaking space with a plain space. This will avoid MIME
>> encoding, as long as the message has no other non-ASCII characters.
>
>Good suggestion.
>
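The non-breaking-space courtesy suggested above can be sketched in a few lines of Python. The function names are invented for this example, and the "needs MIME" test is a simplification that assumes any non-ASCII character triggers encoding.

```python
NBSP = "\u00a0"  # Latin-1 code A0, the non-breaking space


def replace_nbsp(text: str) -> str:
    """Swap every non-breaking space for a plain ASCII space."""
    return text.replace(NBSP, " ")


def needs_mime(text: str) -> bool:
    """Simplified check: assume any non-ASCII character forces MIME encoding."""
    return not text.isascii()


if __name__ == "__main__":
    attribution = "On Oct 20, 2:06" + NBSP + "pm, MetaEd wrote:"
    print(needs_mime(attribution))                # the A0 is not ASCII
    print(needs_mime(replace_nbsp(attribution)))  # plain ASCII after the swap
```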
>> And, as a courtesy to posters who are spelling names and places
>> properly using non-ASCII codes, readers can learn to read MIME encoded
>> messages or use a newsreader that has MIME support.
>
>Something which breaks the RFC but causes few if any problems for most
>people, whatever newsreader folks are using, is to use 8-bit characters
>WITHOUT encoding. This is analogous to doing so in VMS MAIL (but don't
>forget to set the transport to 8-bit in the SMTP configuration). Any
>newsreader which has fancy features will probably assume ISO-LATIN-1 and
>get it right, as will many WITHOUT fancy features. Such codes can be
>entered from a VMS keyboard with the compose key. If you have FORTRAN
>installed, do HELP FORT CHAR DEC to get the DEC multinational set (which
>is almost ISO-LATIN-1):
>
> +---+----------------------------------------+
> |   | 8    9    A    B    C    D    E    F   |
> +---+----------------------------------------+
> | 0 |      DCS       °    À         à        |
> | 1 |      PU1  ¡    ±    Á    Ñ    á    ñ   |
> | 2 |      PU2  ¢    ²    Â    Ò    â    ò   |
> | 3 |      STS  £    ³    Ã    Ó    ã    ó   |
> | 4 | IND  CCH            Ä    Ô    ä    ô   |
> | 5 | NEL  MW   ¥    µ    Å    Õ    å    õ   |
> | 6 | SSA  SPA       ¶    Æ    Ö    æ    ö   |
> | 7 | ESA  EPA  §    ·    Ç    ×    ç    ÷   |
> | 8 | HTS       ¨         È    Ø    è    ø   |
> | 9 | HTJ       ©    ¹    É    Ù    é    ù   |
> | A | VTS       ª    º    Ê    Ú    ê    ú   |
> | B | PLD  CSI  «    »    Ë    Û    ë    û   |
> | C | PLU  ST        ¼    Ì    Ü    ì    ü   |
> | D | RI   OSC       ½    Í    Ý    í    ý   |
> | E | SS2  PM             Î         î        |
> | F | SS3  APC       ¿    Ï    ß    ï        |
> +---+----------------------------------------+
This was perfectly readable. No quoted-pukeable needed!
>
>! quoted printable
>!
>! Create a buffer with two blank lines (for some reason one
>! blank line is not enough???)
>!
>find buffer cr_buffer
>insert;
>insert;
>find last
>!
>DEFINE MACRO KQP
>FIND BUFFER KQP
>INSERT;s|=2c|,|w
>INSERT;s|=FC|ü|w
>INSERT;s|=DF|ß|w
>INSERT;s|=F6|ö|w
>INSERT;s|=E4|ä|w
>INSERT;s|=3D|=|w
>INSERT;s|=A0| |w
>INSERT;s|=91|`|w
>INSERT;s|=92|'|w
>INSERT;s|=5F|_|w
>INSERT;s|=C4|Ä|w
>INSERT;s|=D6|Ö|w
>INSERT;s|=DC|Ü|w
>INSERT;s|=BA|º|w
>INSERT;s|=95|·|w
>INSERT;s|=2E|.|w
>INSERT;s|=2D|-|w
>INSERT;s|=E9|é|w
>INSERT;s|=E1|á|w
>INSERT;s|=C1|Á|w
>INSERT;s|=E8|è|w
>INSERT;s|=93|<I>|w
>INSERT;s|=94|</I>|w
>INSERT;s|=E5|å|w
>INSERT;s|=96|---|w
>INSERT;s|=20| |w
>! Linefeed/CR Combination
>INSERT;%B
>INSERT;change; 9999('=0A=0D' cutsr paste=cr_buffer) ex
>! CR/Linefeed Combination
>INSERT;%B
>INSERT;change; 9999('=0D=0A' cutsr paste=cr_buffer) ex
>! Linefeed
>INSERT;%B
>INSERT;change; 9999('=0A' cutsr paste=cr_buffer) ex
>! CR
>INSERT;%B
>INSERT;change; 9999('=0D' cutsr paste=cr_buffer) ex
>INSERT;%B
>find last
>
This is fine as a search and replace for the =## garbage, but it does _NOT_
address people's faineant habit of eschewing the carriage return key, whereupon
quoted-pukeable encoding breaks up the text every 64 characters. Micro$oft
products seem to embrace the non-breaking space too, filling my terminal with
huge blocks of =A0s littered in among the text and the =## vomit.
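
For comparison with the EDT macro quoted above, here is a rough Python sketch of the same idea: strip the soft line breaks that chop up long lines, then turn each =XX code back into a character. It is an illustration only; the function name and the default charset are assumptions, and decoding one byte at a time only works for single-byte character sets such as Latin-1.

```python
import re

# "=" at the end of a line is a soft line break inserted by the encoder.
_SOFT_BREAK = re.compile(r"=\r?\n")
# "=" followed by two hex digits stands for one byte.
_HEX_CODE = re.compile(r"=([0-9A-Fa-f]{2})")


def decode_qp_text(body: str, charset: str = "latin-1") -> str:
    """Decode Quoted-Printable text in a single pass (single-byte charsets only)."""
    joined = _SOFT_BREAK.sub("", body)  # rejoin lines the encoder split

    def to_char(match: re.Match) -> str:
        return bytes([int(match.group(1), 16)]).decode(charset)

    # Each =XX is decoded exactly once, so a decoded "=" (from =3D)
    # can never combine with the following text into a new code.
    return _HEX_CODE.sub(to_char, joined)


if __name__ == "__main__":
    print(decode_qp_text("Gr=FC=DFe aus Z=FCrich, 2:06=A0pm"))
    print(decode_qp_text("one long line that was =\nsplit by the encoder"))
```

Because the substitution is a single pass rather than a list of ordered replacements, the order in which codes such as =3D are handled does not matter.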
--
VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)ORG
http://www.quirkfactory.com/popart/asskey/eqn2.png
"Well my son, life is like a beanstalk, isn't it?"