[Info-vax] Open Source on OpenVMS - A Progress Report

VAXman- at SendSpamHere.ORG VAXman- at SendSpamHere.ORG
Wed Oct 21 06:52:56 EDT 2009


In article <hbmhls$cib$1 at online.de>, helbig at astro.multiCLOTHESvax.de (Phillip Helbig---remove CLOTHES to reply) writes:
>In article
><7efe77bd-4241-4b67-8dbc-85bd5e0a554b at p9g2000vbl.googlegroups.com>,
>MetaEd <metaed at gmail.com> writes: 
>
>> > There is absolutely nothing
>> > saying that this newsgroup or any other group should be
>> > only supporting characters that happen to be in the
>> > *English* alphabet or that it must be 7-bit plain ASCII.
>> 
>> Actually, there is, RFC 1036. This controls the message format for all
>> messages posted to newsgroups. The message format must follow RFC 822
>> with some minor modifications. RFC 822 is limited to ASCII (7-bit
>> codes).
>
>Right.
>
>> Any message composed with characters that cannot be represented with
>> ASCII codes must be stripped of those characters or encoded somehow to
>> ASCII before transmission. The de facto standard for encoding text is
>> RFC 2045--2049 (MIME).
>> 
>> As of this writing, Google Groups does not use MIME when all the
>> characters of the message can be represented with ASCII codes.
>> 
>> Otherwise, if the message can be represented with Latin-1 (ISO-8859-1)
>> codes, Google Groups does so, and encodes with MIME using Quoted-
>> Printable. Because Latin-1 is an ASCII superset, and because Quoted-
>> Printable preserves most ASCII codes, this causes ASCII to be used to
>> encode the message for transmission wherever possible. Other
>> characters are encoded with a hex notation. Long lines are also
>> preserved using a line continuation code. So, despite being encoded,
>> these messages are pretty easy to comprehend using a newsreader that
>> lacks MIME support.
>
>I have an EDT macro which does the decoding (see below).
>
>> But if the message cannot be represented with Latin-1 codes, Google
>> Groups uses UTF-8 codes, and encodes with MIME using Base64. UTF-8 and
>> Base64 are too different from ASCII for such messages to be
>> comprehended easily using a newsreader that lacks MIME support.
>
>One can extract them and run B64DECODE.EXE on them.  However, such 
>messages USUALLY have no place in a newsgroup in the first place.
>
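
For readers without B64DECODE.EXE to hand, the same extraction can be sketched in Python. This is a minimal sketch, assuming the body has already been cut out of the message and that the charset is UTF-8 as described above; the function name decode_base64_utf8 is made up for illustration.

```python
import base64


def decode_base64_utf8(body: str) -> str:
    """Decode a Base64-encoded message body whose charset is UTF-8."""
    # With the default validate=False, b64decode discards characters outside
    # the Base64 alphabet, so the line breaks inside the body are harmless.
    raw = base64.b64decode(body)
    return raw.decode("utf-8")


if __name__ == "__main__":
    # A Base64 body split across two lines, as it would appear in a message.
    print(decode_base64_utf8("R3LD\ntsOfZQ=="))
```

A message in some other charset would need that charset passed to decode() instead of "utf-8".
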
>> The attribution line which Google Groups creates in the body (for
>> example: "On Oct 20, 2:06 pm, MetaEd <met... at gmail.com> wrote")
>> contains a Latin-1 non-breaking space (code A0) between the minutes
>> and the "am" or "pm". This is a character which does not exist in
>> ASCII.
>> 
>> As a courtesy to readers having no MIME support, posters can replace
>> the non-breaking space with a plain space. This will avoid MIME
>> encoding, as long as the message has no other non-ASCII characters.
>
>Good suggestion.
>
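
The suggestion above amounts to one substitution plus a check that nothing else in the message falls outside ASCII. A minimal sketch, assuming the text is available as a Python string before posting; the helper names are hypothetical.

```python
NBSP = "\u00a0"  # Latin-1 non-breaking space, code A0


def replace_nbsp(text: str) -> str:
    """Replace every non-breaking space with a plain ASCII space."""
    return text.replace(NBSP, " ")


def is_plain_ascii(text: str) -> bool:
    """True when every character is in the 7-bit ASCII range."""
    return text.isascii()  # str.isascii() needs Python 3.7 or later


if __name__ == "__main__":
    line = "On Oct 20, 2:06\u00a0pm, MetaEd wrote"
    cleaned = replace_nbsp(line)
    print(is_plain_ascii(line), is_plain_ascii(cleaned))
```

If is_plain_ascii() is still false after the replacement, some other non-ASCII character remains and the message would presumably still be MIME encoded.
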
>> And, as a courtesy to posters who are spelling names and places
>> properly using non-ASCII codes, readers can learn to read MIME encoded
>> messages or use a newsreader that has MIME support.
>
>Something which breaks the RFC but provides few if any problems for most 
>people, whatever newsreader folks are using, is to use 8-bit characters 
>WITHOUT encoding.  This is analogous to doing so in VMS MAIL (but don't 
>forget to set the transport to 8-bit in the SMTP configuration).  Any 
>newsreader which has fancy features will probably assume ISO-LATIN-1 and 
>get it right, as will many WITHOUT fancy features.  Such codes can be 
>entered from a VMS keyboard with the compose key.  If you have FORTRAN 
>installed, do HELP FORT CHAR DEC to get the DEC multinational set (which 
>is almost ISO-LATIN-1):
>
>          +------------------------------------------+
>          |     8     9      A   B   C   D   E   F   |
>          +---+--------------------------------------+
>          | 0 |       DCS        °   À       à       |
>          | 1 |       PU1    ¡   ±   Á   Ñ   á   ñ   |
>          | 2 |       PU2    ¢   ²   Â   Ò   â   ò   |
>          | 3 |       STS    £   ³   Ã   Ó   ã   ó   |
>          | 4 | IND   CCH            Ä   Ô   ä   ô   |
>          | 5 | NEL   MW     ¥   µ   Å   Õ   å   õ   |
>          | 6 | SSA   SPA        ¶   Æ   Ö   æ   ö   |
>          | 7 | ESA   EPA    §   ·   Ç   ×   ç   ÷   |
>          | 8 | HTS          ¨       È   Ø   è   ø   |
>          | 9 | HTJ          ©   ¹   É   Ù   é   ù   |
>          | A | VTS          ª   º   Ê   Ú   ê   ú   |
>          | B | PLD   CSI    «   »   Ë   Û   ë   û   |
>          | C | PLU   ST         ¼   Ì   Ü   ì   ü   |
>          | D | RI    OSC        ½   Í   Ý   í   ý   |
>          | E | SS2   PM             Î       î       |
>          | F | SS3   APC        ¿   Ï   ß   ï       |
>          +---+--------------------------------------+

This was perfectly readable.  No quoted-pukeable needed!

>
>! quoted printable
>!
>! Create a buffer with two blank lines (for some reason one
>! blank line is not enough???)
>!
>find buffer cr_buffer
>insert;
>insert;
>find last
>!
>DEFINE MACRO KQP
>FIND BUFFER KQP
>INSERT;s|=2c|,|w
>INSERT;s|=FC|ü|w
>INSERT;s|=DF|ß|w
>INSERT;s|=F6|ö|w
>INSERT;s|=E4|ä|w
>INSERT;s|=3D|=|w
>INSERT;s|=A0| |w
>INSERT;s|=91|`|w
>INSERT;s|=92|'|w
>INSERT;s|=5F|_|w
>INSERT;s|=20||w
>INSERT;s|=C4|Ä|w
>INSERT;s|=D6|Ö|w
>INSERT;s|=DC|Ü|w
>INSERT;s|=BA|º|w
>INSERT;s|=95|·|w
>INSERT;s|=2E|.|w
>INSERT;s|=2D|-|w
>INSERT;s|=E9|é|w
>INSERT;s|=E1|á|w
>INSERT;s|=C1|Á|w
>INSERT;s|=E8|è|w
>INSERT;s|=93|<I>|w
>INSERT;s|=94|</I>|w
>INSERT;s|=E5|å|w
>INSERT;s|=96|---|w
>INSERT;s|=20| |w
>! Linefeed/CR Combination
>INSERT;%B
>INSERT;change; 9999('=0A=0D' cutsr paste=cr_buffer) ex
>! CR/Linefeed Combination
>INSERT;%B
>INSERT;change; 9999('=0D=0A' cutsr paste=cr_buffer) ex
>! Linefeed
>INSERT;%B
>INSERT;change; 9999('=0A' cutsr paste=cr_buffer) ex
>! CR
>INSERT;%B
>INSERT;change; 9999('=0D' cutsr paste=cr_buffer) ex
>INSERT;%B
>find last
>

This is fine as a search and replace for the =## garbage, but it does _NOT_
address people's faineant refusal to use the carriage return key, which leaves
quoted-pukeable encoding to break up the text every 64 characters.  Micro$oft
products seem to embrace the non-breaking space too, filling my terminal with
huge blocks of =A0s littered in among the text and =## vomit.
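
For comparison, a decoder that also rejoins the broken-up lines can be sketched with Python's standard quopri module, which handles both the =## codes and the trailing "=" soft line breaks. This is only a sketch, not a replacement for the EDT macro: the function name is made up, and ISO-8859-1 is an assumed default charset (codes such as =91 to =96 in the macro above suggest some senders really use a Windows code page).

```python
import quopri


def decode_quoted_printable(body: str, charset: str = "iso-8859-1") -> str:
    """Decode a quoted-printable body and rejoin soft-wrapped lines."""
    # A quoted-printable body is pure ASCII by construction, so encoding it
    # to bytes this way fails loudly if the input is not what we expect.
    raw = quopri.decodestring(body.encode("ascii"))
    # The decoded bytes are in the sender's charset (assumed Latin-1 here).
    return raw.decode(charset)


if __name__ == "__main__":
    sample = "Gr=F6=DFe ist =\nwichtig=A0!"
    print(decode_quoted_printable(sample))
```

The "=" at the end of the first sample line is a soft line break, so the two lines come back as one; the =A0 still decodes to a non-breaking space and would need a separate replacement to become a plain one.
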
-- 
VAXman- A Bored Certified VMS Kernel Mode Hacker    VAXman(at)TMESIS(dot)ORG

  http://www.quirkfactory.com/popart/asskey/eqn2.png
  
  "Well my son, life is like a beanstalk, isn't it?"


