[Info-vax] Open Source on OpenVMS - A Progress Report
Phillip Helbig---remove CLOTHES to reply
helbig at astro.multiCLOTHESvax.de
Wed Oct 21 04:44:44 EDT 2009
In article
<7efe77bd-4241-4b67-8dbc-85bd5e0a554b at p9g2000vbl.googlegroups.com>,
MetaEd <metaed at gmail.com> writes:
> > There is absolutely nothing
> > saying that this newsgroup or any other group should be
> > only supporting characters that happen to be in the
> > *English* alphabet or that it must be 7-bit plain ASCII.
>
> Actually, there is, RFC 1036. This controls the message format for all
> messages posted to newsgroups. The message format must follow RFC 822
> with some minor modifications. RFC 822 is limited to ASCII (7-bit
> codes).
Right.
> Any message composed with characters that cannot be represented with
> ASCII codes must be stripped of those characters or encoded somehow to
> ASCII before transmission. The de facto standard for encoding text is
> RFC 2045--2049 (MIME).
>
> As of this writing, Google Groups does not use MIME when all the
> characters of the message can be represented with ASCII codes.
>
> Otherwise, if the message can be represented with Latin-1 (ISO-8859-1)
> codes, Google Groups does so, and encodes with MIME using Quoted-
> Printable. Because Latin-1 is an ASCII superset, and because Quoted-
> Printable preserves most ASCII codes, this causes ASCII to be used to
> encode the message for transmission wherever possible. Other
> characters are encoded with a hex notation. Long lines are also
> preserved using a line continuation code. So, despite being encoded,
> these messages are pretty easy to comprehend using a newsreader that
> lacks MIME support.
I have an EDT macro which does the decoding (see below).
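For those who would rather not use EDT, here is a minimal sketch of the same decoding in Python, using the standard-library quopri module. The function name decode_qp and the default charset of Latin-1 are my assumptions for illustration; a real message states its charset in the Content-Type header.

```python
import quopri


def decode_qp(raw: bytes, charset: str = "latin-1") -> str:
    """Decode a Quoted-Printable body into text.

    quopri.decodestring turns =XX hex codes back into bytes and joins
    lines that end in a soft line break ("=" at end of line).
    The charset default is an assumption, not read from the message.
    """
    return quopri.decodestring(raw).decode(charset)


if __name__ == "__main__":
    print(decode_qp(b"Gr=FC=DFe aus K=F6ln"))
```

Because the decoder works through the text in a single pass, a literal "=" encoded as =3D cannot be mistaken for the start of another code.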
> But if the message cannot be represented with Latin-1 codes, Google
> Groups uses UTF-8 codes, and encodes with MIME using Base64. UTF-8 and
> Base64 are too different from ASCII for such messages to be
> comprehended easily using a newsreader that lacks MIME support.
One can extract them and run B64DECODE.EXE on them. However, such
messages USUALLY have no place in a newsgroup in the first place.
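If B64DECODE.EXE is not at hand, the extracted body can also be decoded along these lines. This is only a sketch in Python using the standard-library base64 module; the function name decode_b64_body is hypothetical, and the UTF-8 default follows the quoted description of what Google Groups sends rather than anything read from the message headers.

```python
import base64


def decode_b64_body(body: str, charset: str = "utf-8") -> str:
    """Decode a Base64 message body into text.

    With its default settings, b64decode skips characters outside the
    Base64 alphabet, so the line breaks in the body do no harm.
    The charset default is an assumption for this sketch.
    """
    return base64.b64decode(body).decode(charset)


if __name__ == "__main__":
    sample = base64.b64encode("日本語".encode("utf-8")).decode("ascii")
    print(sample, "->", decode_b64_body(sample))
```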
> The attribution line which Google Groups creates in the body (for
> example: "On Oct 20, 2:06 pm, MetaEd <met... at gmail.com> wrote")
> contains a Latin-1 non-breaking space (code A0) between the minutes
> and the "am" or "pm". This is a character which does not exist in
> ASCII.
>
> As a courtesy to readers having no MIME support, posters can replace
> the non-breaking space with a plain space. This will avoid MIME
> encoding, as long as the message has no other non-ASCII characters.
Good suggestion.
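As a rough illustration of that suggestion, the replacement and the "is it still plain ASCII?" check could look like this in Python. Both function names are hypothetical, and whether a given posting interface applies MIME on exactly this condition is an assumption based on the description quoted above.

```python
NBSP = "\u00a0"  # Latin-1 non-breaking space, code A0


def replace_nbsp(text: str) -> str:
    """Replace every non-breaking space with a plain ASCII space."""
    return text.replace(NBSP, " ")


def is_plain_ascii(text: str) -> bool:
    """True if every character fits in 7-bit ASCII."""
    return text.isascii()


if __name__ == "__main__":
    line = "On Oct 20, 2:06" + NBSP + "pm, MetaEd wrote:"
    print(is_plain_ascii(line), is_plain_ascii(replace_nbsp(line)))
```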
> And, as a courtesy to posters who are spelling names and places
> properly using non-ASCII codes, readers can learn to read MIME encoded
> messages or use a newsreader that has MIME support.
Something which breaks the RFC but causes few if any problems for most
people, whatever newsreader they are using, is to use 8-bit characters
WITHOUT encoding. This is analogous to doing so in VMS MAIL (but don't
forget to set the transport to 8-bit in the SMTP configuration). Any
newsreader which has fancy features will probably assume ISO-LATIN-1 and
get it right, as will many WITHOUT fancy features. Such characters can be
entered from a VMS keyboard with the compose key. If you have FORTRAN
installed, do HELP FORT CHAR DEC to get the DEC multinational set (which
is almost ISO-LATIN-1):
+---+-----+-----+---+---+---+---+---+---+
|   |  8  |  9  | A | B | C | D | E | F |
+---+-----+-----+---+---+---+---+---+---+
| 0 |     | DCS |   | ° | À |   | à |   |
| 1 |     | PU1 | ¡ | ± | Á | Ñ | á | ñ |
| 2 |     | PU2 | ¢ | ² | Â | Ò | â | ò |
| 3 |     | STS | £ | ³ | Ã | Ó | ã | ó |
| 4 | IND | CCH |   |   | Ä | Ô | ä | ô |
| 5 | NEL | MW  | ¥ | µ | Å | Õ | å | õ |
| 6 | SSA | SPA |   | ¶ | Æ | Ö | æ | ö |
| 7 | ESA | EPA | § | · | Ç | × | ç | ÷ |
| 8 | HTS |     | ¨ |   | È | Ø | è | ø |
| 9 | HTJ |     | © | ¹ | É | Ù | é | ù |
| A | VTS |     | ª | º | Ê | Ú | ê | ú |
| B | PLD | CSI | « | » | Ë | Û | ë | û |
| C | PLU | ST  |   | ¼ | Ì | Ü | ì | ü |
| D | RI  | OSC |   | ½ | Í | Ý | í | ý |
| E | SS2 | PM  |   |   | Î |   | î |   |
| F | SS3 | APC |   | ¿ | Ï | ß | ï |   |
+---+-----+-----+---+---+---+---+---+---+
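To see how a newsreader that assumes ISO-LATIN-1 would display a given 8-bit code, a cell of such a table can be looked up as sketched below in Python. The function name latin1_cell is hypothetical. Note that this uses Python's latin-1 codec, not the DEC multinational set, so the few positions where the two sets differ will show the ISO-LATIN-1 character.

```python
def latin1_cell(column: int, row: int) -> str:
    """Return the ISO-LATIN-1 character for a table cell.

    column is the high hex digit (8..F) and row the low hex digit (0..F),
    matching the layout of the table: code = column * 16 + row.
    """
    code = column * 16 + row
    return bytes([code]).decode("latin-1")


if __name__ == "__main__":
    # Column F, row C is code FC; column D, row F is code DF.
    print(latin1_cell(0xF, 0xC), latin1_cell(0xD, 0xF))
```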
! quoted printable
!
! Create a buffer with two blank lines (for some reason one
! blank line is not enough???)
!
find buffer cr_buffer
insert;
insert;
find last
!
DEFINE MACRO KQP
FIND BUFFER KQP
INSERT;s|=2c|,|w
INSERT;s|=FC|ü|w
INSERT;s|=DF|ß|w
INSERT;s|=F6|ö|w
INSERT;s|=E4|ä|w
INSERT;s|=A0| |w
INSERT;s|=91|`|w
INSERT;s|=92|'|w
INSERT;s|=5F|_|w
INSERT;s|=20||w
INSERT;s|=C4|Ä|w
INSERT;s|=D6|Ö|w
INSERT;s|=DC|Ü|w
INSERT;s|=BA|º|w
INSERT;s|=95|·|w
INSERT;s|=2E|.|w
INSERT;s|=2D|-|w
INSERT;s|=E9|é|w
INSERT;s|=E1|á|w
INSERT;s|=C1|Á|w
INSERT;s|=E8|è|w
INSERT;s|=93|<I>|w
INSERT;s|=94|</I>|w
INSERT;s|=E5|å|w
INSERT;s|=96|---|w
INSERT;s|=20| |w
! Linefeed/CR Combination
INSERT;%B
INSERT;change; 9999('=0A=0D' cutsr paste=cr_buffer) ex
! CR/Linefeed Combination
INSERT;%B
INSERT;change; 9999('=0D=0A' cutsr paste=cr_buffer) ex
! Linefeed
INSERT;%B
INSERT;change; 9999('=0A' cutsr paste=cr_buffer) ex
! CR
INSERT;%B
INSERT;change; 9999('=0D' cutsr paste=cr_buffer) ex
! Decode =3D last, so that the "=" it produces cannot combine with
! the characters after it and be mistaken for another code
INSERT;s|=3D|=|w
INSERT;%B
find last