[Info-vax] Mail reading/processing problem, UTF-8

Jan-Erik Soderholm jan-erik.soderholm at telia.com
Sat Aug 20 18:32:07 EDT 2011


Jan-Erik Soderholm wrote 2011-08-20 21:54:
> Hi.
>
> I'm getting automated mail sent from an auction site
> to my VMS system. Now, they use UTF-8 to encode both
> the subject and the "body". An example might look like:
>
> [from:, to: and date: fields are plain text]
>
> Subject: =?utf-8?B?U8OlbHQgb2JqZWt0OiA0IHN0IFVMTjI4MDMgZGFybGluZ3Rv?=
> =?utf-8?B?biBkcml2ZXJzLiBESVAgLCAxMzcyNTc0NjEuIEvDtnBhcmU6IGJlbmd0?=
> =?utf-8?B?dy00OA==?=
> Content-Type: text/plain; charset=utf-8
> Content-Transfer-Encoding: base64
> Message-ID: <SHERRIhO29nIvX6bxu800cda4b9 at sherri.prod.tradera.com>
> X-OriginalArrivalTime: 10 Aug 2011 18:18:25.0924 (UTC)
> FILETIME=[E57D6C40:01CC5789]
>
> KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq
> KioqKioqKioqKioqKioqKioqKioqKioqKg0KU8Oka2VyaGV0c3ZlcmlmaWVyaW5n
> OiBGw7ZyIGF0dCBzdHlya2EgYXR0IGRldHRhIG1haWwgc2tpY2thdHMgZnLDpW4g
> VHJhZGVyYSBiaWZvZ2FyIHZpIGRpdHQgcmVnaXN0cmVyYWRlIG5hbW46IEphbi1F
> cmlrIFPDtmRlcmhvbG0gQ29uc3VsdGluZyBBQg0KTWVyIG9tIHPDpGtlcmhldDog
>
> [rest of mail like the lines above...]
>
> I am currently looking for tools to read this, and I'm currently
> testing with the MIME.EXE tool. I can read the body with a
> sequence of OPEN and READ commands to MIME.EXE, but not the
> subject.
>
> The main problem is that my matching rules on the subject in the
> MAIL.DELIVER file have major problems with that subject, of course.
> But it would be OK to read all mails and decide later on what to
> do with them, as long as I can actually read the subject/body.
>
> So, how to decode that "=?utf-8?B?U8O......" string into
> something readable ?
>
> And also the base64/UTF-8 body of the message, of course.
>
> Any pointers are very much appreciated.
>
> And no, I do not need any "that is bad practice" tips.
> There is absolutely no way this is going to change at
> the sending side.
>
> Regards,
> Jan-Erik.
>
>


Hi again...

Well, I found a few functions that will read/decode these mails.

Where ? In the Python distribution of course... :-)

There is a module in Python called "email" that has a lot
of tools to read (and compose) MIME mails of all sorts.

This example will read a file containing a mail that looks
like my example above and extract/decode the "Subject" and the
"body" (called "payload" in the tool). Any other header can
of course be read by passing its name to the get() function.

If you have a plain text subject in the first place, you do
not need the extra "decode_header()", of course.

==============================================================
$!
$! Reading subject and the body from a MIME mail on file.
$!
$ python

from email.parser import Parser
from email.header import decode_header

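file_handle = open('MAIL$4302BF0A000500AB.MAI','r')

mail_content = Parser().parse(file_handle)
file_handle.close()

# decode_header() returns a list of (string, charset) pairs.
mail_subject = decode_header(mail_content.get('Subject'))
# decode=True removes the base64 transfer encoding.
mail_body = mail_content.get_payload(decode=True)

# Now the two variables hold the decoded subject parts and
# the body. The text itself is still UTF-8 encoded.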
==============================================================
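
As a follow-up to the example above: decode_header() hands back a
list of (value, charset) pairs rather than one string, and the
payload is still UTF-8 encoded bytes. A minimal sketch of joining
and decoding them could look like the block below. It assumes
Python 3 and a single-part message; the helper names
header_to_text() and body_to_text() are my own, not part of the
"email" module.

```python
from email.header import decode_header


def header_to_text(raw_header):
    """Join the parts returned by decode_header() into one str."""
    pieces = []
    for value, charset in decode_header(raw_header):
        if isinstance(value, bytes):
            # Encoded words come back as bytes plus their charset.
            # Unencoded runs may have charset None, so fall back to ASCII.
            pieces.append(value.decode(charset or 'ascii', errors='replace'))
        else:
            # A header without any encoded words is returned as str.
            pieces.append(value)
    return ''.join(pieces)


def body_to_text(message):
    """Decode a single-part body using the charset from Content-Type."""
    # get_payload(decode=True) undoes base64/quoted-printable and
    # returns bytes (it returns None for multipart messages, which
    # this sketch does not handle).
    raw = message.get_payload(decode=True)
    charset = message.get_content_charset() or 'ascii'
    return raw.decode(charset, errors='replace')
```

With the variables from the example above this would be used as
header_to_text(mail_content.get('Subject')) and
body_to_text(mail_content).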

This can now easily be run within a command file started by
DELIVER, reading the symbol "MESSAGE_FILE" instead of
the hard-coded file name in my example.
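
One possible way to get the file name into Python is to let DCL
substitute the symbol on the command line, for example
"$ python decode_mail.py 'MESSAGE_FILE'", and pick it up from
sys.argv. This is only a sketch under assumptions: the script name
decode_mail.py and the function read_subject_parts() are made up
for illustration, and I have not checked how DELIVER sets up the
symbol in every configuration.

```python
import sys
from email.parser import Parser
from email.header import decode_header


def read_subject_parts(path):
    """Parse the mail file at `path` and return the decoded Subject parts."""
    with open(path, 'r') as file_handle:
        mail_content = Parser().parse(file_handle)
    # An empty default avoids passing None on when Subject is missing.
    return decode_header(mail_content.get('Subject', ''))


if __name__ == '__main__' and len(sys.argv) > 1:
    # sys.argv[1] is assumed to carry the value of MESSAGE_FILE,
    # substituted by DCL before Python is started.
    print(read_subject_parts(sys.argv[1]))
```

Keeping the parsing in a function means the same code can be tried
on a saved mail file before it is hooked into MAIL.DELIVER.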

Nice.

Jan-Erik.


