
devel - [sympa-dev] Bad archiving of text messages with chartset=UTF-8

Subject: Developers of Sympa

  • From: John Kirkland <address@concealed>
  • To: "address@concealed" <address@concealed>
  • Subject: [sympa-dev] Bad archiving of text messages with chartset=UTF-8
  • Date: Thu, 28 Oct 2010 21:22:37 -0500

Hi, developers,

I'm encountering this problem in v6.1.1.

I have a Portuguese mailing list in Sympa that I'm trying to convert fully to UTF-8. One message per day is sent to the list, as plain text. I recently converted this plain text from ISO-8859-1 to UTF-8 encoding. So far, the only complaint about garbled characters has come from an old Eudora user; everything looks normal to me in Thunderbird.

To tell the mail programs about the UTF-8 messages, I have added "Content-Type: text/plain; charset=UTF-8" as a header to the email.
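For illustration, here is a minimal Python sketch of a plain-text message that declares its charset this way. It is not how the list's daily message is actually generated; the function name `build_plain_utf8_message` and the sample subject and body are hypothetical.

```python
from email.message import EmailMessage


def build_plain_utf8_message(subject: str, body: str) -> EmailMessage:
    """Build a text/plain message whose Content-Type declares UTF-8."""
    msg = EmailMessage()
    msg["Subject"] = subject
    # set_content() adds the Content-Type header with the charset parameter.
    msg.set_content(body, subtype="plain", charset="utf-8")
    return msg


# Hypothetical sample text containing non-ASCII Portuguese characters.
msg = build_plain_utf8_message("Mensagem diária", "Olá, João! Até amanhã.\n")
print(msg["Content-Type"])
```

A mail client or archiver can then read the declared charset with `msg.get_content_charset()` instead of guessing.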

The only problem is my Sympa mail archives. The body of the archived message is all run together, without any of the HTML formatting that mhonarc normally inserts. Before this conversion to UTF-8, the archived message was neatly formatted by mhonarc with <br> and <p> at the ends of the lines.

I've done a little debugging. If I remove the Content-Type header from the email, the archived message does contain the mhonarc-inserted HTML formatting, but the characters are no longer legible. It's as if it converted the UTF-8 characters from Latin-1 to UTF-8 -- even though they were already UTF-8.
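The symptom described above matches the usual double-encoding pattern, which this small Python sketch reproduces. It only illustrates the effect on a hypothetical sample string; it does not show what mhonarc or Sympa actually do internally.

```python
original = "Olá, João"  # hypothetical sample text

# The message body as sent: already UTF-8.
utf8_bytes = original.encode("utf-8")

# A consumer that wrongly assumes Latin-1 reads each UTF-8 byte as one character...
misread = utf8_bytes.decode("latin-1")

# ...and then "converts" that text to UTF-8 a second time.
double_encoded = misread.encode("utf-8")
garbled = double_encoded.decode("utf-8")
print(garbled)  # OlÃ¡, JoÃ£o

# The damage is reversible as long as no bytes were dropped along the way.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # Olá, João
```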

I also ran mhonarc from the command line, using the same command that appears in the Sympa debug log. Now it gets even weirder: the archive comes out fine when I run the command by hand.

Any ideas?

BTW, hijacking my own thread here. If I don't put the Content-Type header into the plain text message, then bulk.pl crashes when it attempts to perform merging (the new 6.1 feature) because no encoding is specified. The problem is in Bulk.pm around lines 288 and 304. If $charset is undefined, then from_to will 'die'.
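One way to guard against a missing charset is to fall back to a default before converting. The following is a Python sketch of that idea only, not the Perl code in Bulk.pm; the function `recode_body` and the choice of UTF-8 as the fallback are assumptions made for illustration.

```python
import codecs
from typing import Optional


def recode_body(data: bytes, charset: Optional[str],
                target: str = "utf-8", fallback: str = "utf-8") -> bytes:
    """Re-encode data to target, tolerating a missing or unknown charset."""
    # Assumption: treat an undeclared charset as the fallback instead of failing.
    source = charset or fallback
    try:
        codecs.lookup(source)
    except LookupError:
        # The declared charset name is not known to Python; use the fallback.
        source = fallback
    return data.decode(source, errors="replace").encode(target)


# No charset declared: the body is passed through as UTF-8.
print(recode_body("Olá".encode("utf-8"), None))
# Declared ISO-8859-1: the body is converted to UTF-8.
print(recode_body("Olá".encode("latin-1"), "iso-8859-1"))
```

Whether a silent fallback is preferable to rejecting the message is a design choice; the sketch only shows that an undefined charset need not be fatal.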

Many thanks,
John Kirkland


