devel - Re: [sympa-dev] Charset/encoding for e-mail message

Subject: Developers of Sympa

From: Hatuka*nezumi - IKEDA Soji <address@concealed>
To: Olivier Salaün - CRU <address@concealed>
Cc: address@concealed
Subject: Re: [sympa-dev] Charset/encoding for e-mail message
Date: Fri, 20 Oct 2006 22:14:30 +0900

On Thu, 19 Oct 2006 17:49:13 +0200
Olivier Salaün - CRU <address@concealed> wrote:

<<snip>>
> > Strings used for interpolation on TT2 are interpreted as if they
> > are encoded by ISO-8859-1. Anyhow curious this is ---
<<snip>>
> I'm wondering if your problem might be related to something I fixed
> yesterday in the CVS HEAD :
>
> The logging subroutine (do_log()) does recode its parameters from
> UTF-8 to the filesystem_encoding. (This is required because syslogd
> does not seem to cope well with UTF-8) I found out, while applying
> your patch, that do_log() was not only recoding the values of the
> parameters but also the variables themselves. I fixed this.
> Therefore can you have a try with the latest CVS HEAD before we go
> on investigations on this topic ?

I checked out CVS HEAD (up to the fix for the language box).

- List subjects seem to be OK, both on the Web and in service messages.
- Language box is OK.

<<snip>>
> > (b) Reproduced on:
> > INFO Service message:
> > - Description of list.
> >
> I was not able to reproduce this problem ; but maybe I need to try with
> non-ISO-8859-1 data...
>
> > Web:
> > - Help pages installed into web_tt2/ja_JP/ (UTF-8 is used).
> >
> > Afterwards, I made a quick hack on src/tt2.pl (as patch
> > attached). Then this seems not to be reproduced, probably.
> >
> If the problem persist, please provide us a step by step way to
> reproduce the problem.

Next, I applied only my tt2 hack patch (sympa-MAIN-20061015-tt2_utf8.patch).

- Description of list in INFO is OK.
- Help pages installed into web_tt2/ja_JP/ (UTF-8 is used) are OK.

Finally, I applied all of my patches
(sympa-MAIN-20061019-mail_encoding_suppl.patch).

- Nothing has become worse. Everything that was OK above is still OK.

Though ---

(d) UTF-8 strings are fed to TT2:
    Strings encoded in UTF-8 are inserted into message bodies
    generated from at least the following templates:
        get_archive.tt2 (first and last parts)
        moderate.tt2
        welcome.tt2
    At least the following are OK:
        helpfile.tt2
        command_report.tt2
        reject.tt2
        bye.tt2
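
If (d) is the same symptom discussed earlier in this thread (UTF-8
strings treated as if they were ISO-8859-1 when interpolated), the
effect can be reproduced outside Sympa. The following is only a minimal
Python sketch of that general failure mode, not Sympa or TT2 code; the
function names are hypothetical.

```python
def misread_as_latin1(utf8_bytes):
    """Simulate UTF-8 bytes being taken for ISO-8859-1 text and then
    written out again as UTF-8 (the classic double encoding)."""
    return utf8_bytes.decode("iso-8859-1").encode("utf-8")


def repair(double_encoded):
    """Undo the double encoding, assuming it happened exactly once."""
    return double_encoded.decode("utf-8").encode("iso-8859-1")


original = "日本語".encode("utf-8")      # 9 bytes, all of them >= 0x80
garbled = misread_as_latin1(original)    # every byte becomes a 2-byte sequence
restored = repair(garbled)
```

Here every byte of the original is at or above 0x80, so the garbled form
is twice as long. Pure ASCII text passes through unchanged, which is
why such a problem can go unnoticed with templates that carry only
ASCII data.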

<<snip>>
> >>> - When address headers of service messages include non-ASCII
> >>> characters, headers will be encoded maliciously.
<<snip>>
> > On TT2 templates, this will be avoided by attached (second) patch,
> > but this may not be generalized solution.
> Another solution is to put the [% FILTER qencode %] at the right place
> in the TT2 files, example below :
>
> To: [% FILTER qencode %][%|loc(list.name)%]Moderators of list
> %1[%END%][%END%] <[% list.name %]-editor@[% list.host %]>
>
> I've fixed the mail_tt2 files according to this.
> I don't know if we still need your patch...

With this solution, some header lines will exceed the limit of 76
characters which MIME requires.
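
To illustrate the limit: RFC 2047 allows at most 75 characters per
encoded-word and at most 76 characters per header line containing one,
so a long display name has to be split into several encoded-words and
folded. The following is a minimal Python sketch of that idea, not the
qencode filter and not a proposed patch; the function names and the
example address are hypothetical, and it assumes the address part
itself is short enough to fit on a line.

```python
import base64


def encoded_words(text, charset="utf-8", max_word_len=75):
    """Split text into B-encoded words of at most max_word_len characters,
    never cutting a multi-byte character in two."""
    overhead = len("=?%s?B??=" % charset)
    chunks, chunk = [], ""
    for ch in text:
        size = len((chunk + ch).encode(charset))
        # Base64 turns n bytes into 4 * ceil(n / 3) characters.
        if chunk and overhead + 4 * ((size + 2) // 3) > max_word_len:
            chunks.append(chunk)
            chunk = ch
        else:
            chunk += ch
    if chunk:
        chunks.append(chunk)
    return ["=?%s?B?%s?=" % (charset,
                             base64.b64encode(c.encode(charset)).decode("ascii"))
            for c in chunks]


def fold_address_header(field, display_name, addr, max_line_len=76):
    """Build 'Field: encoded-name <addr>' with every line <= max_line_len."""
    room = min(75, max_line_len - len(field) - 2)  # leave space for "Field: "
    tokens = encoded_words(display_name, max_word_len=room) + ["<%s>" % addr]
    lines, current = [], field + ":"
    for token in tokens:
        if len(current) + 1 + len(token) > max_line_len:
            lines.append(current)
            current = " " + token      # continuation lines start with a space
        else:
            current += " " + token
    lines.append(current)
    return "\r\n".join(lines)


header = fold_address_header(
    "To",
    "メーリングリスト「sympa-dev」のモデレーターの皆様",
    "sympa-dev-editor@example.org",
)
```

The point of the sketch is that the display name and the address are
handled as separate tokens: only the phrase is encoded, and folding
happens between tokens rather than inside them.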

> > I believe that the
> > structured header fields in general need to be parsed/constructed
> > by another functions not just only processing B/Q encodings.
> >
> What other solution do you propose ?

Currently I have no concrete idea. It might be something like an
enhanced version of Mail::Address::format(). I would like to consider
its features once it becomes unavoidable.

<<snip>>
> BTW : In your previous message, you reported a problem related to
> duplicated header fields. I found out that the problem was related to
> our mail::mail_file() subroutine incorrectly detecting folded header
> fields. Here is the patch :
> http://sourcesup.cru.fr/cgi/viewcvs.cgi/sympa/src/mail.pm?r1=1.37&r2=1.38&makepatch=1&diff_format=u

Headers are no longer duplicated. Thanks!
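
For readers who have not looked at the patch: a folded header field
continues on following lines that begin with a space or a tab, and a
parser that mistakes such a line for a new field can end up with
repeated or broken fields. The following is a minimal Python sketch of
the usual detection rule only. It is not the mail::mail_file() code and
does not reproduce the patch; the function name is hypothetical, and
joining with a single space is a simplification.

```python
def parse_header_fields(raw):
    """Return a list of (name, value) pairs from a raw header block."""
    fields = []
    for line in raw.splitlines():
        if not line:
            break  # an empty line ends the header block
        if line[0] in " \t" and fields:
            # Continuation of the previous field: extend it instead of
            # starting a new field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            fields.append((name.strip(), value.strip()))
    return fields


raw_message = (
    "Subject: a rather long subject\r\n"
    "\tthat was folded\r\n"
    "To: someone@example.org\r\n"
    "\r\n"
    "Subject: this line is body text, not a header\r\n"
)
parsed = parse_header_fields(raw_message)
```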

--- nezumi
