Re: [sympa-users] Digest bugs

Subject: The mailing list for listmasters using Sympa

  • From: Chris Hastie <address@concealed>
  • To: Olivier Salaun - CRU <address@concealed>
  • Cc: address@concealed
  • Subject: Re: [sympa-users] Digest bugs
  • Date: Mon, 16 Feb 2004 12:16:12 +0000

On Mon, 16 Feb 2004, Olivier Salaun - CRU <address@concealed> wrote:

> Chris Hastie wrote:
>
> > The other problem that I noticed when testing this is that the default
> > digest template declares the CTE for the TOC part to be 7bit. Clearly,
> > if the subjects given have had their mime words decoded it is quite
> > possible that 8 bit characters will appear in this part. There is also
> > going to be an issue with the declared character set in this part, but
> > I suspect we may just have to live with that unless we do translate
> > the whole lot to UTF-8. It's worth looking at MIME::WordDecoder for
> > some more flexible ways of decoding and specifying ways of handling
> > unusual characters and character sets.
>
> There's nothing we can do unless everything is UTF8-encoded.
> We are assuming 7bit for english digests whereas French one is 8bit
> iso-8859-1... All digest templates have different charsets defined.
>

What I'm suggesting is that assuming 7bit for English digests is dangerous.
Even in English there are several 8bit characters in regular use,
particularly this side of the Atlantic, where the symbols representing the
units of currency of both the UK and Ireland (and, of course, all the
non-English speaking bits of the EU) are 8bit characters and may well turn up
in the TOC as a result of decoding encoded Subject: strings. And of course
there are plenty of English speaking lists with subscribers who are not
native English speakers and whose names may well include 8bit characters,
encoded in the From: header.
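
To illustrate the point, here is a minimal sketch of how a 7bit-looking
Subject: header turns into 8bit text once its MIME words are decoded. It uses
Python's standard email.header module purely as a stand-in; it is not Sympa's
Perl code, and the helper names decode_subject and needs_8bit are made up for
this example.

```python
from email.header import decode_header


def decode_subject(raw):
    """Decode RFC 2047 encoded words in a header value into a str."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Unlabelled chunks are treated as ASCII; bad bytes are replaced.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded words comes back as a plain str.
            parts.append(chunk)
    return "".join(parts)


def needs_8bit(text):
    """True if the text holds any character outside the 7bit ASCII range."""
    return any(ord(ch) > 127 for ch in text)


# The raw header is pure ASCII on the wire...
subject = "=?iso-8859-1?Q?Tickets_from_=A35?="
decoded = decode_subject(subject)

# ...but the decoded TOC entry contains a pound sign, so a part declared
# as 7bit would no longer be truthful.
print(decoded, needs_8bit(decoded))
```

A charset label that Python does not recognise would raise LookupError here;
a real digest builder would need to decide what to do in that case.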

Character sets are always going to be difficult. My feeling, admittedly from
an English speaking view, is that MIME::WordDecoder offers a little more
flexibility in how encoded words using differing character sets are dealt
with.
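
As a rough sketch of the kind of flexibility meant here, the following
applies a per-character-set policy when decoding: words in character sets
the digest is prepared to show are decoded, anything else is replaced by a
placeholder. This is again Python standing in for Perl, it does not
reproduce MIME::WordDecoder's actual interface, and the POLICY table and
decode_with_policy name are assumptions made for the example.

```python
from email.header import decode_header

# Hypothetical policy table: which character sets to decode and keep.
POLICY = {"us-ascii": "keep", "iso-8859-1": "keep", "utf-8": "keep"}
DEFAULT_POLICY = "placeholder"


def decode_with_policy(raw, policy=POLICY, default=DEFAULT_POLICY):
    """Decode encoded words, handling each charset according to policy."""
    out = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, str):
            # No encoded words at all: the header is returned unchanged.
            out.append(chunk)
            continue
        name = (charset or "us-ascii").lower()
        if policy.get(name, default) == "keep":
            out.append(chunk.decode(name, errors="replace"))
        else:
            # Do not attempt to render charsets we have no policy for.
            out.append("[text in %s]" % name)
    return "".join(out)


print(decode_with_policy("=?iso-8859-1?Q?caf=E9?="))
print(decode_with_policy("=?koi8-r?B?8NLJ18XU?="))
```

The second call never tries to decode the KOI8-R bytes; it only reports the
character set, which is one possible way of handling "unusual" input.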

--
Chris Hastie


