Subject: Developers of Sympa
List archive
- From: IKEDA Soji <address@concealed>
- To: Marc Chantreux <address@concealed>
- Cc: address@concealed
- Subject: Re: [sympa-developpers] Sympatic unicode ?
- Date: Fri, 2 Mar 2018 18:56:24 +0900
Sorry for posting unfinished text.
Hi again,
Below, "text data" means Unicode text, i.e. a string with the "utf8"
flag set.
* I think the most important point is that **mail messages are not text
data**.
They should not be read or written through the :utf8 layer but through
the :bytes layer. For example, the following operations should use the
:bytes layer:
- Opening messages on disk.
- Opening a pipe to sendmail.
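
The two operations above might look like the following minimal sketch. The
helper names are hypothetical, and it uses the `:raw` layer as a common way
to make sure no :utf8 or encoding layer is active on the handle; it is an
illustration, not Sympa's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: read a message file as raw octets.
sub read_message_bytes {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                 # slurp the whole file
    my $bytes = <$fh>;
    close $fh;
    return $bytes;
}

# Hypothetical helper: write raw octets to a command such as sendmail.
# The command and its arguments are supplied by the caller.
sub pipe_message_bytes {
    my ($bytes, @command) = @_;
    open my $pipe, '|-', @command or die "Cannot run $command[0]: $!";
    binmode $pipe, ':raw';    # no decoding or encoding on the way out
    print {$pipe} $bytes;
    close $pipe or die "Command failed with status $?";
    return 1;
}
```

Because neither handle carries an encoding layer, the octets that reach the
pipe are exactly the octets that were read from disk.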
* To extract text data (body content, the value of an unstructured
field, a parameter value of a structured field), an appropriate method
should be used to decode it to text data. Additionally, note that:
- We should consider the case where conversion between a legacy
character set and Unicode fails.
- Round-trip conversion is not guaranteed in general, so the decoded
text may no longer be identical to the source byte string.
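
As a rough sketch of both caveats, using the core Encode module: the helper
name `decode_or_undef` and the sample octets are assumptions made for this
example only.

```perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);

# Hypothetical helper: decode octets in the named charset.
# Returns text data, or undef if the charset is unknown or the
# octets are not valid in that charset.
sub decode_or_undef {
    my ($charset, $octets) = @_;
    return undef unless find_encoding($charset);
    my $copy = $octets;    # decode() with a CHECK value may modify its input
    my $text = eval { decode($charset, $copy, Encode::FB_CROAK) };
    return $text;          # undef if decode() died
}

my $latin1 = "\xE9t\xE9";    # "ete" with acute accents, in ISO-8859-1
my $text   = decode_or_undef('ISO-8859-1', $latin1);    # succeeds
my $failed = decode_or_undef('UTF-8',      $latin1);    # undef: not valid UTF-8

# Lenient decoding substitutes U+FFFD, so re-encoding does not give
# back the source octets.
my $lossy = encode('UTF-8', decode('UTF-8', $latin1));
print "round trip differs\n" if $lossy ne $latin1;
```

Strict decoding makes the failure visible to the caller, while lenient
decoding silently produces text that cannot be converted back to the
original octets.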
* To modify or construct a message, we should generate a byte string
instead of text data.
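
One possible way to do this, sketched with the core Encode and MIME::Base64
modules: the function name `build_message_bytes` and the choice of UTF-8 with
base64 are assumptions for illustration, not a description of how Sympa
builds messages.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: turn text data into the octets of a small message.
sub build_message_bytes {
    my ($subject, $body) = @_;    # both arguments are text data
    my $subject_bytes = encode('MIME-Header', $subject);
    my $body_bytes    = encode_base64(encode('UTF-8', $body), "\r\n");
    return join "\r\n",
        "Subject: $subject_bytes",
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=UTF-8',
        'Content-Transfer-Encoding: base64',
        '',
        $body_bytes;
}

my $message = build_message_bytes(
    "\x{65E5}\x{672C}\x{8A9E}",           # non-ASCII subject
    "caf\x{E9}\n",                        # non-ASCII body
);
```

Every piece of text is encoded explicitly before it is joined into the
message, so the result can be written through a :bytes handle unchanged.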
The second most important point is that **text data is not unique**.
Text data should be normalized and (if necessary) case-folded as soon
as we obtain it.
- Unicode defines several normalization forms (NFC, NFD, NFKC, NFKD),
so we should normalize text data to a single form.
- Round-trip conversion is not guaranteed through case folding
(lowercase and uppercase conversions).
* So we face a similar problem again: the text data we end up with is
no longer identical to the source text data.
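
A small sketch of the idea, using the core Unicode::Normalize module and the
`fc` builtin: the helper name `canonical_key` and the choice of NFC are
assumptions for this example.

```perl
use strict;
use warnings;
use feature qw(fc unicode_strings);    # fc needs Perl 5.16 or later
use Unicode::Normalize qw(NFC);

# Hypothetical helper: reduce text data to one comparable form.
sub canonical_key {
    my ($text) = @_;
    return fc(NFC($text));
}

my $composed   = "\x{E9}";      # U+00E9
my $decomposed = "e\x{301}";    # U+0065 followed by U+0301
print "different strings\n" if $composed ne $decomposed;
print "same key\n" if canonical_key($composed) eq canonical_key($decomposed);

# Case folding is lossy: both spellings fold to "strasse", and the
# original spelling cannot be recovered from the key.
print "same key\n" if canonical_key("Stra\x{DF}e") eq canonical_key("STRASSE");
```

The key is suitable for comparison and lookup only; the source text has to
be kept separately if it is needed later.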
N.B.: We will probably also need higher-level normalization (such as
domain name normalization), but I omit that for now.
Regards,
-- Soji
On 2018/02/27 at 18:12, Marc Chantreux <address@concealed> wrote:
> hello people,
>
> i really think Sympatic should use
>
> use utf8::all;
>
> or at least
>
> use utf8;
> use open qw< :encoding(UTF-8) :std >;
>
> what is your opinion about it ?
>
> regards,
> marc
>