Subject: Developers of Sympa
List archive
- From: "Stefan Hornburg (Racke)" <address@concealed>
- To: address@concealed
- Subject: Re: [sympa-developpers] Sympatic unicode ?
- Date: Thu, 8 Mar 2018 15:54:25 +0100
On 03/08/2018 10:35 AM, IKEDA Soji wrote:
> On Fri, 2 Mar 2018 18:56:24 +0900
> IKEDA Soji <address@concealed> wrote:
>
>> A second important point is that **text data is not unique**.
>>
>> Text data should be normalized and (if necessary) case-folded
>> as soon as we receive it.
>>
>> - Unicode allows at least two normalization forms. So we
>> should normalize text data.
>
> Example: when we run attached testutf.pl,
>
> On xfs, ext4, NFS4, CIFS etc.:
>
> $ perl testutf.pl
> => B\x{00e2}le
> <= B\x{00e2}le
> => \x{0130}stanbul
> <= \x{0130}stanbul
> => Ph\x{00fa} Qu\x{1ed1}c
> <= Ph\x{00fa} Qu\x{1ed1}c
>
> On HFS+:
>
> $ perl testutf.pl
> => B\x{00e2}le
> <= Ba\x{0302}le
> => \x{0130}stanbul
> <= I\x{0307}stanbul
> => Ph\x{00fa} Qu\x{1ed1}c
> <= Phu\x{0301} Quo\x{0302}\x{0301}c
>
> HFS+ (macOS) allows UTF-8 pathnames, but stores them in a kind of
> decomposed normalization form. Thus, even if the filesystem supports
> Unicode, a comparison between a pathname in memory and the one on the
> filesystem may not always succeed.
>
>
> There are probably similar cases with databases.
There is certainly a difference with databases, because with databases you can
specify the encoding you want (e.g. xxx_enable_utf8 flags in DBI/DBD).
And I'm not sure whether your script does the right thing.
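As a minimal sketch of the normalization mismatch in the quoted output (this is not the attached testutf.pl, whose content is not shown here), the following compares a composed string with its decomposed form using the core Unicode::Normalize module. The helper name same_name is hypothetical, and plain NFD is only an approximation of what HFS+ stores.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper: treat two names as equal if their NFC forms match.
sub same_name {
    my ($left, $right) = @_;
    return NFC($left) eq NFC($right);
}

# Composed form, as in the "=>" lines of the quoted output.
my $composed = "B\x{00e2}le";

# Decomposed form, "Ba\x{0302}le", as in the "<=" lines on HFS+.
# Assumption: NFD stands in for the filesystem's own decomposition.
my $decomposed = NFD($composed);

# A plain string comparison sees two different code point sequences.
print "raw equal: ", ($composed eq $decomposed ? "yes" : "no"), "\n";

# Normalizing both sides first makes the comparison succeed.
print "normalized equal: ",
    (same_name($composed, $decomposed) ? "yes" : "no"), "\n";
```

This prints "raw equal: no" followed by "normalized equal: yes". Whether NFC or NFD is the better canonical form for a given application is a separate design choice. The sketch only shows that both sides of a comparison need the same form.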
Regards
Racke
>
>
> Regards,
> -- Soji
>
>
>> Regards,
>> -- Soji
>>
>>
>> On 2018/02/27 18:12, Marc Chantreux <address@concealed> wrote:
>>
>>> hello people,
>>>
>>> i really think Sympatic should use
>>>
>>> use utf8::all;
>>>
>>> or at least
>>>
>>> use utf8;
>>> use open qw< :encoding(UTF-8) :std >;
>>>
>>> what is your opinion about it ?
>>>
>>> regards,
>>> marc
>>>
>
>
--
Ecommerce and Linux consulting + Perl and web application programming.
Debian and Sympa administration. Provisioning with Ansible.