devel - Re: [sympa-developpers] Use of UTF-8 --- Re: sympa [9919] branches/sympa-cleanup/src/sbin: [dev] use utf8 characters directly

Subject: Developers of Sympa

  • From: Guillaume Rousse <address@concealed>
  • To: address@concealed
  • Subject: Re: [sympa-developpers] Use of UTF-8 --- Re: sympa [9919] branches/sympa-cleanup/src/sbin: [dev] use utf8 characters directly
  • Date: Mon, 25 Nov 2013 13:06:11 +0100

On 22/11/2013 10:44, IKEDA Soji wrote:
Hi Guillaume and all,

You replaced some characters in the sources with raw UTF-8 strings.
Some of your changes won't produce the expected results.

(1) Earlier releases (5.8.x) of pod2man do not recognize the "=encoding"
POD directive. In addition, some of them refuse to generate
manpages from PODs that include non-ASCII bytes.

So POD E<...> markup should be used instead of raw UTF-8 sequences.
OK.
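
A minimal sketch of point (1), assuming a POD section that names an author. The "=head1 AUTHORS" heading and the small script around it are hypothetical, not taken from the Sympa sources. The file stays pure ASCII, so it does not depend on "=encoding", and E<252> stands for the character with decimal code point 252 (the Latin-1 byte 0xFC):

```perl
use strict;
use warnings;

=pod

=head1 AUTHORS

O. SalaE<252>n

=cut

# E<252> uses the decimal code point; it names the same character as
# the Latin-1 byte 0xFC.
my $codepoint = 252;
my $hex = sprintf '0x%02X', $codepoint;
print "E<$codepoint> is code point $hex\n";
```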

(2) Most components of Sympa do not have the "use utf8" pragma, so
UTF-8 strings in the sources will be handled as sequences of bytes
(one of the few exceptions is Marc::Search).

So raw UTF-8 in the sources (especially in regexps) might not work
as expected.
Then this pragma should probably be enforced everywhere, or nowhere at all.
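
A minimal sketch of point (2). The sample input string is hypothetical, and it is only an assumption that the data being matched is Latin-1 encoded. The source uses "\x" escapes throughout so that it behaves the same with or without the pragma, and decode() stands in for what "use utf8" would do to a raw UTF-8 literal:

```perl
use strict;
use warnings;
use Encode qw(decode);

# U+00E9 as it is stored in a UTF-8 encoded source file: two bytes.
my $raw_bytes = "\xC3\xA9";

# Without "use utf8", Perl treats such a literal as those two bytes.
my $len_without_pragma = length $raw_bytes;

# With "use utf8", Perl decodes the literal into one character;
# decode() emulates that step here.
my $len_with_pragma = length decode('UTF-8', $raw_bytes);

# Hypothetical Latin-1 input, where the same letter is the single byte 0xE9.
my $latin1_input = "R\xE9f\xE9rences";

# A regexp written with the "\xE9" escape matches the Latin-1 byte...
my $matches_escape = ($latin1_input =~ /R\xE9f/) ? 1 : 0;

# ...while the two-byte UTF-8 sequence in the same regexp does not.
my $matches_raw_utf8 = ($latin1_input =~ /R\xC3\xA9f/) ? 1 : 0;

print "length without pragma: $len_without_pragma\n";
print "length with pragma:    $len_with_pragma\n";
print "escape matches:        $matches_escape\n";
print "raw UTF-8 matches:     $matches_raw_utf8\n";
```

Under these assumptions the raw sequence silently stops matching Latin-1 data, which is the failure mode described for the regexps.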


I will make suggestions.

[8053] branches/sympa-cleanup/src: [dev] conversion to utf8

<https://address@concealed>

- arc2webarc.pl.in:
Bytes 0xE9 and 0xFB in regexps were replaced with the multibyte
sequences 0xC3 0xA9 and 0xC3 0xBB. These should rather be the string
escapes "\xE9" and "\xFB". ...(2)

- Sympa/Message.pm:
- sympa_wizard.pl.in:
A byte 0xFC was replaced with raw UTF-8. This should rather be
POD markup "E<252>". ...(1)

[9919] branches/sympa-cleanup/src/sbin: [dev] use utf8 characters directly
<Attached below>

- All changes:
POD markup E<...> should be used. ...(1)
You'd better fix the code directly instead of suggesting those changes. First, you have better knowledge of those encoding issues than anyone else among us. Second, I don't have any time available to work on Sympa currently, and you don't want to be a bottleneck here.

As a side note, though, most of these encoding issues were caused by the presence of an authors list, including O. Salaün's name, in some files but not in others, without much apparent logic. Hence my recurrent point that file-based authorship doesn't make much sense, and that authorship should be centralized in one single place for the whole project.
--
Guillaume Rousse
INRIA, Direction des systèmes d'information
Domaine de Voluceau
Rocquencourt - BP 105
78153 Le Chesnay
Tel: 01 39 63 58 31

Attachment: smime.p7s
Description: S/MIME cryptographic signature



