
devel - Re: [sympa-developpers] Sympatic unicode ?

Subject: Developers of Sympa

  • From: Marc Chantreux <address@concealed>
  • To: "Stefan Hornburg (Racke)" <address@concealed>
  • Cc: address@concealed
  • Subject: Re: [sympa-developpers] Sympatic unicode ?
  • Date: Thu, 8 Mar 2018 12:47:32 +0100

hello,

> I'm for using UTF-8 all across the board if possible, we just need to
> understand the consequences.

sure.

as those lines seem very familiar to me now, i don't know if i have
to explain them in detail (anyone, please don't hesitate to ask questions).

use utf8;
use open qw< :encoding(UTF-8) :std >;
use feature qw< unicode_strings >;
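
for anyone less used to these three lines, here is a minimal sketch of what each of them changes. the sub names (literal_length, upper_e_acute, roundtrip) are made up for the illustration and nothing here comes from the sympa code base.

```perl
use strict;
use warnings;
use utf8;                              # source literals are decoded as UTF-8
use open qw< :encoding(UTF-8) :std >;  # default I/O layers + STDIN/STDOUT/STDERR
use feature qw< unicode_strings >;     # unicode rules for all string operations
use File::Temp qw< tempfile >;

# use utf8: the literal below counts as 4 characters, not 5 bytes
# (hypothetical helper, for illustration only)
sub literal_length { return length "café" }

# unicode_strings: uc() applies unicode case mapping even to a string
# that perl stores internally as plain 8-bit data
sub upper_e_acute {
    my $e_acute = "\xe9";
    return uc $e_acute;
}

# use open: files opened in this lexical scope are encoded on write and
# decoded on read without any explicit layer in the open() call
sub roundtrip {
    my ($text) = @_;
    my ( undef, $path ) = tempfile( UNLINK => 1 );

    open my $out, '>', $path or die "cannot write $path: $!";
    print {$out} $text;
    close $out or die "cannot close $path: $!";

    open my $in, '<', $path or die "cannot read $path: $!";
    my $back = do { local $/; <$in> };
    close $in;

    # characters read back, and size of the file on disk in bytes
    return ( $back, -s $path );
}
```

the point of roundtrip() is that the string keeps its character count in memory while the file on disk holds the longer UTF-8 byte sequence.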

utf8::all is more questionable:

* should we load charnames ? probably not.
* should we use fc and unicode_eval features by default when we expect
perl 5.14 to be the minimum requirement? probably not, as both features
need perl 5.16. i guess only fc will be missed, as it is a very good
way to avoid some weird bugs
(https://metacpan.org/pod/distribution/perl/pod/perlfunc.pod#fc)
* should @ARGV be unicode characters? i think so ... not strongly.
* should decoding errors be Encode::FB_CROAK? recent posts seem to
show that you're not a fan of making perl die as soon as possible.
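
to illustrate two of the points in the list above, here is a rough sketch of a caseless comparison with fc and of strict decoding with Encode::FB_CROAK. the sub names (same_caseless, decode_strict, decode_args) are hypothetical, and it assumes perl 5.16 or later because of fc.

```perl
use v5.16;    # fc is only available from perl 5.16 on
use strict;
use warnings;
use feature qw< fc unicode_strings >;
use Encode qw< decode >;

# caseless comparison with full case folding: fc folds the german
# sharp s to "ss", which lc does not do
sub same_caseless {
    my ( $left, $right ) = @_;
    return fc($left) eq fc($right);
}

# strict decoding: FB_CROAK makes decode() die on malformed input.
# the die is trapped here and reported as undef, so the caller
# decides how loud the failure should be
sub decode_strict {
    my ($bytes) = @_;    # a copy: decode() may modify its argument
    my $text = eval { decode( 'UTF-8', $bytes, Encode::FB_CROAK ) };
    return $text;
}

# the same idea applied to a list of raw arguments, such as a copy of @ARGV
sub decode_args {
    return map { decode_strict($_) } @_;
}
```

whether the eval should stay is exactly the open question: without it, bad input stops the program at once, and with it, each caller has to check for undef.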

> I suppose the database already use UTF-8 encoding,
> what about the config files and other files Sympa reads/writes?

maybe i'm wrong but utf8 is really the baseline nowadays, so as we are
talking about the next major release, requiring all the config and data
files to be utf8 seems to be an acceptable requirement.

plus: when we talked about the next version of sympa, we had this
discussion about the database as reference (as config files should
become a convenient interface to it). it would be hard then to
maintain things if we use multiple encodings.

the macos filename problem spotted by Soji is probably real, but how many
people will be impacted by it? the only case i see (but i'm not a sympa
expert) is a shared document on a macos server. we can work around it
by providing filename sanitization strategies (for example:
Text::Unidecode could be one).
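
a rough sketch of what pluggable sanitization strategies could look like. the names (%strategy, sanitize_filename, the "nfc" and "ascii" keys) are invented for the example. the "nfc" strategy assumes the macos problem is about filenames being stored in decomposed form, which is my reading and not something confirmed in this thread. the "ascii" strategy needs Text::Unidecode from CPAN and only loads it when it is asked for.

```perl
use strict;
use warnings;
use Unicode::Normalize qw< NFC >;    # core module

# hypothetical strategy table: each entry maps a filename (as characters)
# to a sanitized filename
my %strategy = (

    # recompose base letter + combining mark into one precomposed character
    nfc => sub { return NFC( $_[0] ) },

    # transliterate to ASCII, then replace anything unusual with "_"
    ascii => sub {
        require Text::Unidecode;     # CPAN module, loaded on demand
        my $name = Text::Unidecode::unidecode( $_[0] );
        $name =~ s/[^A-Za-z0-9._-]+/_/g;
        return $name;
    },
);

sub sanitize_filename {
    my ( $name, $how ) = @_;
    $how //= 'nfc';
    my $code = $strategy{$how} or die "unknown strategy: $how";
    return $code->($name);
}
```

a call such as sanitize_filename( $name, 'ascii' ) loses information, so it would fit a display or export name better than a storage key.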

regards
marc


