devel - Re: [sympa-developpers] Sympatic unicode ?

Subject: Developers of Sympa

  • From: Marc Chantreux <address@concealed>
  • To: address@concealed
  • Subject: Re: [sympa-developpers] Sympatic unicode ?
  • Date: Thu, 8 Mar 2018 16:11:54 +0100

hello,

before getting into this answer: please consider that i don't see myself
as a unicode guru, and my opinions, not so strong, are made of
ready-to-use recipes that just worked well for me in the past.

On Thu, Mar 08, 2018 at 09:52:48PM +0900, Soji Ikeda wrote:
> :encoding and :utf8 behave differently each other. See the manual.
> It depends which is better (or neither suit).

as i tried with autodie, the idea here is to have a default behavior
that causes fewer surprises to someone who doesn't know perl. the
default behavior is probably not the expected one, but is there an
expected one at all?

use utf8::all;

is like

use open qw< IO :utf8_strict >;

for an introduction to perl unicode internals, this talk from
ricardo is very good:

https://www.youtube.com/watch?v=TmTeXcEixEg

where he explains why not to use :utf8 or :bytes as default.

my experience is that utf8::all saved me time, since i could forget
about encoding most of the time.
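
To make concrete what "forgetting about encoding" replaces, here is a minimal sketch using core modules only. It sets the I/O layer by hand on each filehandle, which is the bookkeeping a default layer is meant to spare. The sample string and the temporary file are illustrative assumptions, not something taken from Sympa.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# "cafe" with an e-acute: 4 characters, 5 bytes once encoded as UTF-8.
my $text = "caf\x{e9}";

# Write the string through an explicit encoding layer.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh, ':encoding(UTF-8)';
print {$fh} $text;
close $fh or die "close: $!";

# Read it back without a decoding layer: we get raw bytes.
open my $raw, '<:raw', $path or die "open: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

# Read it back with an explicit layer: we get characters.
open my $in, '<:encoding(UTF-8)', $path or die "open: $!";
my $chars = do { local $/; <$in> };
close $in;

printf "raw length: %d, decoded length: %d\n", length($bytes), length($chars);
```

Every open() that omits the layer silently hands back bytes instead of characters, which is the kind of surprise a sane default is supposed to remove.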

> > * should we load charnames ? probably not.
> I don’t mind. Why do you think so?

for me, closing gh#24 is still something i expect at some point, as Aleks
was just reporting something i had already heard a lot about sympa ... so
we should probably be careful when we load a table of symbols.
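
For context, a small sketch of what loading charnames provides, which is what the "table of symbols" refers to: lookups between code points and their Unicode names. The code point used here is only an illustrative choice.

```perl
use strict;
use warnings;
use charnames ':full';

# Name to character, resolved at compile time via \N{...}.
my $e_acute = "\N{LATIN SMALL LETTER E WITH ACUTE}";

# Code point to name, and name to code point, resolved at run time.
my $name = charnames::viacode(0xE9);
my $code = charnames::vianame('LATIN SMALL LETTER E WITH ACUTE');

printf "U+%04X is %s\n", $code, $name;
```

These lookups need the Unicode name table, which is where the memory cost under discussion comes from.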

on the other hand, i don't think we should push the chase for memory
bytes too hard. this is a trade-off and i don't really know what to think.

i was curious so i wrote this script:

export m=Memory::Usage

dump () {
    perl $with -M$m -MEnv=m -we'$_=$m->new; $_->record(""); $_->dump'
}

with=() dump
with=( -Mcharnames=:full,:short ) dump

on an old old stable (perl 5.14) and on perl 5.24, i got

perl  w    time     vsz (  diff)     rss (  diff)  shared (  diff)    code (  diff)    data (  diff)
5.14  w/o     0   17544 ( 17544)    3788 (  3788)    2992 (  2992)       8 (     8)    1360 (  1360)
5.14  w       0   18076 ( 18076)    4284 (  4284)    2960 (  2960)       8 (     8)    1892 (  1892)
5.24  w/o     0   18560 ( 18560)    5340 (  5340)    4084 (  4084)    1944 (  1944)    1736 (  1736)
5.24  w       0   23356 ( 23356)    7796 (  7796)    4208 (  4208)    1944 (  1944)    4004 (  4004)

(w = with charnames loaded, w/o = without)

> fc() has already been used in Sympa 6.2. Please read source code.
> I don’t understand what is “weird bug”.

sorry for the "weird" ... i meant something unexpected the first time you
run into it (at least i was surprised).

> > * should @ARGV be unicode characters: i think so ... not strongly.

> I don’t know. Why do you think so?

i really did say "not strongly", so the next lines are just a thought.

i really consider myself old and i have the chance to work with
junior developers (almost half my age). for all of them, encoding
problems are "problems from the past".

> > * should decoding errors be Encode::FB_CROAK ? recent posts seems to
> > show that you're not fan of making perl die as soon as possible.

> FB_CROAK behaves such. I don’t stop using Encode only this reason. Please
> read my posts.

sorry, i wasn't clear... "you" didn't mean "Soji" but "members of the
community". how about the others?
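
To make the choice concrete, a minimal sketch of the two behaviors being discussed, using only the core Encode module. The malformed input is an illustrative assumption: a Latin-1 byte placed where UTF-8 is expected.

```perl
use strict;
use warnings;
use Encode qw(decode);

# \xE9 followed by "!" is not a valid UTF-8 sequence.
my $bad = "caf\xE9!";

# Default CHECK: the malformed byte is replaced by U+FFFD and decoding goes on.
my $lenient = decode('UTF-8', $bad);

# FB_CROAK: decode() dies on the malformed byte.
# LEAVE_SRC keeps decode() from modifying $bad in place.
my $strict = eval { decode('UTF-8', $bad, Encode::FB_CROAK | Encode::LEAVE_SRC) };
my $error  = $@;

print "lenient result contains U+FFFD\n" if $lenient =~ /\x{FFFD}/;
print "strict decode died: $error" unless defined $strict;
```

The lenient form keeps running with damaged data; the strict form stops at the first bad byte and leaves the caller to decide what to do.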

regards
marc





