devel - Re: [sympa-developpers] Sympatic unicode ?

Subject: Developers of Sympa

  • From: "Stefan Hornburg (Racke)" <address@concealed>
  • To: Soji Ikeda <address@concealed>
  • Cc: address@concealed
  • Subject: Re: [sympa-developpers] Sympatic unicode ?
  • Date: Thu, 8 Mar 2018 16:31:38 +0100

On 03/08/2018 04:24 PM, Soji Ikeda wrote:
> racke,
>
> On 2018/03/08 23:58, Stefan Hornburg (Racke) <address@concealed> wrote:
>
>>> On 03/08/2018 12:52 PM, Marc Chantreux wrote:
>>>> On Fri, Mar 02, 2018 at 05:55:22PM +0900, Soji Ikeda wrote:
>>>> They should not be read or written through the :utf8 layer, but through
>>>> the :bytes layer. E.g. the following operations should use the :bytes layer:
>>>> - Opening messages on disk.
>>>> - Opening pipe to sendmail.
>>
>> We should rather use CPAN modules than opening a pipe to sendmail ...
>
> I don’t mind if it is performed by a wrapping module. I described the cases
> where the :utf8 layer should not be used (see also the comment below).
>
>>>
>>> what's the point of using :bytes everywhere just because mails should be
>>> serialized this way ?
>>>
>>> those special cases (even if they happen frequently) should be wrapped into
>>> functions that ensure correctness.
>>>
>>> regards
>>> marc
>>
>> Yes, I would agree with Marc.
>>
>> We are doing the following inside our Dancer apps:
>>
>>
>> # the dumper shows \x{20ac}, so html and text are decoded.
>> email {
>>     %args,
>>     body   => encode( 'UTF-8', $text ),
>>     type   => 'text',
>>     attach => {
>>         Charset  => 'utf-8',
>>         Data     => encode( 'UTF-8', $html ),
>>         Encoding => "quoted-printable",
>>         Type     => "text/html"
>>     },
>>     multipart => 'alternative',
>> };
>>
>> Here "email" is basically a wrapper around Email::Sender
>> (https://metacpan.org/pod/Dancer2::Plugin::Email#DESCRIPTION).
>
> In that case the email is crafted by the program itself: the internal
> encoding may be Unicode and the resulting message may be freely encoded to
> UTF-8 (or another charset).
>
> However, we have to process incoming messages possibly encoded with a legacy
> charset and transfer encoding. Because we should keep the content unchanged
> octet-by-octet (or we might break the integrity of signatures etc.), it must
> not be decoded to Unicode. After all, we have to treat a message as a byte
> string, not as text data.

In the normal case a raw email is just ASCII, isn't that correct? Which
exceptions do you know?

>
> Does my description miss the point?

I think your sentence above clarified your earlier descriptions.

Regards
Racke

>
> Regards,
> — Soji
>
>> Regards
>> Racke
>>
>> --
>> Ecommerce and Linux consulting + Perl and web application programming.
>> Debian and Sympa administration. Provisioning with Ansible.
>
>


--
Ecommerce and Linux consulting + Perl and web application programming.
Debian and Sympa administration. Provisioning with Ansible.
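
The byte-string handling discussed in this thread can be sketched in Perl as
follows. This is only a minimal illustration, not Sympa's actual code: the
helper names read_message_octets and write_message_octets and the sample
message are made up for the example. The :raw layer is used here as the usual
way to move untranslated octets in and out; treating it as interchangeable
with the :bytes layer named in the thread is an assumption of this sketch. The
sample also shows one case where a raw message is not plain ASCII: a body in a
legacy charset sent with Content-Transfer-Encoding: 8bit.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use File::Temp qw(tempfile);

# Hypothetical wrapper: read a whole message from disk as octets.
sub read_message_octets {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                      # slurp mode
    my $octets = <$fh>;
    close $fh;
    return $octets;
}

# Hypothetical wrapper: write octets to an already opened handle unchanged.
sub write_message_octets {
    my ($fh, $octets) = @_;
    binmode $fh, ':raw';
    print {$fh} $octets or die "Cannot write: $!";
    return;
}

# Made-up incoming message: ISO-8859-1 body sent as 8bit, so the octet
# 0xE9 (e with acute accent in Latin-1) appears raw in the body.
my $original = join "\r\n",
    'Subject: test',
    'Content-Type: text/plain; charset=ISO-8859-1',
    'Content-Transfer-Encoding: 8bit',
    '',
    "caf\xE9",
    '';

# Spool the message to a temporary file and read it back as octets.
my ($out, $path) = tempfile(UNLINK => 1);
write_message_octets($out, $original);
close $out or die "Cannot close $path: $!";
my $bytes = read_message_octets($path);

# For contrast: decode to characters, then re-encode as UTF-8.
my $reencoded = encode('UTF-8', decode('ISO-8859-1', $bytes));

print "raw read identical:   ", ($bytes     eq $original ? "yes" : "no"), "\n";
print "re-encoded identical: ", ($reencoded eq $original ? "yes" : "no"), "\n";
```

The octet-level copy compares equal to the original, while the decoded and
re-encoded copy does not, which is the kind of change that would invalidate a
signature computed over the original body.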


