devel - Re: [sympa-developpers] [sympa-commits] sympa[11063] trunk/src/lib/Sympa/Message.pm: [dev] pass required values as parameters, instead of accessing configuration directly

Subject: Developers of Sympa

List archive

Re: [sympa-developpers] [sympa-commits] sympa[11063] trunk/src/lib/Sympa/Message.pm: [dev] pass required values as parameters, instead of accessing configuration directly

From: Guillaume Rousse <address@concealed>
To: address@concealed
Subject: Re: [sympa-developpers] [sympa-commits] sympa[11063] trunk/src/lib/Sympa/Message.pm: [dev] pass required values as parameters, instead of accessing configuration directly
Date: Thu, 03 Jul 2014 14:53:41 +0200

Le 25/06/2014 12:13, David Verdin a écrit :

Hi guys,

I agree with Guillaume regarding the test issue. We need to be able to
test functions separately and easily.

However, I'd like the Message object to be able to perform some
operations standalone. For exemple, decrypting a message. When we create
a message object from sympa.pl, we just have a file that we pass to the
Message::new method.
The Message object will be able to analyse the message headers, see that
it is a crypted message and decrypt is if necessary. For that it needs
to access configfuration parameters and we can't anticipate what
parameters will be needed.

Hence my point: pass relevant parameters when performing actions. Not a bunch of parameters 'in case they would be useful somewhat later'.

Additionnaly, these parameters will change from a virtual host to
another (the DKIM private key for example). The problem is: only when a
Message has been analyzed (using the Message object) can we know what
virtual host is to be used.

I don't want the main Sympa binaries to analyze headers. We put all this
stuff in modules because they are complex operations that require
specialized modules.

There is a difference between 'analysing headers' and 'deciding which headers has to be analysed'. My point is precisely to make this distinction, and moving this control from the technical component (the object message) to its controller (not necessarily the main binary).

So maybe we could improve message handling by first calling
Message::new, then perform a set of tests in the main daemons :
Message::is_crypted, Message::is_dkim_signed() and so on. But that means
performing all these operations anytime we handle a new message,
anywhere in the code. That looks like redundant operations.

No, only where those operations make sense. For instance, you don't care about checking signature or decrypting content when dealing with bounce message.

So maybe we should regroup them in a dedicated sub, but how to factorize
such sub if we don't want modules to access configuration?

Another solution would be to do minimal parsing in Message::new(), just
to know what list and what virtual host the message is related to, and
then give them all the potentially useful parameters in one call
(Message::set_contextual_parameters(%params), for example). Taht would
mean give a lot of potentially useless parameters but, at least, we
would never have to bother passing these parameters to a pre-existing
Message object.

There is not much value with this proposal over passing the relevant parameters directly at instanciation time, ie:
$message = Sympa::Message->new(file => 'foo');
$message->set_contextual_parameters(param1 => 'value1', param2 => 'value2');
$message->do_something();

can as well be done as:
$message = Sympa::Message->new(file => 'foo', param1 => 'value1', param2 => 'value2');
$message->do_something();

With the advantage you don't have an intermediate step (an object instanciated, but not initialized yet).

They are two issues mixed here:
- removing implicit calls to context-dependant actions from message constructor
- making those actions self-contained, to avoid direct configuration access

My first set of changes has been to remove context-dependent operations (checking signature, decrypting content, checking spam status, etc...) outside of Sympa::Message constructor, ie switching from this model:
my $message = Sympa::Message->new(file => 'foobar');

To this model:
my $message = Sympa::Message->new(file => 'foobar');
$message->decrypt() if $message->is_encrypted();
$message->check_signature() if $message->is_signed();
# no need to check for spam status here, we don't care

The result is more efficient (only relevant actions are performed), more explicit (no automatic actions performed under the hood), and contribute to centralize flow control.

My second set of changes has been to remove direct access to application configuration from Sympa::Message methods, and pass the relevant information as method parameters, ie switching from this model:
$message->decrypt();

to this model:
$message->decrypt(
openssl => /path/to/openssl,
cert_directory => /path/to/directory
);

The result is the removal of dependency on the current configuration implementation in Sympa::Message, and a more specified interface, meaning less adherences between various parts of the software. Exactly the same way as getting rid of global variables in a programm. And that's also a way to keep control separated from model implementation (part of what is usually refered as Model/View/Controller design).

I agree the result may be less convenient, but definitively not technically inferior. So, what can we do to reduce this conveniency impact ?

I'm not convinced by the 'anticipated parameters transmission' strategy, aka 'pass everything that may be eventually needed later as some point, so as to able to forget them thereafter'. What do we gain this way ?

First, as you said, some of these parameters may change between the time when the object is created, and the moment they are actually used, because the object is used in another context.

Second, you'll have to store the information. It may have no actual impact when just storing a string, but it does when you store an object reference: remember perl garbage collector isn't able to detect circular references.

Third, it will be difficult to manage mandatory parameters this way, because they are only mandatory for some specific actions.
For instance, it wouldn't make sense to issue a fatal error when creating a message object without openssl parameter. But it does when trying to use it for decrypting message content.

Ie, with anticipated parameters transmission:
$message = Sympa::Message->new(file => foo);
[... later ...]
$message->decrypt()
failure: "openssl parameter was not given when the message was originaly created"

With pass-only-when-needed parameters transmission:
$message = Sympa::Message->new(file => foo);
[... later ...]
$message->decrypt()
failure: "openssl parameter missing now"

Transmitting some of those parameters at instanciation time may make sense, tough, for instance the list of allowed sender headers, because they are of general usage.

my $message = Sympa::Message->new(
file => foo,
sender_headers => [ ... ]
);
[ ... later ... ]
my $envelope_sender = $message->get_envelope_sender();
my $sender = $message->get_sender();

My point here is that general-purpose message parameters (provided they don't change between context) make sense as constructor-time parameters, but specific-purpose (everything related to S/MIME, DKIM, etc...) is better suited as method-time parameter.

But they are also additional strategies available here, mostly based on better interfaces.

For instance, wrapping $message->is_encrypted() and $message->decrypt() in a single additional $message->decrypt_if_needed() method would allow significant line count reduction:
if ($message->is_encrypted(openssl => /path/to/openssl, ...)) {
$message->decrypt(openssl => '/path/to/openssl', ...);
}
vs
$message->decrypt_if_needed(openssl => '/path/to/openssl', ...);

Reducing the number of needed parameters also help:
* 'path to openssl binary parameter could use 'openssl' as default
* passing explicitely temporary directory path may be avoid completly, in favor of using standard environment variables TMP and TMPDIR (that's what does the File::Temp module by default)

Also, most message object instanciation occurs immediatly after retrieving a serialized value from a spool (13 occurences of Sympa::Message->new(%$message_in_spool) in the tree, vs 27 occurences of Sympa::Message->new()). Having a message spool directly return Sympa::Messages instances would drastically reduce this count. That's for instance what's I already done for task spools.
--
Guillaume Rousse
INRIA, Direction des systèmes d'information
Domaine de Voluceau
Rocquencourt - BP 105
78153 Le Chesnay
Tel: 01 39 63 58 31

Attachment: smime.p7s
Description: Signature cryptographique S/MIME

Re: [sympa-developpers] [sympa-commits] sympa[11063] trunk/src/lib/Sympa/Message.pm: [dev] pass required values as parameters, instead of accessing configuration directly, Guillaume Rousse, 07/03/2014

List archive

Re: [sympa-developpers] [sympa-commits] sympa[11063] trunk/src/lib/Sympa/Message.pm: [dev] pass required values as parameters, instead of accessing configuration directly