Skip to Content.
Sympa Menu

devel - Re: [sympa-developpers] Working on repository

Subject: Developers of Sympa

List archive

Chronological Thread  
  • From: Guillaume Rousse <address@concealed>
  • To: address@concealed
  • Subject: Re: [sympa-developpers] Working on repository
  • Date: Mon, 10 Feb 2014 18:05:00 +0100

Le 06/02/2014 13:16, IKEDA Soji a écrit :
Well, I'll write some examples.

Several changes at sympa-cleanup were to extract parameter values
in %Conf::Conf and to give each function/method calls them as
arguments. I feel such changes going against refactoring. Moreover,
such style may ruin readability and maintenability and may invite
errors and bugs.
One of the main issue with Sympa codebase currently is the lack of components separation, and the lack of explicit interface: every part of the code access every other parts, without discrimination. This hampers refactoring (you don't know what is suspectible to be impacted) and testing (you have to fake or simulate a lot of external code to test a single function). In software engineering slang, this is called 'heavy coupling'.

The direct usage of global variables, such as the global configuration hash, or the configure-generated constants is the perfect example for this. For instance, take the remove_pid() function in tools package: you can't just call it, and check the result, because it relies on two external piece of information:
- the pid file directory (Sympa::Constants::PIDDIR)
- the temporary file directory (Site->tmpdir)

That's still testable, using dark perl arcanes, but that's much simpler to test it once the function is self-contained, with those two informations passed as explicit arguments.

The function prototype changes from:
sub remove_pid {
my ($name, $pid, $options) = @_;
...
}

sub remove_pid {
my ($name, $pid, $piddir, $tempdir, $options) = @_;
...
}

Which is now self-contained.

Once you realize than the function doesn't actually have any usage of $piddir, $tempdir, $pid, and $name excepted for constructing the names of the files containing respectively the PID and the STDERR of the daemon, you can simplify it:

sub remove_pid {
my ($pidfile, $errfile, $options) = @_;
...
}

And once you realize than the only available option is 'multiple_process', you may as well discard the additional hashref, and flatten the arguments list:

sub remove_pid {
my ($pidfile, $errfile, $multiple_process) = @_;
...
}

Asserting this is less readable and maintainable than original version is grealy exagerated.

As an added bonus, once all your functions in the tools packages are self-contained, you don't have any dependency on Sympa::Constants and Site anymore, leveraging the circular dependency issue.

So, the first motivation of decoupling is added flexibility, added testability, more explicit interfaces.

The second motivation is a defensive programmation principle: don't give any function/method more than what it needs to know. There is no reason to let a function called 'smime_parse_cert' to access the whole configuration object, just because it needs to access 'tmpdir' and 'openssl' parameters.

By passing the configuration object directly, you are exposing yourself to configuration modification, even by accident. Perl auto-vivication, for instance: a simple 'if ($hash->{$key})' test will automagically create a new key with undefined value in the tested hash, if it doesn't exist already. By passing only the two required values, you are making yourself immune against this kind of side-effects.

Such "defactoring" possibly can be one of relay points in more
general refactoring process. If that's the case, however, I don't
understand what kind of goal he is going toward. Otherwise, if
it's his goal, I can't agree to him.

Similarly, I don't understand what he aims at by preserving argument
list (@_). I guess that in many cases it would be better to break
reference to variables on caller-side (in other words, to make
subroutine calls "call by value") so that independency of each
subroutine will be improved.
I don't get your point here.

You may be refering to my preference for:
my ($a, $b) = @_;

instead of:
my $a = shift;
my $b = shift;

The second form isn't compatible with named parameters. Morevoer, keeping @_ intact would allow a generic trace_function_call(@_) method instead of requiring ad-hoc formatting in each function, such at:
do_log('debug', "get_templates_list ($type, $robot, $list)")

Or you may be refering to the fact than systematically assigning every named parameter to a local variable doesn't offer significant advantage. Especially if the variable isn't used at all, or used only once.

For instance:
sub foo {
my (%params) = @_;
my $a = $params{a};
my $b = $params{b};
my $c = $params{c};

do_something($a, $b);
}

Is strictly equivalent to:
sub foo {
my (%params) = @_;

do_something($params{a}, $params{b});
}

Or similarly, he is trying to use named arguments as much as
possible. However, Perl doesn't support function prototyping with
named arguments (e.g. as Python). It also may invite errors (typos
in argument names will silently ignored and an undefined value will
be passed). I suppose arguments would be distinguished by their
positions in argument list as much as possible (besides, optional
args may be passed as hash, not hashref. In this point I agree
to him).
I'm trying to enforce consistant style everywhere, as already stated multiple times. So the choice is between positional or named parameters.

Named parameters is usually more readable, especially when the parameters number grows. get_template_path(), in tools package, for instance, is unreadable.

Moreover, this allow to skip uneeded parameters, instead of inserting explicit 'undef' values.

And lastly, it seems to be the original recommandation here:
http://www.sympa.org/dev/contributors/developers_howto

Sure, those changes are invasives, and may introduce bugs. As any kind of change, anyway. That's one of the reason we need a test suite. However, in order to make the code testable, you have to modify it in the first place... Arguing this is too much dangereous is like arguing keeping eyes shut down is safer because it minimize the changes.

Furthermore, though this is the issue once discussed,
removing connection pooling feature from SDM just because it is
complex will obiviously unstabilize Sympa (indeed, it will become
merely unusable). To "cleanup" unneccessary things is not same as
to remove what one does not understand.
I'd rather assert than those kind of 'feature' were introduced to workaround flaky design, but I'd prefer to keep this interesting discussion for a more suitable moment.
--
Guillaume Rousse
INRIA, Direction des systèmes d'information
Domaine de Voluceau
Rocquencourt - BP 105
78153 Le Chesnay
Tel: 01 39 63 58 31

Attachment: smime.p7s
Description: Signature cryptographique S/MIME




Archive powered by MHonArc 2.6.19+.

Top of Page