devel - [sympa-developpers] Logging subsystem and errors handling

Subject: Developers of Sympa

List archive

[sympa-developpers] Logging subsystem and errors handling

From: Guillaume Rousse <address@concealed>
To: address@concealed
Subject: [sympa-developpers] Logging subsystem and errors handling
Date: Tue, 01 Oct 2013 10:51:56 +0200

Le 19/09/2013 17:35, David Verdin a écrit :

Logs

* We will get rid of interpolation in do_log calls (replace
Sympa::Log::do_log('err', "Could not open $file"); by
Sympa::Log::do_log('err', 'Could not open %s', $file);)

Doesn't matter very much IMHO. What matters, tough, is being consistent globally.

* We will inspect logs to distinguish, amongst the different "debug"
levels, which are relevant to end users and which are not. The first
will be put in log_message calls, the former in log_trace_message
calls. There is a problem with this simplification: some functions
are called very often and, if all debug logs are actually printed,
in an instance with a big activity, the performances will severely
decrease. It can even stall the server. So this simple dichotomy
between developpers and users could be problematic.

Using either one or two different functions won't make the situation worse, that's even the opposite.

I'd suggest to use compile-time optimisations to get rid of those function calls, such as exposed here:
http://perlmaven.com/constants-and-read-only-variables-in-perl

in Sympa::Constants:

use constant DEBUG => $ENV{SYMPA::DEBUG};

In all other places of the code:

log_trace_message("called foo() with parameters bar")
if Sympa::Constants::DEBUG;

Also, I don't think we really need to be able to trace every single function, as we have other debugging tools (unit tests, perl debugger, etc...). In particular, I'd like to avoid circular dependencies between Sympa::Tools::* (where the various data structure pretty-printing functions live) and Sympa::Log namespaces (where the logging functions live). We should rather aim 100% test coverage for various Sympa::Tools modules, and kill any dependency toward other part of Sympa code from there.

* 'warn' log level is useless. Let's have only two levels: "info" and
"err". That's enough.

That's three, actually, if you want also a user-oriented "debug" level.

* Sympa::Log::Database is aimed at providing web-readable logs for
list owners. It logs only main events to make them available through
Sympa web interface.

My understanding of the code is than Sympa::Log::Syslog is a generic logging subsystem, used by every part of the code, to store traces in system log, whereas Sympa::Log::Database is a specific logging subsystem, used by only specifc parts of the code, to store performances values in the sympa database. Ie, they are not alternatives to each others, despite their namespace similarity.

On the other hand, they are some primitive attempts in the code to use stdout instead of syslog to manage events reporting, especially for command-line tools. This would suggest further splitting the current Sympa::Log::Syslog code into Sympa::Log::Syslog and Sympa::Log::Stdout, and trying to make the last one an alternative to the first. Which requires be easier with an OO interface:
Sympa::Logger
|- Sympa::Logger::Syslog
|- Sympa::Logger::Stdout

This way, calling code would just receive a Sympa::Logger object, and call $logger->log_message() on it.

So, basically, I'd try to have one namespace (Sympa::Logger ?) for the generic event logging subsystem, and another (Sympa::Tracer, Sympa::Perf, ...) for the other subsystem.

That's not a priority, tough.

* the web specific logs are useful because they provide informations
specific to the web. It is important when analyzing logs. All three
logging functions have their rational. I don't know how to simplify
this part of the code.

The log functions in wwwsympa are just wrappers adding additional details to the message sent to the logs, which seem fine for me.

However, having a special case in do_log() function for computing the stack trace, when it is called by those wrappers, is less happy.

Exception handling seems not to be resolved. the fatal_err calls were
usefull because they specified explicitely that Sympa should not keep on
running under certain circumstances. This is a clear distinction with
the return undef we used as exception handling.
I am under the impression that all return undef could be replaced by
croak calls, providing we intercept exceptions at the right place and we
use the Carp option allowing to produce stak traces.
But, we need to distinguish the very few circumstances in which sympa
should stop running, so that the daemons will stop gracefully.
Inspecting the code, it looks like all fatal_err calls were made in the
main daemons, so I think it should be easy.

I wouldn't reach into exceptions usage everywhere right now, and keep historical error handling in place for the moment, excepted for a few places:

- replacing fatal_err() calls by croak() seems OK
- self-containted part of the code (for instance, all database schema modification functions are called from probe_database() functions, where exceptions are handled) can also switch to exceptions, as they won't propagate everywhere.

However, I'd change all explicit 'return undef' by simple 'return' statement, for reasons exposed in PBP (basically, 'return undef' will be evaluated as a true value in list context, whereas 'return' doesn't).

So, here is proposal for short-term priorities:
- fix explicit 'return undef' statement
- get rid of variable interpolation in log messages
- split trace messages out of other messages
- get rid of 'warn' level
- replace fatal_err() calls by croak

And longer-term priorities:
- switch error handling to exceptions everywhere
- enforce better namespace separation for the two different logging subsystems
- allow to use stdout as an alternative log destination to syslog

--
Guillaume Rousse
INRIA, Direction des systèmes d'information
Domaine de Voluceau
Rocquencourt - BP 105
78153 Le Chesnay
Tel: 01 39 63 58 31

Attachment: smime.p7s
Description: Signature cryptographique S/MIME

[sympa-developpers] Logging subsystem and errors handling, Guillaume Rousse, 10/01/2013

List archive

[sympa-developpers] Logging subsystem and errors handling