Re: [sympa-users] bulk.pl is frequently crashing silently

Subject: The mailing list for listmasters using Sympa

  • From: micah <address@concealed>
  • To: Derek Lofstrom <address@concealed>, "Sympa-users\@listes.renater.fr" <address@concealed>
  • Subject: Re: [sympa-users] bulk.pl is frequently crashing silently
  • Date: Wed, 26 Aug 2015 11:15:54 -0400

Derek Lofstrom <address@concealed> writes:

> We are running Sympa 6.2.3-1.20150717.RHEL6 on CentOS 6.7 with all the most
> recent updates. Several times over the past month, we've been experiencing
> an issue where bulk.pl silently dies without any warning, and bulk mail
> processing stops, causing delivery delays that are often not discovered
> until several hours (or in some cases days) later. We first experienced
> this after upgrading to v 6.2.1 and attempted upgrading to 6.2.3 to see if
> the issue was resolved (which it has not been). We are a very moderate
> mailing list user organization; as of this date, we only have 12 or so
> lists that get distributed to once a day or less, so it's not like we are
> processing a huge workload. But the lists that are being used are extremely
> important to our institution.

...

> Has anyone experienced this? I am having difficulty finding information on
> the web or any user forums and do not see an existing bug logged for this
> specific issue (I found one from 2010 pertaining to issues relating to
> forking, but it was apparently resolved in 6.2).

We've experienced this in the past. It is usually the result of one
specific message in the bulk spool, typically a message with some
unusual encoding. Once we remove that message (and restart), things
stop dying. It isn't always easy to find the message causing the
problem; sometimes we just do a binary search: move half the entries in
the bulk directory to a temporary directory, start things up, and see
if it crashes. If it doesn't, we feed in another half from the
temporary directory, and repeat until we narrow it down.
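
The narrowing-down described above can be sketched roughly as follows. This is
only an illustration in Python, not anything Sympa ships: `find_bad_entry` and
the `crashes` callback are hypothetical names, and the sketch assumes exactly
one spool entry is at fault and that the crash is reproducible. In practice
`crashes` would be the manual step (put only that subset in the bulk directory,
start bulk.pl, and watch whether it dies).

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def find_bad_entry(entries: Sequence[T],
                   crashes: Callable[[List[T]], bool]) -> T:
    """Bisect `entries` down to the one that triggers a crash.

    Assumptions (hypothetical, for illustration only):
      * exactly one entry is bad;
      * crashes(subset) is True if and only if the subset contains it.
    """
    candidates = list(entries)
    if not candidates:
        raise ValueError("no spool entries to search")

    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Test only the first half; the rest stays parked elsewhere.
        if crashes(first_half):
            candidates = first_half
        else:
            # No crash, so the culprit must be in the parked half.
            candidates = candidates[mid:]

    return candidates[0]
```

With N entries in the spool this needs about log2(N) restarts instead of N,
which is why it is worth the trouble when the spool is large.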

We haven't had this happen for some time now, but we did have a
particular list that kept causing this. When things died, we would
guess that it was that list's bulk spool entry, move it out of the way,
and restart, and things would work 99% of the time.

We set up Nagios to alert us when bulk.pl dies, so we notice it;
otherwise days would go by without any processing.
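
A check along these lines could look roughly like the sketch below. It is not
our actual Nagios configuration, just a minimal Python example that follows the
standard Nagios plugin exit codes (0 = OK, 2 = CRITICAL, 3 = UNKNOWN). The
function names are made up for illustration, and matching on the substring
"bulk.pl" in the `ps` output is an assumption that may need tightening on a
given system.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check(ps_output: str, name: str = "bulk.pl"):
    """Return (exit_code, message) based on a process listing.

    `ps_output` is one command line per row; `name` is matched as a
    plain substring, which is a simplifying assumption.
    """
    running = sum(1 for line in ps_output.splitlines() if name in line)
    if running > 0:
        return OK, f"OK - {running} {name} process(es) running"
    return CRITICAL, f"CRITICAL - no {name} process running"


def main() -> int:
    try:
        # List the full command line of every process (Linux procps ps).
        result = subprocess.run(["ps", "-eo", "args"],
                                capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"UNKNOWN - could not list processes: {exc}")
        return UNKNOWN
    code, message = check(result.stdout)
    print(message)
    return code

# When installed as a plugin script, end the file with:
#     import sys; sys.exit(main())
```

Nagios reads the first line of output as the status text and the exit code as
the state, so a dead bulk.pl shows up as CRITICAL on the next check interval.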



