
RE: [sympa-users] How to increase Archiver throughput?

Subject: The mailing list for listmasters using Sympa

  • From: Steve Shipway <address@concealed>
  • To: Matt Taggart <address@concealed>, "address@concealed" <address@concealed>
  • Subject: RE: [sympa-users] How to increase Archiver throughput?
  • Date: Wed, 26 Mar 2014 21:51:13 +0000

Thanks for the reply....

Our Sympa installation runs on a VMware guest (12G memory, address@concealed)
with SAN disk, so disk IO is pretty good and there's not much we can do
about it in any case. The SAN team have extensive performance monitoring in
place and get good performance out of it; the SAN is nowhere near
redlining yet, and they are running over a petabyte of disk on countless
spindles. I've viewed the VMware IO stats and things look fine there.

* Are you using the sympa upstream version of mhonarc-ressources.tt2 or a
custom version? What version of mhonarc?

We have a slightly customised resources file -- basically, we're just adding
some urlisation options.

* Anything else on the system that might be causing a bottleneck?

I have checked all this and it seems OK. The VM farm is healthy, and nothing
comes to mind other than maxing out a single CPU thread.

* Long ago we had to switch to ext4 because ext3 wasn't able to handle
everything. We run newer backported kernels in order to have the latest
performance/fixes to all the block layer stuff (we use ext4+lvm+dmcrypt+md
raid).

* What is your web archive load like? If you have lots of people browsing
the archives all the time (or web crawlers) that could disrupt write i/o.

We have webcrawlers blocked off by robot.conf, and fewer than a dozen people
browsing the archives even at peak, so little load there. Part of the issue
is simply the volume: lists used for automatic notifications still need
archiving, so the number of messages to archive is very high. Hence I'm
wondering if I can multithread the archiving process.

It seems to max out at about 1,000 messages/sec when archiving, so if they
come in faster than this, a backlog starts. The distribution, however, has
20 threads and so can cope with this rate with no problem as many of our
high-volume lists are also low-membership.
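
The backlog behaviour described above can be sketched with simple
arithmetic. This is only an illustration: it assumes constant arrival and
archiving rates, and the function names and the example figures other than
the 1,000 messages/sec ceiling are hypothetical, not measurements from this
installation.

```python
def backlog_after(arrival_rate, service_rate, seconds, initial=0):
    """Messages still waiting after `seconds`, with both rates in messages/sec.

    Assumes constant rates; the queue can never drop below zero.
    """
    growth = (arrival_rate - service_rate) * seconds
    return max(0, initial + growth)


def drain_time(backlog, arrival_rate, service_rate):
    """Seconds needed to clear `backlog`, or None if the archiver cannot catch up."""
    spare = service_rate - arrival_rate
    if spare <= 0:
        return None
    return backlog / spare


# Hypothetical burst: 1,200 messages/sec arriving for one minute
# against an archiver that tops out at 1,000 messages/sec.
queued = backlog_after(1200, 1000, 60)
# Once arrivals fall back to a hypothetical 800 messages/sec:
catch_up = drain_time(queued, 800, 1000)
```

With a single archiving process the service rate is fixed, so the only ways
to shrink the backlog are to lower the arrival rate (as with disabling
archiving on some lists) or to raise the ceiling.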

* How do you backup the archives and when does that happen?

Overnight TSM differential. This finishes pretty fast.

* Can you determine if the load is being caused by particular lists? If so
maybe you can think of ways to isolate those particular lists, maybe putting
them on a separate drive and using symlinks.

For the time being, we've disabled archiving on two particularly high-volume
lists, which has brought things back under control. However, I'd like to be
able to increase the capacity of the archival system.

Separate drives would have no effect as they'd still be VMware vdisks on the
same underlying SAN. We can squeeze a bit more performance by adding
multiple virtual SCSI adapters but (as mentioned) it doesn't seem to be IO
that is our bottleneck.

It seems strange that the archival process is single-threaded while the
distribution is multi-threaded. This would seem to be a good Feature
Request for 6.2 :)
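
One way such a feature might divide the work, sketched purely as an
assumption: assign each list to a fixed worker by hashing its name, so that
messages for any one list are still archived in order by a single worker.
Nothing here is Sympa's actual design or API; `worker_for`, `partition`, and
the sample list names are hypothetical.

```python
import hashlib
from collections import defaultdict


def worker_for(list_name, workers):
    """Map a list name to a stable worker index in the range [0, workers)."""
    digest = hashlib.sha1(list_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % workers


def partition(messages, workers):
    """Split (list_name, message_id) pairs into per-worker queues.

    Every message for a given list lands in the same queue, in arrival
    order, so per-list archive ordering is preserved.
    """
    queues = defaultdict(list)
    for list_name, message_id in messages:
        queues[worker_for(list_name, workers)].append((list_name, message_id))
    return dict(queues)


# Hypothetical spool contents: two lists, interleaved arrivals.
spool = [("alerts", 1), ("news", 2), ("alerts", 3), ("news", 4)]
queues = partition(spool, workers=4)
```

A scheme like this only helps when the load is spread over several lists; a
single very busy list would still be limited to one worker.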

Thanks for the help,

Steve Shipway
address@concealed

