Subject: Developers of Sympa
[sympa-dev] sympa 6.0.1 and wwsympa scaling issues
- From: address@concealed
- To: address@concealed, address@concealed
- Cc: address@concealed, address@concealed
- Subject: [sympa-dev] sympa 6.0.1 and wwsympa scaling issues
- Date: Thu, 18 Feb 2010 20:13:16 -0800 (PST)
Greetings!
I have been working with the lists.riseup.net list server, which hosts
14,000+ lists and almost 3 million subscribers. We recently upgraded from
sympa 5.3.4 and have seen some old and new scaling issues with wwsympa.
I'm hoping I can work with some of the sympa developers to figure out the
best way to fix these issues. I'm also wondering if anyone working with
sympa has noticed anything similar.
Before I list the problems, I'd like to mention that generally we've
noticed a decreased use of both CPU and memory resources on the server,
which is great!
The first issue is a known issue from 5.3.4: any call to
List::get_lists('*') that does not pass an optional set of lists requires
a traversal of the entire lists data directory. This is obviously quite a
feat for a system of this size! Disabling the ability to list all lists
is one thing we have done in response to this issue. However, to list
pending and closed lists, for example, also requires a read of the lists
data directory. We have a patch for this issue that stores all of the
lists and some basic configuration information in a MySQL table.
Our modified source can be browsed in our git repository:
https://labs.riseup.net/code/repositories/browse/sympa/sympa-6.0.1-src
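
As a rough illustration of the approach described above (not the riseup patch itself), the sketch below keeps one row per list in a database table, so that a query for pending or closed lists does not have to walk the lists data directory. It is written in Python, with an in-memory SQLite database standing in for MySQL. The table name list_index, its columns, the helper names, the default status of "open", and the simplified parsing of a per-list config file for a "status" line are all assumptions made for the example.

```python
import os
import sqlite3


def open_index(path=":memory:"):
    """Open the (hypothetical) list index and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS list_index ("
        " name TEXT PRIMARY KEY,"
        " status TEXT NOT NULL)"
    )
    return conn


def read_status(config_path):
    """Return the value of the first 'status <value>' line.

    Simplified parsing; falls back to 'open' (an assumption) if no
    status line is present.
    """
    with open(config_path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            parts = line.split()
            if len(parts) == 2 and parts[0] == "status":
                return parts[1]
    return "open"


def rebuild_index(conn, lists_dir):
    """Walk the lists directory once and store one row per list."""
    rows = []
    for entry in os.scandir(lists_dir):
        config = os.path.join(entry.path, "config")
        if entry.is_dir() and os.path.isfile(config):
            rows.append((entry.name, read_status(config)))
    with conn:  # commit the delete and the inserts as one transaction
        conn.execute("DELETE FROM list_index")
        conn.executemany(
            "INSERT INTO list_index (name, status) VALUES (?, ?)", rows
        )
    return len(rows)


def lists_with_status(conn, status):
    """Answer a status query from the table, without touching the disk tree."""
    cur = conn.execute(
        "SELECT name FROM list_index WHERE status = ? ORDER BY name",
        (status,),
    )
    return [row[0] for row in cur]
```

In this sketch the expensive directory walk happens only in rebuild_index, while a request such as "show closed lists" becomes a single indexed query. A real implementation would also need to update the row whenever a list is created, closed, or reconfigured.
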
I have read some mention of someday sympa having its configuration files
stored in the database. I'm wondering if this is similar to the work we've
done in this direction? If we can be of any assistance with this, please
let us know!
The second issue that we noticed was that some queries to arcsearch_id
were resulting in wwsympa processes that were using 100% of the CPU and
running for a very long time. The few times I managed to run an strace on
these processes, it seemed like they were traversing the lists data
directory! However, the *really* odd part of this was that this was
happening for unauthorized requests to arcsearch_id. The even stranger
part is that after a user POSTed to arcsearch_id and received a timeout
error, any request they submitted after this would also time out.
Our fix for the runaway wwsympa processes was this patch:
- return undef unless (defined &check_authz('do_arcsearch_id', 'web_archive.access'));
+ unless (defined &check_authz('do_arcsearch_id', 'web_archive.access')) {
+     $param->{'action'} = 'authorization_reject';
+     $param->{'reason'} = 'web_archive_closed';
+     return 1;
+ }
The second part of this issue has me a little more perplexed. It seems
related to session data - the sessions for those users had stored a
redirect to the arcsearch_id request. Our quick fix for this was to add
arcsearch_id to the %temporary_actions hash. Is there a better way to
approach this issue?
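
One way to picture the session behaviour described above is the toy model below. It is not wwsympa's code. The session dictionary, the redirect_to key, the remember_action helper, and the contents of the temporary-actions set are hypothetical. The model only assumes, as the quick fix suggests, that actions listed as temporary are never stored as the session's redirect target.

```python
# Hypothetical default set of actions that must not be remembered.
TEMPORARY_ACTIONS = frozenset({"login", "logout"})


def remember_action(session, action, temporary_actions=TEMPORARY_ACTIONS):
    """Store `action` as the redirect target unless it is temporary."""
    if action not in temporary_actions:
        session["redirect_to"] = action
    return session


# Without the fix: the failing request becomes the stored redirect.
before = remember_action({"redirect_to": "home"}, "arcsearch_id")

# With the fix: arcsearch_id is temporary, so the old target survives.
after = remember_action(
    {"redirect_to": "home"},
    "arcsearch_id",
    temporary_actions=TEMPORARY_ACTIONS | {"arcsearch_id"},
)
```

Under this model, a request that times out can be replayed on every later visit only if it was stored in the session, which would be consistent with the repeated timeouts reported above.
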
The last thing we are seeing, after fixing the issues above, is that we
still have an occasional wwsympa process using 100% of the CPU and never
terminating. These processes mostly seem to be requests to either
archives or rss feeds. The frustrating thing about this now is that I
have had a very difficult time finding a request that consistently results
in the error. I have noticed that many of the requests to archives which
result in this error involve lists that have been around for at least a
few years, so they perhaps have a large volume of archives. Still, it is
baffling to me that the requests sometimes complete successfully but
sometimes don't. I realize that there are other relevant factors here --
like other processes that are running on the server and impacting the
server's load -- but we didn't have this problem with 5.3.4, so I'm
wondering what might have changed.
Thanks for your attention! If you want any further information in the way
of logs, please let me know.
Kristina
- [sympa-dev] sympa 6.0.1 and wwsympa scaling issues, kclair, 02/19/2010