Skip to Content.
Sympa Menu

en - Re: [sympa-users] Tracking down slow queries

Subject: The mailing list for listmasters using Sympa

List archive

Chronological Thread  
  • From: IKEDA Soji <address@concealed>
  • To: "address@concealed" <address@concealed>
  • Subject: Re: [sympa-users] Tracking down slow queries
  • Date: Wed, 22 Apr 2015 00:57:45 +0900

Steve and all,

IMO using RDBMS for spool is not good idea.

As Sympa have to support multiple RDBMSs, optimization for particular system is difficult to adopt (strictly, even LEFT JOIN cannot be used to keep portability).
Additionlly, even if it was thoroughly tuned, it still will be slower than implementation on such as NoSQL or bare file system.

In my recent experience on a site, Sympa 6.2b.9 using file system for spool achieved at least tens of posts per second without delay of delivery (In this case bulk.pl is no longer bottle-neck and new incoming_max_count parameter multiplexing daemon to process incoming messages had to be used).

I don't think RDBMS is always useless, but it seems not suitable for spool.

Regards,

--- Soji

2015/04/14 8:06、Steve Shipway (via sympa-users Mailing List) <address@concealed> のメッセージ:

Hi –

 

I have an intermittent issue on our Sympa server, where errors are thrown due to slow queries.   We have a rather large Sympa setup, but a relatively low message throughput; this is going to get very database technical and so you might want to run away in horror now if you are allergic to SQL.  However if you have a high message throughput (so a big bulkmailer table) it may be of interest.

 

I have slow query logging enabled by this in the my.cnf:

 

log_slow_queries=/var/log/mysqld.slow.log

long_query_time=2

 

A check of the slow query log indicates that the culprit is this:

 

# Time: 150414 10:30:23

# User@Host: sympa[sympa] @ localhost []

# Query_time: 20.546302  Lock_time: 0.000156 Rows_sent: 2810  Rows_examined: 2909

SET timestamp=1428964223;

SELECT messagekey_bulkspool AS messagekey FROM bulkspool_table LEFT JOIN bulkmailer_table ON messagekey_bulkspool = messagekey_bulkmailer WHERE messagekey_bulkmailer IS NULL AND lock_bulkspool = 0;

 

The query takes anything from 20 up to 40 seconds to run, presumably depending on how many items are in the bulkmailer queue.

 

It seemed to me that it might be more efficient to phrase the SQL as:

 

SELECT messagekey_bulkspool AS messagekey FROM bulkspool_table WHERE NOT EXISTS ( SELECT * from bulkmailer_table WHERE messagekey_bulkspool = messagekey_bulkmailer ) AND lock_bulkspool = 0;

 

Since the bulkmailer_table has a single key (messagekey_bulkmailer) and we’re looking for this as null in a left join, a ‘where not exists’ should be faster.  There’s no index on lock_bulkspool, but this will likely usually be 0 anyway so adding an index on lock_bulkspool seems unlikely to help.

 

However, this page (http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/ ) seems to indicate that MySQL will optimise both to the same method; though the presence of the “AND lock_bulkspool=0” may confuse the optimiser.

 

Here’s the EXPLAIN output from both forms:

 

[root@lstappprd01 log]# time mysql -u sympa --password=3bmshtr sympa -e "EXPLAIN SELECT messagekey_bulkspool AS messagekey FROM bulkspool_table LEFT JOIN bulkmailer_table ON messagekey_bulkspool = messagekey_bulkmailer WHERE messagekey_bulkmailer IS NULL AND lock_bulkspool = 0;"

+----+-------------+------------------+------+---------------+---------+---------+--------------------------------------------+------+--------------------------------------+

| id | select_type | table            | type | possible_keys | key     | key_len | ref                                        | rows | Extra                                |

+----+-------------+------------------+------+---------------+---------+---------+--------------------------------------------+------+--------------------------------------+

|  1 | SIMPLE      | bulkspool_table  | ALL  | NULL          | NULL    | NULL    | NULL                                       | 2966 | Using where                          |

|  1 | SIMPLE      | bulkmailer_table | ref  | PRIMARY       | PRIMARY | 82      | sympa.bulkspool_table.messagekey_bulkspool |    1 | Using where; Using index; Not exists |

+----+-------------+------------------+------+---------------+---------+---------+--------------------------------------------+------+--------------------------------------+

 

[root@lstappprd01 log]# time mysql -u sympa --password=3bmshtr sympa -e "EXPLAIN SELECT messagekey_bulkspool AS messagekey FROM bulkspool_table WHERE NOT EXISTS ( SELECT * from bulkmailer_table WHERE messagekey_bulkspool = messagekey_bulkmailer ) AND lock_bulkspool = 0;"

+----+--------------------+------------------+------+---------------+---------+---------+--------------------------------------------+------+--------------------------+

| id | select_type        | table            | type | possible_keys | key     | key_len | ref                                        | rows | Extra                    |

+----+--------------------+------------------+------+---------------+---------+---------+--------------------------------------------+------+--------------------------+

|  1 | PRIMARY            | bulkspool_table  | ALL  | NULL          | NULL    | NULL    | NULL                                       | 2966 | Using where              |

|  2 | DEPENDENT SUBQUERY | bulkmailer_table | ref  | PRIMARY       | PRIMARY | 82      | sympa.bulkspool_table.messagekey_bulkspool |    1 | Using where; Using index |

+----+--------------------+------------------+------+---------------+---------+---------+--------------------------------------------+------+--------------------------+

 

Hmm, looks suspiciously similar.  It seems that the optimiser in MySQL (v5.1.69 we are using) is correctly converting the ‘left join+where null’ into a ‘not exists’.  So how can I speed things up? 

 

It may be that this is the best I can do.  I managed to optimise away many of the other issues affecting the frontend by adding some indices on the lists table, and by using the Riseup patch and similar to make use of MySQL’s RLIKE instead of doing multiple regexp compares in Perl.

 

I have a query cache set up, with:

 

query_cache_type=1

query_cache_limit=1M

query_cache_size=48M

 

…but since the bulkmailer tables hold rather huge amount of data, this may not be large enough (and in any case, while the query runs every 30s in my estimation, the problem only occurs two or three times a day).

 

Does anyone have any suggestions on how this might be improved?

 

TIA

 

Steve

 

Steve Shipway

University of Auckland

UNIX Systems Design Team Lead

address@concealed

+64 (9) 3737 599 ext 86487

(GNU Terry Pratchett)

 




Archive powered by MHonArc 2.6.19+.

Top of Page