Skip to Content.
Sympa Menu

devel - Re: [sympa-developpers] SQLSource encoding

Subject: Developers of Sympa

List archive

Chronological Thread  
  • From: Guillaume Rousse <address@concealed>
  • To: address@concealed
  • Subject: Re: [sympa-developpers] SQLSource encoding
  • Date: Fri, 26 Jul 2013 11:21:53 +0200

Le 26/07/2013 00:36, IKEDA Soji a écrit :
On Thu, 25 Jul 2013 19:55:49 +0200
Guillaume Rousse <address@concealed> wrote:

Le 25/07/2013 19:25, IKEDA Soji a écrit :
On Thu, 25 Jul 2013 17:21:11 +0200
David Verdin <address@concealed> wrote:


Le 25/07/13 16:49, Guillaume Rousse a écrit :


Everything related to parameter binding in do_query and
do_prepared_query can be discarded, those methods don't exist anymore,
and binding is now delegated to DBI internals.
As we speak about it: why not keeping a single function to do all the
query business?
We always do the same succession of calls to do a query to database. I
refactored all them in do_query and even planned to cover the final mile
by fetching data in this sub and sending the hashref or arrayref we
needed actually in the code.

I wonder whether non-binding version of do_query() may be discarded
or not.

- Statement cache (either implemented in DB engine or emulated by DBI)
will be cleaned unexpectedly earlier, if queries by users for each
are prepared.
What is cached is the unbinded query handle ("SELECT * FROM table WHERE
id = ?"), resulting from prepare() call.

I removed all calls to $handle->terminate(), excepted in Sympa::Database
DESTROY method.

All usage of query handle to read returned values immediatly follows a
call to $handle->execute(), where final values binding actually occurs.

I don't see any scenario where a query handle resulting from
get_query_handle() would be cleaned (terminated ?), or used in an
erroneous state. As long as we are in a non-threaded application, of course.

I understood that get_query_handle() adds a "statement handle" to
the cache in DB engine (e.g. on Oracle). Is it right?
If it is right, I guessed that, once the amount of SQL statement
handled by DB engine (not Sympa::Database module) exceeds the size
predefined by DB engine, caching mechanism will become very
inefficient. Moreover, it might hit performance of other
applications using the DB engine.
OK, I get your point. The statement handle cache itself is in our Sympa::Database object. As this relies on precompiling SQL queries, there will be indeed some kind of resources mobilized in the server, as long as the handle is not freed. So, theorically, it could lead to excessive resource usage on the database server, indeed.

However, there isn't any mention of the issue in DBI documentation, which rather encourage generalized usage of pre-compiled queries. Re-reading carefully, I even found than DBI provide its own caching method (prepare_cached() instead of prepare()), and than it discourage usage of finish() on statement handles. All in all, it doesn't sound very much like 'be careful when accumulating prepared statement handles'...

However, I agree we can do better than caching everything just because we can.

From the calling context, we can sometimes explicitely decide of the best strategy:
1) we're sure than the query won't be used more than once in the process lifetime (ALTER queries, for instance): it should not be cached
2) we're reasonably sure than the query will be used more than once (INSERT queries in Sympa::Log::Dabatase, for instance): it should be cached
3) the query doesn't have any variable part (for instance, no placeholder): it should not be cached
etc...

With the limitation than the current API doesn't allow to use SELECT queries without caching them (that should be fixed), that is what I tried to achieve so far.

Now, we could also put intelligence in the database object, and use some kind of cache expiration strategy to avoid it growing indefinitly. For instance, limiting its size, or imposing some kind of TTL on cached handles. The DBI documentation suggest to use Tie::Cache::LRU, for instance.

However, I wouldn't reach into technical solution as long as there isn't any evidence of the problem it is supposed to cure...

- Before any expressive ORM is integrated into Sympa, do_query()
function to execute raw SQL seems to be required.
I'm not sure about what you cal 'raw SQL'. Both execute_query() and
get_query_handle() takes an SQL string as main argument.

Constructing this SQL string is still the responsability of calling
code. But whereas you previously inserted '%s' placeholders, and
transmitted quoted values to do_query() function, now you insert '?' as
placeholder, and you either pass values to bind to execute_query(),
either you perform binding through $handle->execute() directly.

# SELECT query
my $query = "SELECT * FROM table WHERE id = ?";
my $handle = $database->get_query_handle($query);
$handle->execute($id);
while (my @row = $handle->fetchrow()) {
....
}

# non-SELECT query
my $query = "DELETE FROM table WHERE id = ?";
my $rows = $database->execute_query($query, $id);

If we have to use e.g. get_limit_clause(), placeholder will not
work. Instead, we have to construct query string containing limit
clause then to execute it.
It still works, because get_limit_clause() is self-contained, and returns a string without placeholders. The calling code doesn't care:

my $query =
"SELECT * FROM table WHERE id = ? " .
get_limit_clause(rows_count => 1);
my $handle = $database->get_query_handle($query);
$handle->execute($id);
while (my @row = $handle->fetchrow()) {
....
}

get_limit_clause() (and similar functions) could eventually get rewritten to return separatly a query string with placeholder, and a list a parameters, but that's too much complex to my taste, without proven advantage.

--
Guillaume Rousse
INRIA, Direction des systèmes d'information
Domaine de Voluceau
Rocquencourt - BP 105
78153 Le Chesnay
Tel: 01 39 63 58 31

Attachment: smime.p7s
Description: Signature cryptographique S/MIME




Archive powered by MHonArc 2.6.19+.

Top of Page