MySQL Prepared StatementsPeter Zaitsev
If you care about archiving best performance in your application using MySQL you should learn about prepared statements. These do not neccesary provide performance beneft but they may, they also have other benefits.
As a quick introduction – before MySQL 4.1 there were only textual statements and textual protocol for data transfer – query was sent as text and result returned back as text. For example number 123 would be sent as string “123”. Such protocol had serious performance implication – queries had to be parsed fully each time, all return values had to be converted to the strings on server side and back on the client side, which is pretty expensive especially for certain data types. Furthermore BLOBs require escaping as not all characters could be used in textual protocol, which not only consumed time but also required extra memory consumption both on server and client.
So in MySQL 4.1 Prepared statement came. Do not mix these native Server Side Prepared Statements with emulated prepared statements which JDBC, ODBC and some other drivers had forever. Now query passed to the server contained placeholders for parameters, for example “select name from user where id=?” and executing query with different parameters did not require passing full query to the server and full parsing. (Note: Optimization phase currently remains). Not only this is faster if you execute statement serveral times but also it is more secure – it saves you from many variants of “SQL Injection” – passing malformed data so query changes it meaning.
You also can bind direct variables to return columns – which means not only data does not need to be converted back and forth but you also save extra data copying from row structure to your variables.
So how much extra performance can you get by using prepared statements ? Results can vary. In certain cases I’ve seen 5x+ performance improvements when really large amounts of data needed to be retrieved from localhost – data conversion can really take most of the time in this case. It could also reduce performance in certain cases because if you execute query only once extra round trip to the server will be required, or because query cache does not work.
I’ve done a simple benchmark (using SysBench) to see performance of simple query (single row point select) using standard statement, prepared statement and have it served from query cache. Prepared statements give 2290 queries/sec which is significantly better than 2000 with standard statements but it is still well below 4470 queries/sec when results are served from query cache.
So there are good reasons to use prepared statements:
- Save on query parsing
- Save on data conversion and copying
- Avoid SQL Injection
- Save memory on handling blobs
There are also drawbacks and chewats of using prepared statements:
- Query cache does not work
- Extra server round trip required if statement used only once
- Not all statements can be prepared. So you can’t use prepared API exclusively you’ll need to fall back to normal API for some statements
- Newer and sometimes buggy code. I had a lot of problems with PHP prepared statements. It is getting better but still it is less mature than standard API
- You can’t use placeholders in place of all identifiers. For example you can’t use them for table name. In certain version it even does not work for LIMIT boundaries
- Inconvenient list handling. Unlike in for example PEAR emulated prepard statements there is no nice way to pass list of values to IN
- Harder tracing. Logs were now fixed to include full statement text not only “Execute” but in SHOW INNODB STATUS you would still see statements without actual values – quite inonvenient for analyses.
There is no statement cache – You can allocate multiple copies of same prepared statement and they will each use separate structures on the server. It does not matter if you do it from same connection or multiple connections.
Do not forget to close prepared statements – Many memory leaks reported in MySQL Server turned out to be prepare statements or cursors which were forgotten to be closed. Watch Com_stmt_prepare and Com_stmt_close to see if you’re closing all prepared statements. In newer versions you can also use prepared_stmt_count variable to track number of open statements diretly. You can also adjust max_prepared_stmt_count variable which limits how many statements can be open at the same time to avoid overload.
Check execution rate You can take a look at Com_stmt_execute to Com_stmt_prepare ratio to see how many times each statement is executed. If it is executed only once standard statements might be better for you, or might be you just need to cache prepared statements in your application (avoid closing and recycle). Note in some cases it still can be better to use prepared statements even if you use each only once due to data conversion.
Try them if API supports fast switch The great thing I love about Java MySQL API is it can switch between prepared and non-prepared statements transparently (in most cases). So you can try one way and another and see which one works best. Some other API have the same advantage.
In general I hope there is great future ahead for MySQL Prepared statements – many of the problems such as extra round trip, non working query cache or non working prepared statements for some of the statements as well as most of other problems can be fixed. MySQL Development team just need some resources allocated to complete this feature.