November 27, 2014

Wishes for new “Pure PHP” MySQL driver

If you’re following MySQL or PHP landscape you should have seen announcement by MySQL to develop pure PHP driver. If not – Here is FAQ .

I’m to meet the team (Georg, Andrey etc) which will be developing this driver during my visit to Open Source Database Conference in November so I thought it would be good idea to gather some wish list for things nice to have in this new driver. Below is my list and I would appreciate to hear your ideas.

Build In Profiling I would like to see how many connections and queries page generated and how long they took – I will place this information in the log. For debugging I’d like to be able to get a table below page output listing all the queries, their exec times and number of rows they are returned. So far it has to be done in inherited class.

Auto Explain Would be very helpful for development to run EXPLAIN on the queries and warn if these is something obviously wrong.

Emulated Prepared Statements I would like to have easy time to switch from PS to standard statements to check bugs and performance difference without changing my application. Also there are bunch of statements which can’t be prepared – I would like library to take care of emulating these for me.

Arrays in Prepared Statements Dealing with certain types of queries, ie queries with IN is so painful with prepared statements – I would like to be able to simply able to pass data array and let library to take care about emulating.

Cached Prepared Statements It would be great if library could cache them so you do not have to worry of tracking same statements in your application and caching them.

Persistent Connections Ajax applications with frequent case of one query per request make persistent connections relevant again.

Automatic load balancing and fall back I would like to be able to pass list of servers system can connect to and let it handle load balancing and fall back by itself, similar as JDBC driver does.

Client size caching Would be very helpful for simple applications which may run query several times per page. I however would like to go further and have TTL based client cache with dynamic back end. I would expect many people would like to see memcached hooked up to it to share result sets among web servers for result sets which can be a bit stale. This might not be the most optimal way to cache things but it would be very easy for many applications.

Query Timeouts Make it easy to specify timeouts for connects and query execution so page would not timeout but you would rather be able to respond with “system overloaded try again later” message.

Exceptions This should be optional as not everyone likes them but would be very helpful in many cases.

Multiple concurrent queries I would like to be able to run multiple queries for multiple MySQL connections at once. For scale out applications data to generate the page may be on many different servers and it is latency killer to be required to query all of them sequentially instead of doing it in parallel. Now people have to do crazy things to work it around.

There are just few thing from the top of my head. There are probably some I forgot about at this point :)

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Lukas says:

    1) Shouldn’t this be implemented on the server?

    2) Yeah it would be nice to define a pre and post callback for queries, like in PEAR::MDB2

    3) PEAR::MDB2 and PDO both have a switch for emulating prepared statements.

    4) In PEAR::MDB2 I have a method called implodeArray() which generates comma separated lists that are quoted accordingly

    5) A connection pooling deamon would be better

    6) Anything to make it easier would be appreciated

    7)/8) Again I think a callback system would be best

    8) Yeah just like PDO

    9) Asynchronus queries might also be cool. This also reminds me there is still a huge wtf? factor with using SP’s via mysqli. I understand why things are the way they are, but it needs to be made easier. Maybe with a dedicated function/method?

  2. peter says:

    Thanks Lukas, I knew you would comment.

    1) You may argue where it should be implemented but it take it as fact it will not be implemented in server for years, while it is possible to implement it in driver rather easily. Also server does not have full picture – ie does not include network communication in account. Server can’t really profile if page has queries going to multiple database servers.

    2) I’m more speaking about something simple and easy to use for beginner users. If you have callback which you need to do something with it is complicated while if you simply can enable explain profiling and get nice table with your query explains and tuning advice simply by enabling explain debug option that would be nice.

    3)I know. It would however nice to have it in native API as well. As you probably know I care more about performance than about portability – for my application chance of going from MySQL to something else is probably less than rewriting them from PHP to something else :)

    4) It is easy enough to do with standard statements ( I guess everyone has little function which does the trick) but it is painful with prepared statements and it is something which could be well build in into the driver as it is very typical task.

    5) Connection pooling daemon is extra overhead. So are persistent connections of course. I think typically you can configure your web layer so you would not need so many connection to get in trouble. Also if someone would implement connection pooling daemon implemented as a proxy for MySQL protocol supporting persistent connections would automatically allow to take advantage of it :)

    6/7) I do not agree. In fact you already can do tons of thing manually. You can create a child from mysqli class and extend it with tons of custom behavior but that would be slow and non-standard. I would like to see something optimized a much as possible and something everyone will be able to easily take advantage off.

    9) What is “huge wtf” ? Yes this is advanced thing which probably needs to be separate. Take a look at curl for example – it can be used in very simple fashion but you also can do parallel query submission if you need to.

  3. Lukas says:

    I mentioned PDO and MDB2 not as a reason for not including this. More as a “here is an example API” for this. Also one thing you should remember is that this is just a libmysql replacement. So all the features you are asking for would also need to be added to mysqli in a separate step. Obviously these features would then only work when using this new native driver.

  4. Rene Leonhardt says:

    It would be great if some of those features would be implemented.

    Sad that another feature is already possible with PDO and PostgreSQL but not MySQL, I could see one or two possible use cases for this as well ;)
    “Background/batch/workflow processing with PDO::PGSQL”
    http://netevil.org/node.php?nid=925&SC=1

  5. peter says:

    Lukas,

    I hope it will be a bit more than libmysql replacement and as there is mentioning of statement caching etc in the Georg’s FAQ it looks like this would be the case. It could be separate step it does not matter I just hope it will be on agenda.

    Yes it is good to study other APIs to see which good features they have :)

  6. peter says:

    Rene,

    Thank for mentioning. This actually brings yet another thing which PostgreSQL has but MySQL does not – ie if you would like application to be informed once certain table is changed or anything it is quite ugly to do with MySQL.

  7. Mike says:

    I’m curious about your statement that “Ajax applications with frequent case of one query per request make persistent connections relevant again.” Persistent connections were irrelevant? I can see where they’d incur some serious overheard on heavily-loaded systems (all my experience is with pretty lightly-loaded applications) but I didn’t know there was any consensus in the PHP/MySQL world that persistent connections were a bad thing. I would think with appropriate limits on the maximum number of persistent connections, Apache itself would essentially act as a connection-pooling daemon. Am I totally off base here?

  8. peter says:

    Mike,

    Persistent connections were never bad thing per say but they had number of problems. Estabilishing new connection in many cases is not significantly more expensive than recycling connection (which also needs special command). Plus on Gbit ethernet connection establishment did not add too much latency anyway. On other hand large number of etablished connections has some overhead in scheduling, memory usage etc.

    If pages are complicated there is very little win for persistant connections as connection creation will constitute very small proportion. For pages running single simple query which is frequently the case with AJAX it becomes significant.

    Now about limiting number of persistent connections – Apache would never act as connection pooling daemon as connections can’t be shared between processes. Simply if there are too many persistent connection as connections are closed they would not be kept as persitent connections but closed at once.

  9. elder says:

    Would be also nice ability to perform XA transactions as in JDBC

  10. Elder,

    It would be rather straightforward to implement the interface to MySQL’s XA implementation in PHP as the interface to XA in MySQL is standard SQL (unlike many other databases, which require native code or protocol-level calls to do the same thing):

    http://dev.mysql.com/doc/refman/5.0/en/xa-statements.html

    The issue is that PHP doesn’t have a transaction monitor, that I know of, so that would be the first thing someone would have to write, or “wrap” (many opensource transaction monitors exist in the Java world).

    You can “fake” the protocol (and we do in the JDBC unit tests), but that doesn’t give you any recoverability if the application itself crashes without an actual logging transaction monitor, which then makes XA pretty much useless in real-world scenarios.

    -Mark

  11. Rene Leonhardt says:

    Peter,

    I wish you a good time at the Google HQ, for a developer like me it would be impressive ;)
    http://mysqlcamp.org/Session_PHP_mysqlnd

    I hope you will post the presentation slides here afterwards.

  12. peter says:

    Rene,

    This is kind of mistake – I’m not going to Google HQ for MySQL Camp – I did not have visa so I could not go :(
    The presentation you’re is from our discussion in Frankfut at International PHP Conference.

  13. rolf says:

    I don’t know whether this is the best venue for this question, but here it is.

    One of my biggest complaints about http+php+mysql is that when a browser stops any impending queries will run to completion on the mysqld side. Will mysqlnd allow these queries to be halted on script termination?

  14. Georg says:

    Rolf,

    currently there is no way to stop the running queries from client side. This could be an issue for the proxy (http://jan.kneschke.de/projects/mysql/mysql-proxy)

  15. Michael says:

    a pure php driver would solve a lot of problems for people using shared web hosting!

    I managed to compile a local php5.3 cgi on a shared host recently and got it working only to find that the mysqlnd client bundled with it won’t talk to the mysql server because its an old version of mysql!

    All attempts to compile php5 to use the old mysql4 driver used by php4 have failed.

    A pure php client could really save the day here!

    while searching for this I came across a pure perl client – maybe this could be ported to php?

    http://search.cpan.org/dist/Net-MySQL/MySQL.pm

Speak Your Mind

*