November 23, 2014

Are PHP persistent connections evil ?

As you probably know PHP “mysql” extension supported persistent connections but they were disabled in new “mysqli” extension, which is probably one of the reasons some people delay migration to this extension.

The reason behind using persistent connections is of course reducing number of connects which are rather expensive, even though they are much faster with MySQL than with most other databases.

Not only connects are expensive but you also may run into the trouble establishing number of connections you need. The problem is there can be only so many connections active between Host “Apache” and Host “MySQL”: Port 3306 as connection in TCP/IP protocol is identified by pair of IP addresses and pair of ports (local port and remote port). Yes if you’re establishing thousands of connections per second you normally do not keep it open for long time, but Operation System does. According to TCP/IP protocol Ports can’t be recycled instantly and have to spend some time in “FIN” stage waiting before they can be recycled.

On Linux you can adjust “/proc/sys/net/ipv4/ip_local_port_range” to get more local ports available and “/proc/sys/net/ipv4/tcp_fin_timeout” to reduce recycle delay. Reducing last one however will go against protocol requirements so in theory you can get some problems. It worked fine for me however.

Other ways to workaround this problem is of course to use multiple IPs on your MySQL server (you’re probably using Intranet range IPs anyway). So one way or around you can avoid such limit but creating connections will still waste resources and add latency.

The other problem with persistent connections is using too many MySQL server connections. Some people simply do not realize you can increase max_connections variable and get over 100 concurrent connections with MySQL others were beaten by older Linux problems of not being able to have more than 1024 connections with MySQL.

With modern systems you can have thousands of Connections with MySQL, it however might not be overly efficient – managing large number of threads may be a bit more expensive. Threads also take memory resources both on kernel and MySQL size but most problems seems to happen in case of “overload”. In case your number of connections is limited you start getting connection errors which are easy to handle, in case of large number of connections allowed you may have 4000 of queries running at the same time which may never self-resolve as users will get extremely poor response time and will continue press reload. Plus, especially for Innodb tables throughput may drop dramatically – you may see it to be 1/100 of what you get with just few queries concurrently.
If you have these queries doing large sorts, using temporary tables or having other significant memory requirements you may well out of memory and get crash or have MySQL starting to swap aggressively.

This is where connection pooling would be extremely helpful but with default Processed based PHP installations it does not work.

Interesting enough these problem of thousands of connections in most cases comes from misconfiguration or possibly not willing to spend enough time to optimize Web part of configuration. Thousands of concurrent connections usually result from cases when there are many hundreds of apache children processes are running. Most of them will be just holding keep-alive while still keeping MySQL connection open, others will serve static content such as images which also does not need MySQL connection open.

In optimal configuration with Apache talking to local MySQL installation hand having no remote network accesses would be 20-30 apache children. But you need to keep them busy all the time so they should not handle keep alive serve images or perform spoon feeding. You can place squid in front, use apache proxy module or even use lighttpd with FastCGI all can fix this problem.

Lets talk now about why Persistent connections were disabled in mysqli extension. Even though you could misuse persistent connections and get poor performance that was not the reason. The real reason is – you could get much more problems with it.

Persistent connections were added to PHP during times of MySQL 3.22/3.23 when MySQL was simple enough so you could recycle connections easily without any problems. In later versions number of problems however arose – If you recycle connection which has uncommitted transactions you run into trouble. If you happen to recycle connections with custom character set settings you’re in trouble back again, not to mention about possibly changed per session variables.

Many applications are fine – ie if you have read-only page and everything on your site uses same charset and does not use per session variables in some tricky way it would work perfect for you. In some complex applications it however can lead to very hard to track bugs.

What would be proper solution to this ?

MySQL should support command which resets connection and guarantees it will be the same as freshly created connection, so the fact connection is persistent can be completely transparent. May be it even should get fresh connection ID so if you track web requests in MySQL Logs by connection ID it would still work. MySQL has change_user() call which kind of does some of this but it looks like no one knows if it does everything what needs to be done, plus it surely was broken in some MySQL versions.

So persistent connections are not evil but they need some MySQL support to be properly implemented. Now with new “mysqlnd” driver for PHP effort by MySQL team I hope this issue could be fixed. Internal guys sure have more leverage on server development team than external community :)

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Brian Moon says:

    The other issue with using persistent connections is that they are simply not needed in a well designed web based application. Not every web request on a web site will (if properly designed) need a mysql connection. Most likely, if you are requiring a connection on every Apahce request, your application will not scale and will start having issues very quickly.

    Another issue that arrises is that a user will install several “off the shelf” applications, put them all in different databases, and then use persistent connections. Now, the blog, the user counter, the ad program and the forum all make their own connection. So, each Apache process has 4 persistent connections to the database. Solve that user error issue (and saying that dumb users should fix their stuff is not good enough) and maybe persistent connections are valid again.

  2. Brian is working on a patch to allow 7000+ connections with only 700 allowed to be active at any given time. With that it makes pconnect a little less evil.

  3. peter says:

    Brian,

    I can’t agree with you on this. Some applications do not need persistent connections other will benefit from them quite seriously. Now if web page hit does not require connection you do not have to establish it so with proper design overhead will be low.

    Now if you’re speaking about having multiple offshelf applications sharing same MySQL Server you’re quite likely speaking about light duty applications this is not something persistent connections are designed for.

    I would agree persistent connections are often misused. Unless you have hundreds of connections per second you quite likely do not need persistent connections and unlikely will get major benefit from using them. On the contrary however many people used to always use persistent connections without thinking what they really do.

    It is kind of similar as with Indexes you can find some people putting random indexes on all columns without giving thought what these :)

  4. peter says:

    Datan,

    That would be interesting. Are you speaking about worker thread model or something else ?

    Also Vadim actually had prototype patch for limiting number number of active thread over a year ago but at that time it did not get much of interest within MySQL development team.

  5. Brice says:

    Did anyone tested persistent connection to SQL Relay (http://sqlrelay.sourceforge.net/) which in turns connects to a bunch of mysql server?

  6. peter says:

    I have not seen sqlrelay used much in practice. Myself I do not think such sort of mediator is good idea – it adds up complication and adds latency. It may be good in some extreme cases when you badly need connection to be persistent but you can’t have so many connections to MySQL.

    Would be interesting however to see some real benchmarks with it.

  7. Wren says:

    We’re using SQL Relay in a production environment to pool connections between 12 Apache Web servers and a handful of Oracle databases. Persistent connections are certainly evil once you have enough Web servers maintaining X number of persistent connections per Web server to your database(s). Pretty soon, you’ve got thousands of connections to the database, most of which aren’t even in use concurrently, and then the database starts to run out of memory. SQL Relay takes on this burden, freeing the database to focus on running the queries instead of managing a lot of connections. In practice, yes, this middle man does increase overall complexity and you have to think about redundancy too, but if you have a busy site you probably won’t have much choice after a certain point!

    The nice thing about persistent connections is that they reduce latency on the connect/disconnect/reconnect which, for Oracle, is more expensive than with MySQL. The good news is that SQL Relay’s performance here is comparable to PHP’s persistent connections. The reason is because SQL Relay has already established X number of Oracle (or MySQL) connections and strives to maintain that number continually. Then, your PHP app connects and disconnects from the SQL Relay server, which is lightweight.

    Getting everything to run smoothly does take a fair bit of tuning and careful management of your SQL Relay connections. If you have leaky code that doesn’t disconnect properly, you can still flood your connection pool, no pun intended. Wish I had benchmarks but, the main thing is that SQL Relay offers similar performance and won’t flood your database with connections, both of which are good things.

  8. Son Nguyen says:

    For an advertisement serving engine with a real/serious degree of concurrency and fairly simple queries, support for persistent connections has been very useful for us. It looks like there are ways to implement connection pooling as I just checked the PHP documentation on MySQLi extension.

  9. peter says:

    Sure. if queries are simple persistent connection are helpful. It is good mysqlnd driver which is being developed for PHP 6 (and lower in the future) will support them.

  10. raci says:

    how to increasr max connection limit in mysql i am getting error too many connection as i am using Shared hosting
    ,. how to change the setting in Cpanel

  11. Please help! in PHP mySQL, the use of SELECT FOR UPDATE and SELECT LOCK IN SHARE MODE, How to use it? They don’t make any sense to me in a web application, because when you first read the data (SELECT FOR UPDATE)the script ends once it sends the filled form to be updated on the client, so there is an auto mysql_close.(Which would rollback your transaction in the SELECT)
    With a persistent connection, may be it would keep the transaction pending, but there is no way to store the connection persistent link resource in a session or anywhere else, so there is no way to open the updater script with the same link resource to UPDATE and then COMMIT, or how does it works?

  12. Hi, I have a situation (although not with php) that has been troubeling me a long time now.

    I have a Server Application (in-house) which in turn let’s a bunch of clients connect to it. These clients give feebdack.. A LOT of feedback, (think in terms of narrowcasting players which tell me what image they have played at what time and which screen.. such an image is displayed for 10 seconds and then goes to the next.. these are 8460 records per day per screen per player.. ). Next to this, they query data (are there new files, ads etc) and they log certain events..

    Now i tryed doing this with the Mysql.NET connector and opening and closing the connection but up until now, i haven’t got this working yet.. so for now, i’m using a concurrent connection per client that is connected to my server. Performance has never been fantastic though.

    So my question is: Per query connection of keep a concurrent connection per client that is connected ?

  13. oscar duron says:

    I you are only reading data, persistent connection is a good option if not the best. If you are updating data, think twice and undertand the risks and problems using a persistent connection. If you are using transactions, persistent connections do not apply.

  14. Tom says:

    Contrary to Brian’s comment of 3 years ago, it’s in a well-designed web application where persistent connections are beneficial. Badly-designed applications don’t use them either because they have other, bigger performance bottlenecks to worry about, or because they succumb to the gotchas.

    And the potential gotchas of pconnect are all symptoms of bad design themselves: having far too many client processes (think mod_php); forgetting to unlock tables and not cleaning up temporary tables; poor use of session variables, etc.

    I have been running a high-concurrency app on Apache with the worker mpm (up to 3000 threads), with fastCGI + PHP (1 process, 24 children), connecting to mysql via Unix sockets with persistent connections. I have not had any problems. Even if the benefits are small, there is no cost. Net win.

  15. Kyle says:

    I would read this, updated recently. It seems mysqli has added this option back. It does cleanup automatically by calling mysql_change_user() every time a connection is reused.

    http://php.net/manual/en/mysqli.persistconns.php

  16. Here’s a nifty little class which will load balance across multiple replicated MySQL server instances, using persistent connections, automatically removing failed MySQL servers from the pool.

    You would ONLY use this for queries, never inserts/updates/deletes, UNLESS you had a multi-master situation where updates to any database serverautomatically replicate to the other servers (I don’t know whether that’s possible with MySQL).

    Using this class, you get a connection to a MySQL server like this:
    $con = MySQLConnectionFactory::create();

    Here is the class (you’ll need to customize the $SERVERS array for your configuration — note that you would probably use the same username, password and database for all of the servers, just changing the host name, but you’re not forced to use the same ones):

    ‘myHost1′,
    ‘username’ => ‘myUsername1′,
    ‘password’ => ‘myPassword1′,
    ‘database’ => ‘myDatabase1′),
    array(
    ‘host’ => ‘myHost2′,
    ‘username’ => ‘myUsername1′,
    ‘password’ => ‘myPassword2′,
    ‘database’ => ‘myDatabase2′)
    );

    public static function create() {
    // Figure out which connections are open, automatically opening any connections
    // which are failed or not yet opened but can be (re)established.
    $cons = array();
    for ($i = 0, $n = count(MySQLConnectionFactory::$SERVERS); $i

    Lightrains Technolabs

  17. Here’s a nifty little class which will load balance across multiple replicated MySQL server instances, using persistent connections, automatically removing failed MySQL servers from the pool.

    You would ONLY use this for queries, never inserts/updates/deletes, UNLESS you had a multi-master situation where updates to any database serverautomatically replicate to the other servers (I don’t know whether that’s possible with MySQL).

    Using this class, you get a connection to a MySQL server like this:
    $con = MySQLConnectionFactory::create();

    Here is the class (you’ll need to customize the $SERVERS array for your configuration — note that you would probably use the same username, password and database for all of the servers, just changing the host name, but you’re not forced to use the same ones):

    // ‘myHost1′,
    ‘username’ => ‘myUsername1′,
    ‘password’ => ‘myPassword1′,
    ‘database’ => ‘myDatabase1′),
    array(
    ‘host’ => ‘myHost2′,
    ‘username’ => ‘myUsername1′,
    ‘password’ => ‘myPassword2′,
    ‘database’ => ‘myDatabase2′)
    );

    public static function create() {
    // Figure out which connections are open, automatically opening any connections
    // which are failed or not yet opened but can be (re)established.
    $cons = array();
    for ($i = 0, $n = count(MySQLConnectionFactory::$SERVERS); $i //

    Lightrains Technolabs

  18. Hi to Mr. Peter and all Techies in this forum,

    I am nearly to finish my web hosting portal. I am offering affiliate service through my web site. I hope my site will attract more visitors and new registrant. And registered members will start posting my site’s URL in several sites. So, traffic will increase gradually. In this case, which database connection I should use: whether to use MYSQL_CONNECT(), or MYSQL_PCONNECT or MYSQLI_CONNECT. Please clarify all these connections with details and without any confusion. Thanks to all in advance.

Speak Your Mind

*