Customer case - Finding an unusual cause of max_user_connections

Database Administration
4 December 16:20 - 17:10 @ Cromwell Suite

This issue happened to me at work. A customer kept hitting max_user_connections, and it was our job to investigate what went wrong: if the problem was a query, find it and advise the customer; if it was server related, fix it with minimal downtime.

In this session, I'll explain my thinking and my reasoning, step by step. This may give you (the audience) some ideas if you have to deal with similar issues, or simply introduce you to useful tools that could help you one day.

Quick overview: it was (way) more difficult than just finding a poorly optimized SELECT :)

Part I - System / application used and monitoring
MySQL 5.5 (Percona Server) with no replication
Server is a Quad CPU with 128GB RAM (kernel 2.6)
Application is an online game written entirely in PHP.
Web tier is a cluster of ~30 servers (dual Xeon CPUs)

Quick note on what to monitor, how often, and the usual mistakes that follow from wrong assumptions.
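
To illustrate why monitoring frequency matters, here is a minimal sketch, not taken from the talk itself. It assumes a hypothetical per-second series of connection counts (the name threads_connected and all values are invented for illustration) and shows how a poller that reads the counter once a minute can miss a short burst completely.

```python
def peak_seen(samples_per_second, interval):
    """Return the highest value a poller would observe when it
    reads the counter once every `interval` seconds."""
    return max(samples_per_second[::interval])


# Hypothetical data: two minutes at 40 connections,
# with a 10-second burst to 500 starting at second 65.
threads_connected = [40] * 120
threads_connected[65:75] = [500] * 10

every_second = peak_seen(threads_connected, 1)
every_minute = peak_seen(threads_connected, 60)

print("peak with 1s polling: ", every_second)
print("peak with 60s polling:", every_minute)
```

With one-minute polling the graph stays flat, which is exactly the kind of wrong assumption ("nothing happened") that coarse monitoring encourages.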

Part II - Checking what we see and checking variables
Presentation of the tools used, such as pt-stalk, pt-summary, pt-mext, top, and some bash.
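
As a rough illustration of the kind of check done at this stage, the sketch below counts connections per account and flags the ones close to the limit. It is an assumption-laden example, not the method used in the talk: the input stands in for the User column of a SHOW PROCESSLIST snapshot, and the account names, the limit of 50, and the 80% threshold are all hypothetical.

```python
from collections import Counter


def users_near_limit(processlist_users, max_user_connections, threshold=0.8):
    """Return {user: connection_count} for accounts whose count has reached
    `threshold` (a fraction) of the per-account connection limit."""
    counts = Counter(processlist_users)
    return {
        user: n
        for user, n in counts.items()
        if n >= threshold * max_user_connections
    }


# Hypothetical snapshot: one entry per open connection.
snapshot = ["game"] * 45 + ["backup"] * 2 + ["monitor"]

flagged = users_near_limit(snapshot, max_user_connections=50)
print(flagged)
```

Running such a check on repeated snapshots shows whether one account is steadily approaching its limit or only hits it in bursts.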

Part III - Finding a first quick fix, to restore production

Part IV - Investigations to find a durable solution
Using tcpdump on a high-traffic server, plus Wireshark


Part VI - Finding a bug report in or Launchpad (for Percona software), and filing one with useful information for developers

Part VII - Investigating a stall
Using opcontrol (OProfile)

Part VIII - Conclusions and personal thoughts on this journey


Olivier Doucet
Speaker Biography: 
Founder of OXEVA SAS, a hosting company specialized in HA and high-traffic websites. My team and I built from scratch infrastructures that were able to sustain more than 12 million visitors per day. I specialize in PHP/MySQL (server side and client side).