Customer case - Finding an unusual cause of max_user_connections
This issue happened to me at work. A customer repeatedly hit max_user_connections errors, and it was our job to find out what went wrong: if the problem was a query, find it and advise the customer; if it was server-related, fix it with minimal downtime.
In this session, I'll walk through my reasoning step by step. It may give you (the audience) some ideas if you ever have to deal with similar issues, or simply introduce you to tools that could help you one day.
Quick overview: it was (way) more difficult than just finding a suboptimal SELECT :)
Part I - System / application used and monitoring
MySQL 5.5 (Percona version) with no replication
Server is a quad-CPU machine with 128 GB RAM (kernel 2.6)
Application is an online game written entirely in PHP.
Web tier is a cluster of ~30 servers (dual Xeon CPUs)
Quick note on what to monitor, how often, and the usual mistakes that follow from wrong assumptions.
Part II - Checking what we see and checking variables
Presentation of the tools used, such as pt-stalk, pt-summary, pt-mext, top, and some bash.
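To give a flavour of the kind of comparison these tools make easier, here is a minimal sketch that computes the difference between two samples of SHOW GLOBAL STATUS. It is not how pt-mext is implemented; the function name and the sample values are made up for illustration.

```python
def status_deltas(before, after):
    """Return the change in each status variable present in both samples.

    Both arguments are dicts mapping a status variable name to an integer,
    as could be built from two runs of SHOW GLOBAL STATUS.
    """
    deltas = {}
    for name, new_value in after.items():
        old_value = before.get(name)
        if old_value is None:
            # Variable missing from the first sample: nothing to compare.
            continue
        deltas[name] = new_value - old_value
    return deltas


# Made-up sample values, for illustration only.
sample_1 = {"Connections": 1000, "Threads_connected": 40, "Questions": 50000}
sample_2 = {"Connections": 1250, "Threads_connected": 95, "Questions": 51000}

print(status_deltas(sample_1, sample_2))
```

Looking at differences rather than absolute values makes a sudden jump in a variable such as Threads_connected stand out between two samples.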
Part III - Finding a first quick fix, to restore production
Part IV - Investigations to find a durable solution
Using tcpdump on a high-traffic server, plus Wireshark
Part V - Explaining ER_USER_LIMIT_REACHED and ER_TOO_MANY_USER_CONNECTIONS
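As a small aid for this part, here is a sketch of how a client-side script could tell the two errors apart by number. The numeric codes are the ones listed in the MySQL server error reference; the lookup table and the helper name are hypothetical and exist only for illustration.

```python
# Error numbers as listed in the MySQL server error reference.
ER_TOO_MANY_USER_CONNECTIONS = 1203
ER_USER_LIMIT_REACHED = 1226

# Hypothetical lookup table, for illustration only.
CONNECTION_LIMIT_ERRORS = {
    ER_TOO_MANY_USER_CONNECTIONS: "ER_TOO_MANY_USER_CONNECTIONS",
    ER_USER_LIMIT_REACHED: "ER_USER_LIMIT_REACHED",
}


def classify_connection_error(errno):
    """Return the symbolic name of a connection-limit error, or None."""
    return CONNECTION_LIMIT_ERRORS.get(errno)
```

Logging the error number on the application side, rather than only the message text, makes it easier to see afterwards which of the two limits was actually hit.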
Part VI - Finding a bug report on bugs.mysql.com or Launchpad (for Percona software), and filing one with useful information for developers
Part VII - Investigating a stall
Using opcontrol (OProfile)
Part VIII - Conclusions and personal thoughts on this journey