September 2, 2014

How to debug long-running transactions in MySQL

Among the many things that can cause a “server stall” is a long-running transaction. If a transaction remains open for a very long time without committing, and has modified data, then other transactions could block and fail with a lock wait timeout. The problem is, it can be very difficult to find the offending code […]

Cache Miss Storm

I worked on the problem recently which showed itself as rather low MySQL load (probably 5% CPU usage and close to zero IO) would spike to have hundreds instances of threads running at the same time, causing intense utilization spike and server very unresponsive for anywhere from half a minute to ten minutes until everything […]

Estimating Replication Capacity

It is easy for MySQL replication to become bottleneck when Master server is not seriously loaded and the more cores and hard drives the get the larger the difference becomes, as long as replication remains single thread process. At the same time it is a lot easier to optimize your system when your replication runs […]

Faster MySQL failover with SELECT mirroring

One of my favorite MySQL configurations for high availability is master-master replication, which is just like normal master-slave replication except that you can fail over in both directions. Aside from MySQL Cluster, which is more special-purpose, this is probably the best general-purpose way to get fast failover and a bunch of other benefits (non-blocking ALTER […]