October 25, 2014

MySQL Performance Blog was down today

MySQL Performance Blog (and percona.com too) was down today because the switch in our rack died completely. It took a while to fix using the secondary switch we had. The provider was not willing to do it as remote hands, so I had to drive to the data center to fix it myself.

We got a number of calls and messages from customers and friends about the web site going down, so we probably have to invest in making the infrastructure more redundant. Until now we have been quite cheap, and a lot of servers have a single network card (so you can’t use trunking to eliminate the switch as a single point of failure).

The customer case management systems were not affected by this outage.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Alex says:

    wow, switch the provider! if a provider is not willing to do something, that's just wrong. Not doing something for free – ok, but not doing it at all … lost a customer!

  2. Przemek says:

    IMO the simplest and cheapest solution for eliminating the switch as a SPOF is to put a second NIC into each server and use bonding, with each second NIC connected to the second switch.

  3. Wagner Bianchi says:

    Hi Peter,

    Sometimes this happens too. Go ahead… success!
