October 25, 2014

How expensive is MySQL Replication for the Master

I have generally thought of MySQL replication as being quite low overhead on the Master, depending on the number of Slaves. What kind of load does an extra Slave cause? Well, it just gets a copy of the binary log streamed to it. All slaves typically read the last few events in the binary log, so they are in cache. In most cases having some 10 slaves is relatively low overhead and does not impact master performance much. However, I ran into a case where performance was significantly impacted by going from 2 to 4 slaves on the box. Why?

The reason was that the Master was handling a lot of very small updates: over 15,000 transactions a second. Each time an event is logged to the binary log, all 4 threads feeding binary logs to the slaves have to be woken up to send the update notification. With 4 slaves connected this makes 60,000 thread wakeups a second, sending some 60,000 packets (there may be some buffering on the TCP side merging multiple sends, but still).
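To make the arithmetic concrete, here is a small Python sketch. It assumes the simple model described above, where every logged event wakes every thread feeding a slave exactly once; the function name is made up for illustration and is not part of MySQL.

```python
def wakeups_per_second(transactions_per_second: int, slave_count: int) -> int:
    """Thread wakeups per second on the master, assuming each logged
    event wakes every slave-feeding thread once (a simplification)."""
    return transactions_per_second * slave_count


# The numbers from the case above: 15,000 transactions a second.
for slaves in (2, 4, 10):
    print(slaves, "slaves:", wakeups_per_second(15_000, slaves), "wakeups/sec")
```

Under this model the cost grows linearly with the number of slaves, which is why going from 2 to 4 slaves doubles the wakeup rate.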

I guess this scenario has just not caught any developer's attention yet, as it should be rather easy to optimize. Just as network cards are designed to throttle the number of interrupts they raise and process several packets at a time, we could make replication threads wake up in batches. For example, we could tune the system to wake up the thread feeding a slave no more often than 1,000 times a second, and each wakeup would send multiple events to the slave. It should be possible to make this number tunable, as rarer wakeups mean less overhead but can also impact replication latency a bit. It is also possible to put some auto-detection in place, timing how long it really takes all threads to send their data to the slaves. If you have a large number of slaves, the delay between the event being executed on the master and the last thread sending its packet to a slave can be significant.
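The trade-off between wakeup rate and latency can be illustrated with a toy simulation. This is only a sketch of the batching idea, not how MySQL works or would implement it: it models a single thread feeding one slave, with a hypothetical `max_wakeups_per_second` setting, and all names are assumptions made for the example.

```python
def simulate_batched_wakeups(event_times_us, max_wakeups_per_second):
    """Simulate one slave-feeding thread that wakes at most N times a second.

    event_times_us: sorted times (integer microseconds) at which events
    are written to the binary log.
    Returns (wakeups, worst_delay_us), where worst_delay_us is the longest
    time any event waited before being sent.
    """
    min_interval_us = 1_000_000 // max_wakeups_per_second
    wakeups = 0
    worst_delay_us = 0
    next_allowed_us = 0
    i = 0
    while i < len(event_times_us):
        oldest_pending_us = event_times_us[i]
        # Wake when there is something to send, but never sooner than allowed.
        wake_time_us = max(oldest_pending_us, next_allowed_us)
        # One wakeup sends every event logged up to the wake time.
        while i < len(event_times_us) and event_times_us[i] <= wake_time_us:
            i += 1
        wakeups += 1
        worst_delay_us = max(worst_delay_us, wake_time_us - oldest_pending_us)
        next_allowed_us = wake_time_us + min_interval_us
    return wakeups, worst_delay_us


# 15,000 evenly spaced events within one second, as in the case above.
events = [i * 1_000_000 // 15_000 for i in range(15_000)]
print("throttled to 1,000/sec:", simulate_batched_wakeups(events, 1_000))
print("effectively unthrottled:", simulate_batched_wakeups(events, 1_000_000))
```

In this model the throttled run needs on the order of a thousand wakeups instead of fifteen thousand, at the price of events waiting up to roughly a millisecond before they are sent. Real traffic is burstier than evenly spaced events, so treat the numbers as illustrative only.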

What does this case teach me in general? To always look at the data and question your assumptions. Just because something is unlikely to be the bottleneck does not mean it is not :)

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Bay says:

    Correct me if I am wrong, but the blackhole engine has its limitations: it only has insert triggers, no update or delete triggers, it cannot support auto-increment keys, and it is not reliable and often breaks replication.

  2. Hi Peter,

    Just out of curiosity, what’s the highest update rate you have ever seen going into the binlog? I’m working on getting Tungsten Replicator to support much higher rates than we do currently and it’s helpful to understand the design ceiling. MySQL can jam an amazing amount of data into slaves.

    Cheers, Robert

  3. zerkms says:

    why not use blackhole as an intermediate layer between master and slaves?

  4. peter says:

    Robert,

    My idea about replication is simple: any traffic the Master server is able to handle, it should be able to replicate onto a slave of the same configuration. I also would like to see replication not adding too much overhead on the master. Currently you can get about 50,000 simple transactions a second (something like an update by primary key), so this is the number.

  5. peter says:

    zerkms,

    You can use blackhole to build tiered replication of course, but it is a complication and the overhead does not go away in this case. With the current architecture you will still need to spend a lot of CPU cycles if you’re feeding many slaves; they can just be split across several intermediary slaves.

  6. zerkms says:

    hmmm…… why is the overhead not going away?
    it will be moved to a dedicated “blank” blackholed server, and the master will now have to serve just 1 slave instead of N.

  7. peter says:

    Yes. It will be moved to a different box, but it is not going away :)

  8. zerkms says:

    yeah, but the topic of the discussion is “How expensive is MySQL Replication for the Master” ;-)
    and in this case it costs the master only one slave ;-)

  9. Mark Callaghan says:

    @zerkms – there is a cost from buying and managing the extra server. I don’t want to manage another server.

  10. zerkms says:

    @Mark
    nice pov ;-) but money isn’t the absolute measure of “cost”. given facebook practice, let’s imagine 1 (one) master server which serves 4 datacenters. what will its bandwidth be?

    ps: like your posts at MySQL@Facebook ;-)

  11. Mark Callaghan says:

    I think I want the blackhole slave to run on the same server as the master. That will eliminate the cost of extra hardware and networking. The cost of managing the extra servers still exists. I think the rise of the cloud and MySQL running on it will spur the development of frameworks that make it easier to manage large numbers of mysql instances. I don’t think we have good frameworks yet that make it easy to manage the extra servers.

  12. Diego Cassinera says:

    Good idea regarding the interrupts. However, if you are talking about the cost of slaves, you did not get into memory usage. In order for the server to deal with the different lag of each slave, it needs to have a stream of operations for each slave. In most cases most of the slaves are on the same binlog, but when they are not, multiple binlogs will be in memory.

    On large systems, it makes total sense to stream a binlog to one system, and this system propagates to the rest.
    Using black hole you could do this in a pyramid fashion.

