read_buffer_size can break your replication

PREVIOUS POST
NEXT POST

There are some variables that can affect the replication behavior and sometimes cause some big troubles. In this post I’m going to talk about read_buffer_size and how this variable together with max_allowed_packet can break your replication.

The setup is a master-master replication with the following values:

max_allowed_packet = 32M
read_buffer_size = 100M

To break the replication I’m going to load the 4 million rows with LOAD DATA INFILE:

MasterA (test) > LOAD DATA INFILE '/tmp/data' INTO TABLE t;
Query OK, 4510080 rows affected (26.89 sec)

After some time the SHOW SLAVE STATUS on MasterA gives us this output:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event 'mysql-bin.000002' at 74416925, the last event read from './mysql-bin.000004' at 171, the last byte read from './mysql-bin.000004' at 190.'

Very strange! We have loaded the data on MasterA and now it has the replication broken with a error on max_allowed_packet. The next step is to check the binary logs of both masters.

MasterA:

masterA> mysqlbinlog data/mysql-bin.000004 | grep block_len
#Begin_load_query: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 4194304
#Append_block: file_id: 1 block_len: 33554432
#Append_block: file_id: 1 block_len: 10420608

No block is larger than the max_allowed_packet (33554432).

MasterB:

masterB> mysqlbinlog data/mysql-bin.000004 | grep block_len
#Begin_load_query: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 4191676
#Append_block: file_id: 1 block_len: 33555308
#Append_block: file_id: 1 block_len: 10419732

Do you see the difference? 33555308 is larger than the max_allowed_packet size 33554432 so the Master2 has written the some blocks 876 bytes larger than the safe value. Then, MasterA tries to read again the binary log from MasterB and the replication breaks because packets are too large. No, the replicate_same_server_id is not enabled :)

What is the relation between read_buffer_size and this bug?

Again, an example is best than words. These are the new values:

max_allowed_packet = 32M
read_buffer_size = 16M

We run the LOAD DATA INFILE again and now this is the output on both servers:

#Begin_load_query: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 16777216
#Append_block: file_id: 1 block_len: 14614912

The maximum size of the data blocks are based on the read_buffer_size value so they are never larger than the max_allowed_packet :)

Therefore, a read_buffer_size value larger than max_allowed_packet can break your replication while importing data to MySQL. This bug affects from 5.0.x to the latest 5.5.25 release and the easiest workaround is to not have a read_buffer_size larger than max_allowed_packet. The bug 30435 seems not to be really solved.

And remember, big values on read_buffer_size will not increase your performance.

PREVIOUS POST
NEXT POST

Comments

  1. says

    Nice catch Miguel,

    Looks like the minimum of max_allowed_packet and read_buffer_size is attempted to be used but because of header or something else the event in replication log ends up being larger than max_allowed_packet. I’m wondering why does it take so long to fix.

    It is also worth to note such large read_buffer_size is usually not a good idea.

    What exactly 5.5 release did you test it ? Couple of years from now people will be finding this blog post and “latest 5.5 release” will not mean the same thing to them.

  2. vishnu rao says

    cool find.

    very curious to know if –log-slave-updates was enabled on master2 ?
    also, could this happen with normal insert statements instead of load data infile ?

  3. Justin Swanhart says

    Hi vishnu,

    In order for this to happen, –log-slave-updates had to be enabled on the master2, otherwise master2 would not have rewritten the data packets into its binary log, to then read back by master1.

  4. ruochen says

    the maximum of max_allowed_packet is 1GB in mysql document,i set this values in my.cnf as usually,could you tell me the risk of it.thanks.

  5. says

    hi Miguel…

    I couldn’t find the comment about “block_len” in my binary log.

    I also did testing following your above two variables. But there are some differences. I just set up the simple master-slave replication on MySQL 5.6.5-m8. Then to run load the size of more than 1GB datafile on master, the replication is still working.

    As we know, read_buffer_size is only related to the queries on MyISAM. In my point actually, the variable is used on the storage level. but replication is working on the mysqld level. so I am puzzled.

  6. says

    ruochen: I don’t see any problem of using a large max_allowed_packet value, it only sets a limit on the packet that you are going to send being 1GB is the limit of the protocol.

    zhang: I didn’t tried it with 5.6. anyway, the problem comes when the slave writes the chunks to its own binary log (log_slave_updates) so if you have a master master A-B-A or a chain replication A-B-C then the problem starts after the first slave.

  7. Manohar says

    This error may also occur when you are try to read a log position on master binlog which doesn’t exist. So this is also a case to consider.

Leave a Reply

Your email address will not be published. Required fields are marked *