MySQL Replication: ‘Got fatal error 1236’ causes and cures

MySQL Replication: 'Got fatal error 1236' causes and curesMySQL replication is a core process for maintaining multiple copies of data – and replication is a very important aspect in database administration. In order to synchronize data between master and slaves you need to make sure that data transfers smoothly, and to do so you need to act promptly regarding replication errors to continue data synchronization. Here on the Percona Support team, we often help customers with replication broken-related issues. In this post, I’ll highlight the top most critical replication error code 1236 along with the causes and cure.

‘Got fatal error 1236’ triggers

This MySQL replication error can be triggered by multiple reasons and I will try to cover all of them.

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event ‘binlog.000201’ at 5480571

This is a typical error on the slave(s) server. It reflects the problem around max_allowed_packet size. max_allowed_packet refers to single SQL statement sent to the MySQL server as a binary log event from master to slave. This error usually occurs when you have a different size of max_allowed_packet on the master and slave (i.e. master max_allowed_packet size is greater than slave server). When the MySQL master server tries to send a bigger packet than defined on the slave server,  the slave server then fails to accept it and hence the error. In order to alleviate this issue please make sure to have the same value for max_allowed_packet on both slave and master. You can read more about max_allowed_packet here.

This error usually occurs when updating a huge number of rows on the master and it doesn’t fit into the value of slave max_allowed_packet size because slave max_allowed_packet size is lower than the master. This usually happens with queries “LOAD DATA INFILE” or “INSERT .. SELECT” queries. As per my experience, this can also be caused by application logic that can generate a huge INSERT with junk data. Take into account that one new variable introduced in MySQL 5.6.6 and later slave_max_allowed_packet_size controls the maximum packet size for the replication threads. It overrides the max_allowed_packet variable on slave and it’s default value is 1 GB. In this post, “max_allowed_packet and binary log corruption in MySQL”, my colleague Miguel Angel Nieto explains this error in detail.

Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’

This error occurs when the slave server required binary log for replication no longer exists on the master database server. In one of the scenarios for this, your slave server is stopped for some reason for a few hours/days and when you resume replication on the slave it fails with the above error.

When you investigate you will find that the master server is no longer requesting binary logs which the slave server needs to pull in order to synchronize data. Possible reasons for this include the master server expired binary logs via system variable expire_logs_days – or someone manually deleted binary logs from master via PURGE BINARY LOGS command or via ‘rm -f’ command or maybe you have some cronjob which archives older binary logs to claim disk space, etc. So, make sure you always have the required binary logs exists on the master server and you can update your procedures to keep binary logs that the slave server requires by monitoring the “Relay_master_log_file” variable from SHOW SLAVE STATUS output. Moreover, if you have set expire_log_days in my.cnf old binlogs expire automatically and are removed. This means when MySQL opens a new binlog file, it checks the older binlogs, and purges any that are older than the value of expire_logs_days (in days). Percona Server added a feature to expire logs based on the total number of files used instead of the age of the binlog files. So in that configuration, if you get a spike of traffic, it could cause binlogs to disappear sooner than you expect. For more information check Restricting the number of binlog files.

In order to resolve this problem, the only clean solution I can think of is to re-create the slave server from a master server backup or from other slave in replication topology.

— Got fatal error 1236 from master when reading data from binary log: ‘binlog truncated in the middle of event; consider out of disk space on master; the first event ‘mysql-bin.000525’ at 175770780, the last event read from ‘/data/mysql/repl/mysql-bin.000525’ at 175770780, the last byte read from ‘/data/mysql/repl/mysql-bin.000525′ at 175771648.’

Usually, this caused by sync_binlog <>1 on the master server which means binary log events may not be synchronized on the disk. There might be a committed SQL statement or row change (depending on your replication format) on the master that did not make it to the slave because the event is truncated. The solution would be to move the slave thread to the next available binary log and initialize slave thread with the first available position on binary log as below:

— [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: ‘Client requested master to start replication from impossible position; the first event ‘mysql-bin.010711’ at 55212580, the last event read from ‘/var/lib/mysql/log/mysql-bin.000711’ at 4, the last byte read from ‘/var/lib/mysql/log/mysql-bin.010711′ at 4.’, Error_code: 1236

I foresee master server crashed or rebooted and hence binary log events not synchronized on disk. This usually happens when sync_binlog != 1 on the master. You can investigate it as inspecting binary log contents as below:

You will find this is the last position of binary log and end of binary log file. This issue can usually be fixed by moving the slave to the next binary log. In this case, it would be:

This will resume replication.

To avoid corrupted binlogs on the master, enabling sync_binlog=1 on master helps in most cases. sync_binlog=1 will synchronize the binary log to disk after every commit. sync_binlog makes MySQL perform on fsync on the binary log in addition to the fsync by InnoDB. As a reminder, it has some cost impact as it will synchronize the write-to-binary log on disk after every commit. On the other hand, sync_binlog=1 overhead can be very minimal or negligible if the disk subsystem is SSD along with battery-backed cache (BBU). You can read more about this here in the manual.

sync_binlog is a dynamic option that you can enable on the fly. Here’s how:

To make the change persistent across reboot, you can add this parameter in my.cnf.

As a side note, along with replication fixes, it is always a better option to make sure your replica is in the master and to validate data between master/slaves. Fortunately, Percona Toolkit has tools for this purpose: pt-table-checksum & pt-table-sync. Before checking for replication consistency, be sure to check the replication environment and then, later, to sync any differences.

Share this post

Comments (9)

  • Mark Grennan

    Thanks! This was exactly my problem. Your explanation was clear and I fixed my server quickly.

    December 17, 2014 at 4:28 pm
  • Tu Nguyen

    hi Muhammad Irfan, in my case, my master crashed. At that time, my slave became master (data wrote on slave). Then master reboot.
    I still want master is master (not slave becomes master) but there is error 1236 just exactly the last sample on your post.
    What should I do?
    Thanks in advances,

    December 18, 2014 at 9:17 pm
  • André Luiz de Almeida

    Thanks man

    Very clear and very simple, Solved my situation here too.

    May 25, 2015 at 5:06 pm
  • Brent Clark

    What if the message is not ….*Could not find first log*
    but instead, ‘could not find next log;

    Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘could not find next log; the first event ‘mysql-bin.000208’ at 8223714, the last event read from ‘/var/log/mysql/mysql-bin.000511’ at 104858051, the last byte read from ‘/var/log/mysql/mysql-bin.000511′ at 104858051.’

    June 7, 2015 at 9:58 pm
  • Gauravkumar Mishra

    Hi Brent , even I faced the same issue. My error looks like “Got fatal error 1236 from master when reading data from binary log: ‘could not find next log; the first event ‘mysql-bin.000784’ at 203062471, the last event read from ‘/ntss_data/mysql/mysql-bin.000818’ at 1073742606, the last byte read from ‘/ntss_data/mysql/mysql-bin.000818’ at 1073742606.'” Weer you able to figure out the cause of this issue?

    September 7, 2015 at 6:21 am
  • Jonathan Nicol

    Brent and Gauravkumar: I just had the same issue. I couldn’t figure out the root cause, but simply running ‘start slave;’ on the slave fixed it.

    September 21, 2015 at 4:54 pm
  • Sungwon Um

    I found another case for this error message :

    Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’

    The issue occurred after I update mysql config file (/etc/my.cnf) in master DB.
    In config file, log-bin was set as ‘/var/lib/mysql/mysql-bin’.
    I changed it into ‘mysql-bin’.

    Then, the error above occurs in slave databases.

    To resolve this case,
    I opened the bin-log index file (mysql-bin.index) from master DB.

    It showed like this.

    I edited old bin-log URI format to new one

    Then restart master db, stop slave and start slave in order.
    The issue has gone.

    November 25, 2015 at 7:38 pm
  • Scott Warren

    Thanks for the guide it was helpful fixing our problem.

    I just wanted to point out that in this command you’re using 10711.

    $ mysqlbinlog –base64-output=decode-rows –verbose –verbose –start-position=55212580 mysql-bin.010711

    And in the command below you’re using 712 but you say it should be the next file. I thought you might want to correct this so it causes less confusion (I powered through it but others might not :-))

    mysql> CHANGE MASTER TO MASTER_LOG_FILE=’mysql-bin.000712′, MASTER_LOG_POS=4;

    March 3, 2016 at 8:30 am
  • Max T.

    Using fix for “binlog truncated in the middle of event” replication might be broken down if the binlog is actually correct and contains events after mentioned position (the binlog file can be checked by “mysqlbinlog”). Instead of switching to the next file, “reset slave” and “change master” with the event start position can be used.

    October 24, 2017 at 4:16 am

Comments are closed.

Use Percona's Technical Forum to ask any follow-up questions on this blog topic.