MySQL Default Configuration Changes between 5.6 and 5.7Bradley Mickel
MySQL 5.7 has added a variety of new features that might excite you. However, there are also changes in the current variables that you might have overlooked. MySQL 5.7 updated nearly 40 of the defaults from 5.6. Some of the changes could severely impact your server performance, while others might go unnoticed. I’m going to go over each of the changes and what they mean.
The change that can have the largest impact on your server is likely sync_binlog. My colleague, Roel Van de Paar, wrote about this impact in depth in another blog post, so I won’t go in much detail. Sync_binlog controls how MySQL flushes the binlog to disk. The new value of 1 forces MySQL to write every transaction to disk prior to committing. Previously, MySQL did not force flushing the binlog, and trusted the OS to decide when to flush the binlog.
The performance schema variables stand out as unusual, as many have a default of -1. MySQL uses this notation to call out variables that are automatically adjusted. The only performance schema variable change that doesn’t adjust itself is performance_schema_max_file_classes. This is the number of file instruments used for the performance schema. It’s unlikely you will ever need to alter it.
The optimizer_switch, and sql_mode variables have a variety of options that can each be enabled and cause a slightly different action to occur. MySQL 5.7 enables both variables for flags, increasing their sensitivity and security. These additions make the optimizer more efficient in determining how to correctly interpret your queries.
Three flags have been added to the optimzer_switch, all of which existed in MySQL 5.6 and were set as the default in MySQL 5.7 (with the intent to increase the optimizer’s efficiency): duplicateweedout=on, condition_fanout_filter=on, and derived_merge=on. duplicateweedout is part of the optimizer’s semi join materialization strategy. condition_fanout_filter controls the use of condition filtering, and derived_merge controls the merging of derived tables, and views into the outer query block.
The additions to SQL mode do not affect performance directly, however they will improve the way you write queries (which can increase performance). Some notable changes include requiring all fields in a select … group by statement must either be aggregated using a function like SUM, or be in the group by clause. MySQL will not assume they should be grouped, and will raise an error if a field is missing. Strict_trans_tables causes a different effect depending on if it used with a transactional table.
Statements are rolled back on transaction tables if there is an invalid or missing value in a data change statement. For tables that do not use a transactional engine, MySQL’s behavior depends on the row in which the invalid data occurs. If it is the first row, then the behavior matches that of a transactional engine. If not, then the invalid value is converted to the closest valid value, or the default value for the columns. A warning is generated, but the data is still inserted.
There have been a couple of variable changes surrounding the binlog. MySQL 5.7 updated the binlog_error_action so that if there is an error while writing to the binlog, the server aborts. These kind of incidents are rare, but cause a big impact to your application and replication when they occurs, as the server will not perform any further transactions until corrected.
The binlog default format was changed to ROW, instead of the previously used statement format. Statement writes less data to the logs. However there are many statements that cannot be replicated correctly, including “update … order by rand()”. These non-deterministic statements could result in different resultsets on the master and slave. The change to Row format writes more data to the binlog, but the information is more accurate and ensures correct replication.
MySQL has begun to focus on replication using GTID’s instead of the traditional binlog position. When MySQL is started, or restarted, it must generate a list of the previously used GTIDs. If binlog_gtid_simple_recovery is OFF, or FALSE, then the server starts with the newest binlog and iterates backwards through the binlog files searching for a previous_gtids_log_event. With it set to ON, or TRUE, then the server only reviews the newest and oldest binlog files and computes the used gtids. Binlog_gtid_simple_recovery makes it much faster to identify the binlogs, especially if there are a large number of binary logs without GTID events. However, in specific cases it could cause gtid_executed and gtid_purged to be populated incorrectly. This should only happen when the newest binarly log was generated by MySQL5.7.5 or older, or if a SET GTID_PURGED statement was run on MySQL earlier than version 5.7.7.
Another replication-based variable updated in 5.7 is slave_net_timeout. It is lowered to only 60 seconds. Previously the replication thread would not consider it’s connection to the master broken until the problem existed for at least an hour. This change informs you much sooner if there is a connectivity problem, and ensures replication does not fall behind significantly before informing you of an issue.
InnoDB buffer pool changes impact how long starting and stopping the server takes. innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup are used together to prevent you from having to “warm up” the server. As the names suggest, this causes a buffer pool dump at shutdown and load at startup. Even though you might have a buffer pool of 100’s of gigabytes, you will not need to reserve the same amount of space on disk, as the data written is much smaller. The only things written to disk for this is the information necessary to locate the actual data, the tablespace and page IDs.
MySQL now made some of the options implemented in InnoDB during 5.6 and earlier into its defaults. InnoDB’s checksum algorithm was updated from innodb to crc32, allowing you to benefit from the hardware acceleration recent Intel CPU’s have available.
The Barracuda file format has been available since 5.5, but had many improvements in 5.6. It is now the default in 5.7. The Barracuda format allows you to use the compressed and dynamic row formats. My colleague Alexey has written about the utilization of the compressed format and the results he saw when optimizing a server: https://www.percona.com/blog/2008/04/23/real-life-use-case-for-barracuda-innodb-file-format/
The innodb_large_prefix defaults to “on”, and when combined with the Barracuda file format allows for creating larger index key prefixes, up to 3072 bytes. This allows larger text fields to benefit from an index. If this is set to “off”, or the row format is not either dynamic or compressed, any index prefix larger than 767 bytes gets silently be truncated. MySQL has introduced larger InnoDB page sizes (32k and 64k) in 5.7.6.
MySQL 5.7 increased the innodb_log_buffer_size value as well. InnoDB uses the log buffer to log transactions prior to writing them to disk in the binary log. The increased size allows the log to flush to the disk less often, reducing IO, and allows larger transactions to fit in the log without having to write to disk before committing.
MySQL 5.7 moved InnoDB’s purge operations to a background thread in order to reduce the thread contention in MySQL 5.5.The latest version increases the default to four purge threads, but can be changed to have anywhere from 1 to 32 threads.
MySQL 5.7 now enables innodb_strict_mode by default, turning some of the warnings into errors. Syntax errors in create table, alter table, create index, and optimize table statements generate errors and force the user to correct them prior to running. It also enables a record size check, ensuring that insert or update statements will not fail due to the record being too large for the selected page size.
MySQL has increased the number of times the optimizer dives into the index when evaluating equality ranges. If the optimizer needs to dive into the index more than the eq_range_index_dive_limit , defaulted to 200 in MySQL 5.7, then it uses the existing index statistics. You can adjust this limit from 0, eliminating index dives, to 4294967295. This can have a significant impact to query performance since the table statistics are based on the cardinality of a random sample. This can cause the optimizer to estimate a much larger set of rows to review than it would with the index dives, changing the method the optimizer chooses to execute the query.
MySQL 5.7 deprecated log_warnings. The new preference is utilize log_error_verbosity. By default this is set to 3, and logs errors, warnings, and notes to the error log. You can alter this to 1 (log errors only) or 2 (log errors and warnings). When consulting the error log, verbosity is often a good thing. However this increases the IO and disk space needed for the error log.
table_open_cache_instances has changed starting with MySQL 5.7.8. The number of instances was increased from 1, to 16. This has two main benefits, reducing contention of DML statements, and segmenting access to the cache. By increasing the number of instances, a DML statement can lock a single instance, and not the entire cache. A thing to note is that the DDL statements still require a lock on the entire cache. On systems where a large number of sessions are accessing the cache it increases the performance.
There are many changes to the defaults in 5.7. But many of these options have existed for a long time and should be familiar to users. Many people used these variables, and they are the best method to push MySQL forward. Remember, however, you can still edit these variables, and configure them to ensure that your server works it’s best for your data.