The doublewrite buffer is a reserved area in the main tablespace of InnoDB/XtraDB. It helps avoiding data corruption when data is flushed from the buffer pool to the disk and a partial page write occurs.
This patch allows you to move the doublewrite buffer from the main tablespace to a separate location.
This option is for advanced users only. See the discussion below to fully understand if you really need to use it.
| Percona-Server Version | Comments |
|---|---|
| 5.1.47-11.0 | Full functionality available. |
| Type | System and command-line variable |
| Scope | Global |
| Dynamic | No |
| Default | none |
Use this option to create a dedicated tablespace for the doublewrite buffer.
This option expects a filename which can be specified either with an absolute or a relative path. A relative path is relative to the data directory.
The following discussion will clarify the improvements made possible by this patch.
InnoDB and XtraDB use many structures, some on disk and others in memory, to manage data as efficiently as possible. To have an overview of the different components see this post. Let's now focus on the doublewrite buffer.
InnoDB/XtraDB uses a reserved area in its main tablespace, called the doublewrite buffer, to prevent data corruption that could occur with partial page writes. When the data in the buffer pool is flushed to disk, InnoDB/XtraDB will flush whole pages at a time (by default 16KB pages) and not just the records that have changed within a page. It means that, if anything unexpected happens during the write, the page can be partially written leading to corrupt data.
With the doublewrite buffer feature, InnoDB/XtraDB first writes the page in the doublewrite buffer and then to the data files.
If a partial page write occurs in the data files, InnoDB/XtraDB will check on recovery if the checksum of the page in the data file is different from the checksum of the page in the doublewrite buffer and thus will know if the page is corrupt or not. If it is corrupt, the recovery process will use the page stored in the doublewrite buffer to restore the correct data.
If a partial write occurs in the doublewrite buffer, the original page is untouched and can be used with the redo logs to recover the data. For further information on the doublewrite buffer, you can see this post.
In usual workloads the performance impact is low-5% or so. As a consequence, you should always enable the doublewrite buffer because the strong guarantee against data corruption is worth the small performance drop.
But if you experience a heavy workload, especially if your data does not fit in the buffer pool, the writes in the doublewrite buffer will compete against the random reads to access the disk. In this case, you can see a sharp performance drop compared to the same workload without the doublewrite buffer-a 30% performance degradation is not uncommon.
Another case when you can see a big performance impact is when the doublewrite buffer is full. Then new writes must wait until entries in the doublewrite buffer are freed.
In a standard InnoDB/XtraDB installation, the doublewrite buffer is located in the main tablespace (whether you activate the innodb_file_per_table or not) and you have no option to control anything about it.
The patch adds an option (innodb_doublewrite_file) to have a dedicated location for the doublewrite buffer.
Basically if you want to improve the I/O activity, you will put the doublewrite buffer on a different disk. But is it better on an SSD or a more traditional HDD? First you should note that pages are written in a circular fashion in the doublewrite buffer and only read on recovery. So the doublewrite buffer performs mostly sequential writes and a few sequential reads. Second HDDs are very good at sequential write if a write cache is enabled, which is not the case of SSDs. Therefore you should choose a fast HDD if you want to see performance benefits from this option. For instance, you could place the redo logs (also written in a sequential way) and the doublewrite buffer on the same disk.
Please report bugs to https://bugs.launchpad.net/percona-project/+filebug and ask general questions in maillist Percona-discussions
Discussion
Why my description about parallelization of write IO is removed? The doublewrite basically semi-serializes the write IO. The characteristic is not depend on its size. If you use good fs and storage about good parallelization of write IO. You should not use doublewrite buffer if you want performance of write IO. Currently, the usual fs of Linux is not so good to parallelize write IO, and the performance impact seems not so big. But in the future, the doublewrite will become ineffective for fast storage and good fs. I think the changing size of doublewrite buffer effects not so good.
My point is the semi-serialization effect is depend on turnaround of write IO to doublewrite buffer. You should choose extreme fast another device. (“not HDD”, like BBU-DRAM disk etc.). The sequential effect is alomost nothing. because it is extremely much write IO to the only 2MB area. So if the HDD has BBU write cache over 2MB only for the doublewrite buffer may be good also.
Sorry for many post…. In the case (2M BBU HDD only for doublewrite), increasing doublewrite buffer over 2MB should cause bad effect. “bigger is not better always”