
Latest MySQL Performance Blog posts

Percona's MySQL & InnoDB performance and scalability blog

The ARCHIVE Storage Engine – does it do what you expect?

January 6, 2014 - 12:00am

Sometimes there is a need to keep large amounts of old, rarely used data without investing too much in expensive storage. Very often such data no longer needs to be updated, or the intent is to leave it untouched. I sometimes wonder what I should really suggest to our Support customers in such cases.

For this purpose, the ARCHIVE storage engine, added in MySQL 4.1.3, seems perfect as it provides excellent compression and the only DML statement it allows is INSERT. However, does it really work as you would expect?

First of all, it has some serious limitations. Apart from the lack of support for DELETE, REPLACE and UPDATE statements (which may be acceptable for some needs), it also does not allow you to have indexes, although you can have an auto_increment column as either a unique or non-unique index. So a straightforward conversion of your existing tables to the ARCHIVE engine will usually not be possible. See the list of features for reference.
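To illustrate the index limitation, here is a minimal sketch (table and column names are made up, not from the original post): ARCHIVE accepts at most one index, and only on the auto_increment column, so any additional secondary key is rejected at CREATE TABLE time:

```sql
-- This definition fails with ARCHIVE: only the auto_increment column
-- may be indexed, so the extra KEY on `created` is rejected.
CREATE TABLE logs_bad (
  id int NOT NULL AUTO_INCREMENT,
  msg varchar(255),
  created datetime,
  PRIMARY KEY (id),
  KEY (created)            -- not allowed with ENGINE=ARCHIVE
) ENGINE=ARCHIVE;

-- This definition is accepted: a single auto_increment column as the only key.
CREATE TABLE logs_ok (
  id int NOT NULL AUTO_INCREMENT,
  msg varchar(255),
  created datetime,
  PRIMARY KEY (id)
) ENGINE=ARCHIVE;
```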

But unfortunately, it does not always work as the manual says, even within its described limitations. See the following very simple examples.

Problem I

Does the archive storage engine really ensure uniqueness for a primary or unique key?

mysql> CREATE TABLE `b` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=ARCHIVE;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into b values (null),(null),(null),(null);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from b;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
+----+
4 rows in set (0.01 sec)

mysql> repair table b;
+--------+--------+----------+----------+
| Table  | Op     | Msg_type | Msg_text |
+--------+--------+----------+----------+
| test.b | repair | status   | OK       |
+--------+--------+----------+----------+
1 row in set (0.00 sec)

mysql> insert into b values (null),(null);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from b;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
|  1 |
|  2 |
+----+
6 rows in set (0.01 sec)

mysql> show indexes from b\G
*************************** 1. row ***************************
        Table: b
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id
    Collation: NULL
  Cardinality: NULL
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: NONE
      Comment:
Index_comment:
1 row in set (0.00 sec)

That is really bad: a primary key column effectively allows duplicates! Here is another case exposing the same problem:

mysql> CREATE TABLE `c` ( `id` int(11) NOT NULL AUTO_INCREMENT, UNIQUE KEY (`id`) ) ENGINE=ARCHIVE;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into c values (null),(null),(null);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.01 sec)

mysql> optimize table c;
+--------+----------+----------+----------+
| Table  | Op       | Msg_type | Msg_text |
+--------+----------+----------+----------+
| test.c | optimize | status   | OK       |
+--------+----------+----------+----------+
1 row in set (0.01 sec)

mysql> insert into c values (null);
Query OK, 1 row affected (0.00 sec)

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  1 |
+----+
4 rows in set (0.01 sec)

So even a simple OPTIMIZE TABLE command breaks the table completely. And once we realize that such an operation has corrupted our data, we cannot easily go back to a different engine without sacrificing uniqueness first:

mysql> alter table c engine=innodb;
ERROR 1062 (23000): ALTER TABLE causes auto_increment resequencing, resulting in duplicate entry '1' for key 'id'

mysql> alter table c drop key id;
ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key

mysql> alter table c drop key id, add key(id);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> alter table c engine=innodb;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

There were already bug reports related to the auto_increment feature being broken, but I have filed a new, more specific bug report about this problem.
————–

Problem II

Are we always able to alter a table to use the ARCHIVE storage engine, even when the table theoretically uses a supported definition? It is the auto_increment column issue again…

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.01 sec)

We have the same c table using ARCHIVE. We can change its engine to something different:

mysql> alter table c engine=innodb;
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

But in some cases, we can’t set it back to archive!

mysql> alter table c engine=archive;
ERROR 1022 (23000): Can't write; duplicate key in table '#sql-1649_3'

There is an old bug report about that.
————–

Problem III

And yet another weirdness around auto_increment values. It seems normal that databases allow us to insert explicit values into auto_increment columns, even lower than the last inserted maximum, and all other engines (MyISAM, Memory and InnoDB) do allow that:

mysql> CREATE TABLE ai (a int auto_increment primary key) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into ai values (10);
Query OK, 1 row affected (0.00 sec)

mysql> insert into ai values (1);
Query OK, 1 row affected (0.00 sec)

mysql> select * from ai;
+----+
| a  |
+----+
|  1 |
| 10 |
+----+
2 rows in set (0.00 sec)

But that is not the case for the ARCHIVE engine:

mysql> CREATE TABLE aa (a int auto_increment primary key) ENGINE=Archive;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into aa values (10);
Query OK, 1 row affected (0.00 sec)

mysql> insert into aa values (1);
ERROR 1022 (23000): Can't write; duplicate key in table 'aa'

This undocumented behavior was reported here.

Summary

The ARCHIVE storage engine provides very good compression and is available in all MySQL variants out of the box. However, it has serious limitations, and in some cases it works unreliably and not as expected.

Related articles worth mentioning here:
http://www.mysqlperformanceblog.com/2006/11/12/trying-archive-storage-engine/
http://www.mysqlperformanceblog.com/2013/02/11/adventures-in-archiving/

In my next blog post, I am going to present simple research on the (free) alternatives we have that can replace the ARCHIVE storage engine in terms of disk space efficiency.

The post The ARCHIVE Storage Engine – does it do what you expect? appeared first on MySQL Performance Blog.

Multiple column index vs multiple indexes with MySQL 5.6

January 3, 2014 - 7:10am

A question that often comes up when talking about indexing: should we use a multiple column index or multiple indexes on single columns? Peter Zaitsev wrote about this back in 2008 and the conclusion then was that a multiple column index is most often the best solution. But with all the recent optimizer improvements, is anything different with MySQL 5.6?

Setup

For this test, we will use these 2 tables (same structure as in Peter’s post):

CREATE TABLE t1000merge (
  id int not null auto_increment primary key,
  i int(11) NOT NULL,
  j int(11) NOT NULL,
  val char(10) NOT NULL,
  KEY i (i),
  KEY j (j)
) ENGINE=InnoDB;

CREATE TABLE t1000idx2 (
  id int not null auto_increment primary key,
  i int(11) NOT NULL,
  j int(11) NOT NULL,
  val char(10) NOT NULL,
  KEY ij (i, j)
) ENGINE=InnoDB;

Tables were populated with 1M rows for this test, i and j have 1000 distinct values (independent of each other). The buffer pool is large enough to hold all data and indexes.
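The post does not show how the data was loaded; a minimal sketch matching the description (1M rows, i and j each drawn independently from 1000 values) could look like this, with the procedure name being a made-up example:

```sql
-- Hypothetical data load: 1,000,000 rows where i and j each take
-- 1000 distinct values independently of each other.
DELIMITER $$
CREATE PROCEDURE populate_t1000idx2()
BEGIN
  DECLARE n INT DEFAULT 0;
  WHILE n < 1000000 DO
    INSERT INTO t1000idx2 (i, j, val)
    VALUES (FLOOR(1 + RAND() * 1000), FLOOR(1 + RAND() * 1000), 'aaaaaaaaaa');
    SET n = n + 1;
  END WHILE;
END$$
DELIMITER ;

CALL populate_t1000idx2();   -- repeat the same idea for t1000merge
```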

We will look at this query on MySQL 5.5.35 and MySQL 5.6.15:

SELECT sum(length(val)) FROM T WHERE j=2 AND i BETWEEN 100 and 200

Why this specific query? With MySQL 5.5, for t1000idx2, the optimizer estimates that the index on (i,j) is not selective enough and it falls back to a full table scan. While for t1000merge, the index on (j) is an obvious good candidate to filter efficiently.

Consequently this query has a much better response time on t1000merge (0.01s) than on t1000idx2 (0.45s).

On MySQL 5.6, this query is a good candidate for index condition pushdown (ICP), so we can reasonably hope that response time for t1000idx2 will improve.

ICP: FORCE INDEX to the rescue

Unfortunately the optimizer still prefers the full table scan which gives us the same bad response time:

mysql5.6> EXPLAIN SELECT sum(length(val)) FROM t1000idx2 WHERE j=2 AND i BETWEEN 100 and 200;
+----+-------------+-----------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table     | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-----------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | t1000idx2 | ALL  | ij            | NULL | NULL    | NULL | 1000545 | Using where |
+----+-------------+-----------+------+---------------+------+---------+------+---------+-------------+

And what if we use FORCE INDEX?

mysql5.6> EXPLAIN SELECT sum(length(val)) FROM t1000idx2 FORCE INDEX(ij) WHERE j=2 AND i BETWEEN 100 and 200;
+----+-------------+-----------+-------+---------------+------+---------+------+--------+-----------------------+
| id | select_type | table     | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
+----+-------------+-----------+-------+---------------+------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | t1000idx2 | range | ij            | ij   | 8       | NULL | 188460 | Using index condition |
+----+-------------+-----------+-------+---------------+------+---------+------+--------+-----------------------+

This time ICP is used (see “Using index condition” in the Extra field)!

And the difference in response time is impressive:
- Without FORCE INDEX (full table scan): 0.45s
- With FORCE INDEX (multiple column index + index condition pushdown): 0.04s, a 10x improvement!

Additional thoughts

It is interesting to see that the optimizer fails to find the best execution plan for this simple query. The optimizer trace sheds some light:

mysql> SET optimizer_trace="enabled=on";
mysql> SELECT sum(length(val)) FROM T WHERE j=2 AND i BETWEEN 100 and 200;
mysql> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
[...]
"range_analysis": {
  "table_scan": {
    "rows": 1000545,
    "cost": 202835
  },
[...]

This is the estimated cost for a full table scan.
Now we will see how the optimizer estimates the cost of the range scan using the ij index:

[...]
"range_scan_alternatives": [
  {
    "index": "ij",
    "ranges": [
      "100 <= i <= 200"
    ],
    "index_dives_for_eq_ranges": true,
    "rowid_ordered": false,
    "using_mrr": false,
    "index_only": false,
    "rows": 188460,
    "cost": 226153,
    "chosen": false,
    "cause": "cost"
  }
]
[...]

At this stage the optimizer does not know whether ICP can be used, which probably explains why the cost of the range scan is overestimated: 226153 for the range scan versus 202835 for the full table scan, so the range scan is rejected on cost.

If we look at the optimizer trace for the query with the FORCE INDEX hint, ICP is only detected after the range scan is chosen:

[...]
"refine_plan": [
  {
    "table": "`t1000idx2` FORCE INDEX (`ij`)",
    "pushed_index_condition": "((`t1000idx2`.`j` = 2) and (`t1000idx2`.`i` between 100 and 200))",
    "table_condition_attached": null,
    "access_type": "range"
  }
]
[...]

Conclusion

Multiple column index vs multiple indexes? Having indexes on single columns often leads the optimizer to use the index_merge access type, which is typically not as good as accessing a single index on multiple columns. MySQL 5.6 makes multiple column indexes more efficient than before thanks to index condition pushdown.

But don’t forget that the optimizer is not perfect: you may have to use index hints to benefit from this feature.

The post Multiple column index vs multiple indexes with MySQL 5.6 appeared first on MySQL Performance Blog.

How to recover table structure from .frm files with MySQL Utilities

January 2, 2014 - 12:00am

Table structures are stored in .frm files and in the InnoDB Data Dictionary. Sometimes, usually in data recovery issues, we need to recover those structures to be able to find the lost data or just to recreate the tables.

There are different ways to do it and we've already written about them in this blog. For example, we can use the data recovery tools to recover table structures from the InnoDB Dictionary or from the .frm files using a MySQL server. This blog post is an update of that last one. I will show you how to easily recover the structure from a .frm file, in some cases even without using a MySQL server. This makes the process faster and easily scriptable.
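Since mysqlfrm (introduced below) is a command-line tool, batch recovery really is just a short shell loop. A sketch, where the data directory path and output file name are examples and mysqlfrm is assumed to be on the PATH:

```shell
#!/bin/sh
# Sketch: recover the structure of every table in a directory of .frm
# files and collect all CREATE statements in one file (paths are examples).
DATADIR="${1:-/usr/local/mysql/data/test}"
OUT="recovered_schemas.sql"
: > "$OUT"                       # start with an empty output file
for f in "$DATADIR"/*.frm; do
    [ -e "$f" ] || continue      # the directory may contain no .frm files
    mysqlfrm --diagnostic "$f" >> "$OUT"
done
echo "Wrote $(wc -l < "$OUT" | tr -d ' ') lines to $OUT"
```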

MySQL Utilities and mysqlfrm

MySQL Utilities is a set of scripts released by Oracle that helps us do common DBA tasks more easily. It is written in Python and its only dependency is MySQL Connector/Python. From the large list of utilities we are going to use mysqlfrm, the tool that will help us recover the structure.

As usual, an example is worth a thousand words. Let's recover some table structures.

This is the table we have:

CREATE TABLE `new_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `age` tinyint(4) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `name_idx` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

So, let’s try to recover that info from the .frm file and let’s see what we can get:

$ mysqlfrm --diagnostic /usr/local/mysql/data/test/new_table.frm
# WARNING: Cannot generate character set or collation names without the --server option.
[...]
CREATE TABLE `test`.`new_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `age` tinyint(4) NOT NULL,
PRIMARY KEY `PRIMARY` (`id`),
KEY `name_idx` (`name`)
) ENGINE=InnoDB;

A pretty good result!

It is important to mention that this tool has two different ways to do the recovery.

  • The first one spawns a new MySQL instance and runs the structure recovery there, pretty similar to the method PeterZ explained in his blog post. You need to use --server or --basedir along with --port. The tool shuts the spawned instance down after the recovery is complete.
  • The second one, used with --diagnostic, reads the .frm file byte by byte to recover all the information possible, without requiring a MySQL instance. Therefore, this method can be used to recover as much information as possible from damaged .frm files that even MySQL itself can't read.

As we can see from the warning in the last example, not all information can be recovered with the second method. For example, the character set and collation can't be recovered without the --server option (first method). Let's see how to use a spawned server to recover the .frm info:

$ mysqlfrm --server=root@127.0.0.1 --port 3307 ./new_table.frm
# Source on 127.0.0.1: ... connected.
# Starting the spawned server on port 3307 ... done.
# Reading .frm files
#
# Reading the new_table.frm file.
#
# CREATE statement for ./new_table.frm:
#
CREATE TABLE `new_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `age` tinyint(4) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `name_idx` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

The tool connects to the MySQL server, gets all the info it needs (basedir and so on) and spawns a new instance on port 3307. It then uses that new instance to recover the info from the .frm file. Fast and easy!

It is worth mentioning that not all the information we might need is stored in those .frm files. There are some things we won't be able to recover, for example foreign key constraints and auto_increment sequence positions.
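Anything not stored in the .frm therefore has to be re-applied by hand from another source (a binary log, an old dump, application knowledge). A hedged sketch, where the auto_increment value, constraint name and parent table are all hypothetical examples:

```sql
-- The .frm does not store the auto_increment position or foreign keys,
-- so after recreating the table both must be restored manually
-- (the value and names below are made-up examples).
ALTER TABLE `new_table` AUTO_INCREMENT = 100000;

-- Foreign keys would likewise need to be re-created from another source,
-- assuming the table had a column referencing a hypothetical parent table:
-- ALTER TABLE `new_table`
--   ADD CONSTRAINT `fk_new_table_parent`
--   FOREIGN KEY (`parent_id`) REFERENCES `parent` (`id`);
```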

Conclusion

MySQL Utilities is a very useful set of tools. In this particular case mysqlfrm can be used to recover a large number of table structures from their .frm files, making the process fast and easy to script.

The post How to recover table structure from .frm files with MySQL Utilities appeared first on MySQL Performance Blog.

Percona Monitoring Plugins 1.1.1 release

December 30, 2013 - 7:13am

Percona announces the release of Percona Monitoring Plugins 1.1.1 to address a critical bug that appeared after the 1.1 upgrade.

Changelog:

* Cacti mysql graphs stop working with data input field “server-id” after 1.1 upgrade (bug 1264814)
* Non-integer poller errors for MySQL Query Response Time (bug 1264353)

A new tarball is available from the downloads area, and RPM and DEB packages from our software repositories. The plugins are fully supported for customers with a Percona Support contract, and free installation services are provided as part of some contracts. In addition, installation and setup of these tools are included as part of Percona's Remote DBA service. You can find links to the documentation, forums and more at the project homepage.

About Percona Monitoring Plugins

Percona Monitoring Plugins are high-quality components to add enterprise-grade MySQL monitoring and graphing capabilities to your existing in-house, on-premises monitoring solutions. The components are designed to integrate seamlessly with widely deployed solutions such as Nagios and Cacti and are delivered in the form of templates, plugins, and scripts which make it easy to monitor MySQL performance.

The post Percona Monitoring Plugins 1.1.1 release appeared first on MySQL Performance Blog.
