
Latest MySQL Performance Blog posts


Percona's MySQL & InnoDB performance and scalability blog

The use of Iptables ClusterIP target as a load balancer for PXC, PRM, MHA and NDB

January 10, 2014 - 12:00am

Most technologies achieving high availability for MySQL need a load balancer to spread the client connections to a valid database host; even the special Tungsten connector can be seen as a sophisticated load balancer. People often use a hardware load balancer or a software solution like HAProxy. In both cases, in order to avoid having a single point of failure, multiple load balancers must be used. Load balancers have two drawbacks: they increase network latency and/or they add a health-check load on the database servers. The increased network latency is obvious in the case of standalone load balancers, where you must first connect to the load balancer, which then completes the request by connecting to one of the database servers. Some workloads, like reporting/ad hoc queries, are not affected by a small increase in latency, but other workloads, like OLTP processing and real-time logging, are. Each load balancer must also check regularly whether the database servers are in a sane state, so adding more load balancers increases the idle chatter over the network. In order to reduce these impacts, a very different type of load balancer is needed. Let me introduce the Iptables ClusterIP target.

Normally, as stated by RFC 1812, "Requirements for IP Version 4 Routers," an IP address must be unique on a network and each host must respond only for the IPs it owns. In order to achieve a load balancing behavior, the Iptables ClusterIP target doesn’t strictly respect the RFC. The principle is simple: each computer in the cluster shares an IP address and MAC address with the other members, but it answers requests only for a given subset of the traffic, based on the modulo of a hash of network values, sourceip-sourceport by default. The behavior is controlled by an iptables rule and by the content of the kernel file /proc/net/ipt_CLUSTERIP/VIP_ADDRESS. The /proc file simply tells the kernel which portion of the traffic it should answer. I don’t want to go too deep into the details here, since all those things are handled by the Pacemaker resource agent, IPaddr2.
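To illustrate the principle, here is a toy sketch in shell (the kernel uses its own hash of the connection fields, not this made-up one): every node computes the same deterministic hash of (source IP, source port) modulo the number of cluster nodes, and answers the request only if the result maps to its own node number.

```shell
# Toy model of ClusterIP bucket selection -- NOT the kernel's real hash.
TOTAL_NODES=3
responsible_node() {
  local src_ip=$1 src_port=$2 sum=0 octet
  # toy hash: sum the IP octets and the source port
  for octet in ${src_ip//./ }; do sum=$((sum + octet)); done
  # every node computes the same bucket; only the matching node answers
  echo $(( (sum + src_port) % TOTAL_NODES + 1 ))
}
responsible_node 192.168.1.10 56012
```

Because the computation is purely local and deterministic, no chatter between the nodes is needed to agree on who answers a given connection.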

The IPaddr2 Pacemaker resource agent is commonly used for VIPs, but what is less known is its behavior when defined as part of a clone set. When part of a clone set, the resource agent defines a VIP which uses the Iptables ClusterIP target; the iptables rules and the handling of the /proc file are all done automatically. That seems very nice in theory but, until recently, I never succeeded in getting a suitable distribution behavior. When starting the clone set on, let’s say, three nodes, it distributes correctly, one instance on each, but if two nodes fail and then recover, the clone instances all go to the third node and stay there, even after the first two nodes recover. That bugged me for quite a while, but I finally modified the resource agent and found a way to make it work correctly. It also now correctly sets the MAC address, if none is provided, to one in the multicast MAC address range, which starts with "01:00:5E". The new agent, IPaddr3, is available here. Now, let’s show what we can achieve with it.

We’ll start from the setup described in my previous post and we’ll modify it. First, download and install the IPaddr3 agent.

root@pacemaker-1:~# wget -O /usr/lib/ocf/resource.d/percona/IPaddr3 https://github.com/percona/percona-pacemaker-agents/raw/master/agents/IPaddr3
root@pacemaker-1:~# chmod u+x /usr/lib/ocf/resource.d/percona/IPaddr3

Repeat these steps on all 3 nodes. Then, we’ll modify the pacemaker configuration like this (I’ll explain below):

node pacemaker-1 \
        attributes standby="off"
node pacemaker-2 \
        attributes standby="off"
node pacemaker-3 \
        attributes standby="off"
primitive p_cluster_vip ocf:percona:IPaddr3 \
        params ip="172.30.212.100" nic="eth1" \
        meta resource-stickiness="0" \
        op monitor interval="10s"
primitive p_mysql_monit ocf:percona:mysql_monitor \
        params reader_attribute="readable_monit" writer_attribute="writable_monit" user="repl_user" password="WhatAPassword" pid="/var/lib/mysql/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" max_slave_lag="5" cluster_type="pxc" \
        op monitor interval="1s" timeout="30s" OCF_CHECK_LEVEL="1"
clone cl_cluster_vip p_cluster_vip \
        meta clone-max="3" clone-node-max="3" globally-unique="true"
clone cl_mysql_monitor p_mysql_monit \
        meta clone-max="3" clone-node-max="1"
location loc-distrib-cluster-vip cl_cluster_vip \
        rule $id="loc-distrib-cluster-vip-rule" -1: p_cluster_vip_clone_count gt 1
location loc-enable-cluster-vip cl_cluster_vip \
        rule $id="loc-enable-cluster-vip-rule" 2: writable_monit eq 1
location loc-no-cluster-vip cl_cluster_vip \
        rule $id="loc-no-cluster-vip-rule" -inf: writable_monit eq 0
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="3" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1384275025" \
        maintenance-mode="off"

First, the VIP primitive is modified to use the new agent, IPaddr3, and we set resource-stickiness="0". Next, we define the cl_cluster_vip clone set using: clone-max="3" to have three instances, clone-node-max="3" to allow up to three instances on the same node, and globally-unique="true" to tell Pacemaker it has to allocate an instance on a node even if there’s already one. Finally, there are three location rules needed to get the behavior we want: one using the p_cluster_vip_clone_count attribute and the other two based on the writable_monit attribute. Enabling all that gives:

root@pacemaker-1:~# crm_mon -A1
============
Last updated: Tue Jan  7 10:51:38 2014
Last change: Tue Jan  7 10:50:38 2014 via cibadmin on pacemaker-1
Stack: openais
Current DC: pacemaker-2 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
6 Resources configured.
============
Online: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
 Clone Set: cl_cluster_vip [p_cluster_vip] (unique)
     p_cluster_vip:0    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:1    (ocf::percona:IPaddr3): Started pacemaker-1
     p_cluster_vip:2    (ocf::percona:IPaddr3): Started pacemaker-2
 Clone Set: cl_mysql_monitor [p_mysql_monit]
     Started: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
Node Attributes:
* Node pacemaker-1:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 1
    + writable_monit            : 1
* Node pacemaker-2:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 1
    + writable_monit            : 1
* Node pacemaker-3:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 1
    + writable_monit            : 1

and the network configuration is:

root@pacemaker-1:~# iptables -L INPUT -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
CLUSTERIP  all  --  0.0.0.0/0            172.30.212.100       CLUSTERIP hashmode=sourceip-sourceport clustermac=01:00:5E:91:18:86 total_nodes=3 local_node=1 hash_init=0
root@pacemaker-1:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
2
root@pacemaker-2:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
3
root@pacemaker-3:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
1

In order to test the access, you need to query the VIP from a fourth node:

root@pacemaker-4:~# while [ 1 ]; do mysql -h 172.30.212.100 -u repl_user -pWhatAPassword -BN -e "select variable_value from information_schema.global_variables where variable_name like 'hostname';"; sleep 1; done
pacemaker-1
pacemaker-1
pacemaker-2
pacemaker-2
pacemaker-2
pacemaker-3
pacemaker-2
^C

So far, so good… Let’s now desync pacemaker-1 and pacemaker-2.

root@pacemaker-1:~# mysql -e 'set global wsrep_desync=1;'
root@pacemaker-2:~# mysql -e 'set global wsrep_desync=1;'
root@pacemaker-3:~# crm_mon -A1
============
Last updated: Tue Jan  7 10:53:51 2014
Last change: Tue Jan  7 10:50:38 2014 via cibadmin on pacemaker-1
Stack: openais
Current DC: pacemaker-2 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
6 Resources configured.
============
Online: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
 Clone Set: cl_cluster_vip [p_cluster_vip] (unique)
     p_cluster_vip:0    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:1    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:2    (ocf::percona:IPaddr3): Started pacemaker-3
 Clone Set: cl_mysql_monitor [p_mysql_monit]
     Started: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
Node Attributes:
* Node pacemaker-1:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 0
    + writable_monit            : 0
* Node pacemaker-2:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 0
    + writable_monit            : 0
* Node pacemaker-3:
    + p_cluster_vip_clone_count : 3
    + readable_monit            : 1
    + writable_monit            : 1
root@pacemaker-3:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
1,2,3
root@pacemaker-4:~# while [ 1 ]; do mysql -h 172.30.212.100 -u repl_user -pWhatAPassword -BN -e "select variable_value from information_schema.global_variables where variable_name like 'hostname';"; sleep 1; done
pacemaker-3
pacemaker-3
pacemaker-3
pacemaker-3
pacemaker-3
pacemaker-3

Now, if pacemaker-1 and pacemaker-2 are back in sync, we have the desired distribution:

root@pacemaker-1:~# mysql -e 'set global wsrep_desync=0;'
root@pacemaker-2:~# mysql -e 'set global wsrep_desync=0;'
root@pacemaker-3:~# crm_mon -A1
============
Last updated: Tue Jan  7 10:58:40 2014
Last change: Tue Jan  7 10:50:38 2014 via cibadmin on pacemaker-1
Stack: openais
Current DC: pacemaker-2 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
6 Resources configured.
============
Online: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
 Clone Set: cl_cluster_vip [p_cluster_vip] (unique)
     p_cluster_vip:0    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:1    (ocf::percona:IPaddr3): Started pacemaker-1
     p_cluster_vip:2    (ocf::percona:IPaddr3): Started pacemaker-2
 Clone Set: cl_mysql_monitor [p_mysql_monit]
     Started: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
Node Attributes:
* Node pacemaker-1:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 1
    + writable_monit            : 1
* Node pacemaker-2:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 1
    + writable_monit            : 1
* Node pacemaker-3:
    + p_cluster_vip_clone_count : 1
    + readable_monit            : 1
    + writable_monit            : 1

All the clone instances redistributed across the nodes, as we wanted.

In conclusion, Pacemaker with a clone set of IPaddr3 is a very interesting kind of load balancer, especially if you already have Pacemaker deployed. It introduces almost no latency, it doesn’t need any extra hardware, it doesn’t increase the database health-check load, and it is as highly available as your database is. The only drawback I can see is the case where the inbound traffic is very heavy: since all nodes receive all the traffic, they become equally saturated. With database and web type traffic, the inbound traffic is usually small. This solution also doesn’t redistribute connections based on server load the way a regular load balancer can, but that would be fairly easy to implement with something like a server_load attribute and an agent similar to mysql_monitor that checks the server load instead of the database status. In such a case, I suggest using many more than one VIP clone instance per node to get better granularity in the load distribution. Finally, the ClusterIP target, although still fully supported, has been deprecated in favor of the Cluster-match target. It is basically the same principle, and I plan to adapt the IPaddr3 agent to Cluster-match in the near future.

The post The use of Iptables ClusterIP target as a load balancer for PXC, PRM, MHA and NDB appeared first on MySQL Performance Blog.

Percona Cloud Tools: Making MySQL performance easy

January 9, 2014 - 7:27am

One of our primary focuses at Percona is performance. Let me make a few statements about what "performance" is.

In doing so I will refer to two pieces of content:

I highly recommend that you familiarize yourself with both of them.

Performance

Performance is about tasks and time.
We say that a system is performing well if it executes a task in an acceptable period of time, or if it executes the required number of tasks per given period (second, minute, hour, etc.).

To put it in our area of interest: what is a task for a database? Often we consider it to be an individual query, and this level of detail is good enough. One may think of a database task as a "transaction," or as the set of queries executed during a web page access, but I propose we focus on queries.

From this perspective, the database (MySQL) shows good performance if it executes a query in a reasonable amount of time (this is what we know as response time), or executes the expected number of queries per second (hour/day) (this is throughput).

So we have connected query execution with performance.
And let me state this: most database performance problems are related to queries. Either there are too many queries to handle (the system is overloaded; we ask it to perform more tasks than it is able to handle), or a query takes a long time to execute (the system is not handling the task in an optimal way). Of course, there could be other reasons, like hardware failure, database misconfiguration, OS issues, etc., but in most cases it is queries that are the problem.

“Performance is HARD”

Now, in [1] Cary Millsap states that "Performance is HARD," and that is true if you do not know where to look.
Very often people look in funny places like the cache hit ratio ("You have only a 93% cache hit ratio, that's your problem!"), the amount of I/O per second, CPU utilization, aborted_clients, handler_select, you name it. Looking at these numbers with some experience and black magic may give you some results, or may not. When you look at these numbers, draw some conclusions, and then make some changes in the hope that they help: that is performance tuning, not performance optimization [1].

Fortunately, performance is EASY if you know what to look at [1].
With Percona Cloud Tools we make performance EASY: our tools show you what you should be looking for.

Right now we have one module: Query Analytics, which is focused on queries. Query Analytics is based on our command-line tool pt-query-digest from Percona Toolkit, and it implements the best practices we use when analyzing queries on customers' systems. Peter actually calls this tool "Peter in a box."

Percona Cloud Tools / Query Analytics

Let me review what information Query Analytics provides…

1. General Query Profile
This shows which queries take the most time in the database, with information on how many times each query executed, the total time it took, the average time per query, and the 95% response time.
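To make those aggregates concrete, here is a small sketch using made-up response times in milliseconds. The nearest-rank method shown is one common way to compute a 95th percentile; Query Analytics' exact method may differ.

```shell
# Hypothetical response times (ms) for one query class; note the one slow outlier.
times="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 500"
echo $times | tr ' ' '\n' | sort -n | awk '
  { t[NR] = $1; sum += $1 }
  END {
    p95 = t[int(NR * 0.95 + 0.999)]   # nearest-rank 95th percentile
    printf "count=%d total=%d avg=%.1f p95=%d\n", NR, sum, sum / NR, p95
  }'
```

Note how the average (34.5 ms) is dragged up by the single 500 ms outlier, while the 95% response time (19 ms) still describes what almost all users experienced. This is why the profile shows the 95% figure rather than only the average.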

Let’s see an example for MySQLPerformanceBlog.com over a one-week time frame. This blog is WordPress-based, so the queries come from WordPress:

The two top queries are related to users:
SELECT wp_usermeta and SELECT wp_users, and we can see that both of them were executed 69M times over the last week. The 95% response time for the first one, though, is 277.00µs, which is just fine.
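As a sanity check on that execution count, 69M executions in one week works out to a steady rate of roughly 114 queries per second for this single query class:

```shell
# 69 million executions spread over one week (7 days * 86400 seconds)
echo $(( 69000000 / (7 * 24 * 3600) ))
```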

From there we can click and see more details about a query:

Query Text and

Metrics

…which show that the execution time for the query ranged from 5.00µs (Min) to 529.74ms (Max).
A bunch of metrics, like the InnoDB* ones, are available only if you are using Percona Server.

What is more interesting is that Query Analytics lets us see historical data.
If we switch to the graph, we can see the data for the last week:

…where we clearly see a spike in the number of queries. This corresponds to the published post Multiple column index vs multiple indexes with MySQL 5.6.

And there we can switch between the Total and 95% graphs,

…which show that even with the spike, the 95% response time of this query was not affected, meaning MySQL (Percona Server) was able to handle the load.

Query Analytics helps you:

  • Make sure your queries (tasks) perform as expected
  • Verify that queries execute within the time frame you need
  • If not, see which query (or queries) are problematic and require attention
  • Get detailed metrics for each query

Percona Cloud Tools makes MySQL performance easy!

You can (and should) sign up for the free Beta and try the tool yourself.


Finding a good IST donor in Percona XtraDB Cluster 5.6

January 8, 2014 - 3:00am
Gcache and IST

The Gcache is a memory-based cache of recent Galera transactions that is local to each node in a cluster.  If a node leaves and rejoins the cluster, it can use the gcache from another node that stayed in the cluster (i.e., its donor node) to fetch the transactions it missed (IST) as opposed to doing a full state snapshot transfer (SST).  However, there are a few nuances that are not obvious to the beginner:

  • The Gcache is lost when a node restarts
  • The Gcache is a fixed size and implemented as an LRU.  Once it is full, older transactions roll off.
  • Donor selection is made regardless of the gcache state
  • If the given donor for a restarting node doesn’t have all transactions needed, a full SST (read: full backup) is done instead
  • Until recent developments, there was no way to tell what, precisely, was in the Gcache.
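Because the gcache is a fixed-size LRU, its size is the main lever over how long a node can stay down and still IST. It is set through wsrep_provider_options in my.cnf; a sketch follows, where the 1G value is purely illustrative, not a sizing recommendation:

```ini
[mysqld]
# A larger gcache retains more transactions, widening the IST window
# after a node outage. 1G here is an illustrative value only.
wsrep_provider_options="gcache.size=1G"
```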

So, with (somewhat) arbitrary donor selection, it was hard to be certain that a node restart would not trigger an SST.  For example:

  • A node crashed overnight or was otherwise down for some length of time.  How do you know if the gcache on any node is big enough to contain all the transactions necessary for IST?
  • If you bring two nodes in your cluster down simultaneously, the second one you restart might select the first one as its donor and be forced to SST.
Along comes PXC 5.6.15 RC1

Astute readers of the PXC 5.6.15 release notes will have noticed this little tidbit:

New wsrep_local_cached_downto status variable has been introduced. This variable shows the lowest sequence number in gcache. This information can be helpful with determining IST and/or SST.

Until this release there was no visibility into any node’s Gcache and what was likely to happen when restarting a node.  You could make some assumptions, but now it is a bit easier to:

  1. Tell if a given node would be a suitable donor
  2. And hence select a donor manually using wsrep_sst_donor instead of leaving it to chance.

 

What it looks like

Suppose I have a 3 node cluster where load is hitting node1.  I execute the following in sequence:

  1. Shut down node2
  2. Shut down node3
  3. Restart node2

At step 3, node1 is the only viable donor for node2.  Because our restart was quick, we can have some reasonable assurance that node2 will IST correctly (and it does).

However, before we restart node3, let’s check the oldest transaction in the gcache on nodes 1 and 2:

[root@node1 ~]# mysql -e "show global status like 'wsrep_local_cached_downto';"
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_cached_downto | 889703 |
+---------------------------+--------+
[root@node2 mysql]# mysql -e "show global status like 'wsrep_local_cached_downto';"
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_local_cached_downto | 1050151 |
+---------------------------+---------+

So we can see that node1 has a much more "complete" gcache than node2 does (i.e., a much smaller seqno). Node2's gcache was wiped when it restarted, so it only has transactions from after its restart.

To check node3's GTID, we can either check grastate.dat or (if it has crashed and the grastate is zeroed) use --wsrep-recover:

[root@node3 ~]# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid:    7206c8e4-7705-11e3-b175-922feecc92a0
seqno:   1039191
cert_index:
[root@node3 ~]# mysqld_safe --wsrep-recover
140107 16:18:37 mysqld_safe Logging to '/var/lib/mysql/error.log'.
140107 16:18:37 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140107 16:18:37 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.pIVkT4' --pid-file='/var/lib/mysql/node3-recover.pid'
140107 16:18:39 mysqld_safe WSREP: Recovered position 7206c8e4-7705-11e3-b175-922feecc92a0:1039191
140107 16:18:41 mysqld_safe mysqld from pid file /var/lib/mysql/node3.pid ended

So, armed with this information, we can tell what would happen to node3, depending on which donor was selected:

Donor selected | Donor's gcache oldest seqno | Node3's seqno | Result for node3
node2          | 1050151                     | 1039191       | SST
node1          | 889703                      | 1039191       | IST
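That donor decision can be expressed as a small helper (a sketch; it assumes the donor's gcache is contiguous from its wsrep_local_cached_downto value up to its current position):

```shell
# A donor can serve IST only if its gcache still holds every transaction
# the joiner is missing, i.e. everything after the joiner's last seqno.
predict_transfer() {
  local donor_cached_downto=$1 joiner_seqno=$2
  if [ $((joiner_seqno + 1)) -ge "$donor_cached_downto" ]; then
    echo IST
  else
    echo SST
  fi
}
predict_transfer 889703  1039191   # node1 as donor
predict_transfer 1050151 1039191   # node2 as donor
```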

So, we can instruct node3 to use node1 as its donor on restart with wsrep_sst_donor:

[root@node3 ~]# service mysql start --wsrep_sst_donor=node1

Note that passing mysqld options on the command line is only supported in RPM packages; on Debian you must put that setting in your my.cnf.  We can see from node3's log that it does properly IST:

2014-01-07 16:23:26 19834 [Note] WSREP: Prepared IST receiver, listening at: tcp://192.168.70.4:4568
2014-01-07 16:23:26 19834 [Note] WSREP: Node 0.0 (node3) requested state transfer from 'node1'. Selected 2.0 (node1)(SYNCED) as donor.
...
2014-01-07 16:23:27 19834 [Note] WSREP: Receiving IST: 39359 writesets, seqnos 1039191-1078550
2014-01-07 16:23:27 19834 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.15-56'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release 25.2, Revision 645, wsrep_25.2.r4027
2014-01-07 16:23:41 19834 [Note] WSREP: IST received: 7206c8e4-7705-11e3-b175-922feecc92a0:1078550
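On Debian, where the donor preference cannot be passed on the service command line, the equivalent would go in my.cnf (a sketch of the same setting):

```ini
[mysqld]
# Same effect as --wsrep_sst_donor=node1 on the RPM service command line
wsrep_sst_donor=node1
```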

Sometime in the future, this may be handled automatically on donor selection, but for now it is very useful that we can at least see the status of the gcache.


Increasing slow query performance with the parallel query execution

January 7, 2014 - 12:00am

MySQL and scaling up (using more powerful hardware) has always been a hot topic. Originally MySQL did not scale well with multiple CPUs; there were times when InnoDB performed worse with more CPU cores than with fewer. MySQL 5.6 scales significantly better; however, there is still one big limitation: a single SQL query will only ever use one CPU core (no parallelism). Here is what I mean by that: let’s say we have a complex query which needs to scan millions of rows and may need to create a temporary table; in this case MySQL is not able to scan the table in multiple threads (even with partitioning), so the single query will not be faster on the more powerful server. On the contrary, a server with more but slower CPUs may show worse performance than a server with fewer (but faster) CPUs.

To address this issue we can use parallel query execution. Vadim wrote about PHP asynchronous calls for MySQL. Another way to increase parallelism is a "sharding" approach, for example with Shard Query. I’ve decided to test parallel (asynchronous) query execution with a relatively large table: I’ve used the US flights ontime performance database, which was originally used by Vadim in the old post Analyzing air traffic performance. Let’s see how this can help us increase the performance of complex query reports.

Parallel Query Example

To illustrate the parallel query execution with MySQL I’ve created the following table:

CREATE TABLE `ontime` (
  `YearD` year(4) NOT NULL,
  `Quarter` tinyint(4) DEFAULT NULL,
  `MonthD` tinyint(4) DEFAULT NULL,
  `DayofMonth` tinyint(4) DEFAULT NULL,
  `DayOfWeek` tinyint(4) DEFAULT NULL,
  `FlightDate` date DEFAULT NULL,
  `UniqueCarrier` char(7) DEFAULT NULL,
  `AirlineID` int(11) DEFAULT NULL,
  `Carrier` char(2) DEFAULT NULL,
  `TailNum` varchar(50) DEFAULT NULL,
  `FlightNum` varchar(10) DEFAULT NULL,
  `OriginAirportID` int(11) DEFAULT NULL,
  `OriginAirportSeqID` int(11) DEFAULT NULL,
  `OriginCityMarketID` int(11) DEFAULT NULL,
  `Origin` char(5) DEFAULT NULL,
  `OriginCityName` varchar(100) DEFAULT NULL,
  `OriginState` char(2) DEFAULT NULL,
  `OriginStateFips` varchar(10) DEFAULT NULL,
  `OriginStateName` varchar(100) DEFAULT NULL,
  `OriginWac` int(11) DEFAULT NULL,
  `DestAirportID` int(11) DEFAULT NULL,
  `DestAirportSeqID` int(11) DEFAULT NULL,
  `DestCityMarketID` int(11) DEFAULT NULL,
  `Dest` char(5) DEFAULT NULL,
  -- ... (a number of fields removed)
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `YearD` (`YearD`),
  KEY `Carrier` (`Carrier`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And loaded 26 years of data into it. The table is 56G with ~152M rows.

Software: Percona Server 5.6.15-63.0. Hardware: Supermicro X8DTG-D; 48GB of RAM; 24 x Intel(R) Xeon(R) CPU L5639 @ 2.13GHz; 1 SSD drive (250GB).

So we have 24 relatively slow CPU cores.

Simple query

Now we can run some queries. The first query is very simple: find all flights per year (in the US):

select yeard, count(*) from ontime group by yeard

As we have the index on YearD, the query will use the index:

mysql> explain select yeard, count(*) from ontime group by yeard\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ontime
         type: index
possible_keys: YearD,comb1
          key: YearD
      key_len: 1
          ref: NULL
         rows: 148046200
        Extra: Using index
1 row in set (0.00 sec)

The query is simple; however, it will have to scan 150M rows. Here are the results of the query (cached):

mysql> select yeard, count(*) from ontime group by yeard;
+-------+----------+
| yeard | count(*) |
+-------+----------+
|  1988 |  5202096 |
|  1989 |  5041200 |
|  1990 |  5270893 |
|  1991 |  5076925 |
|  1992 |  5092157 |
|  1993 |  5070501 |
|  1994 |  5180048 |
|  1995 |  5327435 |
|  1996 |  5351983 |
|  1997 |  5411843 |
|  1998 |  5384721 |
|  1999 |  5527884 |
|  2000 |  5683047 |
|  2001 |  5967780 |
|  2002 |  5271359 |
|  2003 |  6488540 |
|  2004 |  7129270 |
|  2005 |  7140596 |
|  2006 |  7141922 |
|  2007 |  7455458 |
|  2008 |  7009726 |
|  2009 |  6450285 |
|  2010 |  6450117 |
|  2011 |  6085281 |
|  2012 |  6096762 |
|  2013 |  5349447 |
+-------+----------+
26 rows in set (54.10 sec)

The query took 54 seconds and utilized only one CPU core. However, this query is perfect for running in parallel: we can run 26 parallel queries, each counting its own year. I’ve used the following shell script to run the queries in the background:

#!/bin/bash
date
for y in {1988..2013}
do
        sql="select yeard, count(*) from ontime where yeard=$y"
        mysql -vvv ontime -e "$sql" &>par_sql1/$y.log &
done
wait
date

Here are the results:

Start: 11:41:21 EST 2014
End:   11:41:26 EST 2014

So the total execution time is ~5 seconds (10x faster). The individual results are here:

par_sql1/1988.log:1 row in set (3.70 sec)
par_sql1/1989.log:1 row in set (4.08 sec)
par_sql1/1990.log:1 row in set (4.59 sec)
par_sql1/1991.log:1 row in set (4.26 sec)
par_sql1/1992.log:1 row in set (4.54 sec)
par_sql1/1993.log:1 row in set (2.78 sec)
par_sql1/1994.log:1 row in set (3.41 sec)
par_sql1/1995.log:1 row in set (4.87 sec)
par_sql1/1996.log:1 row in set (4.41 sec)
par_sql1/1997.log:1 row in set (3.69 sec)
par_sql1/1998.log:1 row in set (3.56 sec)
par_sql1/1999.log:1 row in set (4.47 sec)
par_sql1/2000.log:1 row in set (4.71 sec)
par_sql1/2001.log:1 row in set (4.81 sec)
par_sql1/2002.log:1 row in set (4.19 sec)
par_sql1/2003.log:1 row in set (4.04 sec)
par_sql1/2004.log:1 row in set (5.12 sec)
par_sql1/2005.log:1 row in set (5.10 sec)
par_sql1/2006.log:1 row in set (4.93 sec)
par_sql1/2007.log:1 row in set (5.29 sec)
par_sql1/2008.log:1 row in set (5.59 sec)
par_sql1/2009.log:1 row in set (4.44 sec)
par_sql1/2010.log:1 row in set (4.91 sec)
par_sql1/2011.log:1 row in set (5.08 sec)
par_sql1/2012.log:1 row in set (4.85 sec)
par_sql1/2013.log:1 row in set (4.56 sec)

Complex Query 

Now we can try a more complex query. Let’s imagine we want to find out which airlines had the most delayed flights inside the continental US on business days from 1988 to 2009 (I was trying to come up with a complex query with multiple conditions in the where clause).

select min(yeard), max(yeard), Carrier, count(*) as cnt,
       sum(ArrDelayMinutes>30) as flights_delayed,
       round(sum(ArrDelayMinutes>30)/count(*),2) as rate
FROM ontime
WHERE DayOfWeek not in (6,7)
  and OriginState not in ('AK', 'HI', 'PR', 'VI')
  and DestState not in ('AK', 'HI', 'PR', 'VI')
  and flightdate < '2010-01-01'
GROUP by carrier
HAVING cnt > 100000 and max(yeard) > 1990
ORDER by rate DESC

As the query has a "group by" and an "order by" plus multiple ranges in the where clause, it will have to create a temporary table:

           id: 1
  select_type: SIMPLE
        table: ontime
         type: index
possible_keys: comb1
          key: comb1
      key_len: 9
          ref: NULL
         rows: 148046200
        Extra: Using where; Using temporary; Using filesort

(For this query I’ve created a combined index, KEY comb1 (Carrier,YearD,ArrDelayMinutes), to increase performance.)

The query runs in ~15 minutes:

+------------+------------+---------+----------+-----------------+------+
| min(yeard) | max(yeard) | Carrier | cnt      | flights_delayed | rate |
+------------+------------+---------+----------+-----------------+------+
|       2003 |       2009 | EV      |  1454777 |          237698 | 0.16 |
|       2006 |       2009 | XE      |  1016010 |          152431 | 0.15 |
|       2006 |       2009 | YV      |   740608 |          110389 | 0.15 |
|       2003 |       2009 | B6      |   683874 |          103677 | 0.15 |
|       2003 |       2009 | FL      |  1082489 |          158748 | 0.15 |
|       2003 |       2005 | DH      |   501056 |           69833 | 0.14 |
|       2001 |       2009 | MQ      |  3238137 |          448037 | 0.14 |
|       2003 |       2006 | RU      |  1007248 |          126733 | 0.13 |
|       2004 |       2009 | OH      |  1195868 |          160071 | 0.13 |
|       2003 |       2006 | TZ      |   136735 |           16496 | 0.12 |
|       1988 |       2009 | UA      |  9593284 |         1197053 | 0.12 |
|       1988 |       2009 | AA      | 10600509 |         1185343 | 0.11 |
|       1988 |       2001 | TW      |  2659963 |          280741 | 0.11 |
|       1988 |       2009 | CO      |  6029149 |          673863 | 0.11 |
|       2007 |       2009 | 9E      |   577244 |           59440 | 0.10 |
|       1988 |       2009 | DL      | 11869471 |         1156267 | 0.10 |
|       1988 |       2009 | NW      |  7601727 |          725460 | 0.10 |
|       1988 |       2009 | AS      |  1506003 |          146920 | 0.10 |
|       2003 |       2009 | OO      |  2654259 |          257069 | 0.10 |
|       1988 |       2009 | US      | 10276941 |          991016 | 0.10 |
|       1988 |       1991 | PA      |   206841 |           19465 | 0.09 |
|       1988 |       2005 | HP      |  2607603 |          235675 | 0.09 |
|       1988 |       2009 | WN      | 12722174 |         1107840 | 0.09 |
|       2005 |       2009 | F9      |   307569 |           28679 | 0.09 |
+------------+------------+---------+----------+-----------------+------+
24 rows in set (15 min 56.40 sec)

 

Now we can split this query and run 31 queries (= 31 distinct airlines in this table) in parallel. I have used the following script:

date
for c in '9E' 'AA' 'AL' 'AQ' 'AS' 'B6' 'CO' 'DH' 'DL' 'EA' 'EV' 'F9' 'FL' 'HA' 'HP' 'ML' 'MQ' 'NW' 'OH' 'OO' 'PA' 'PI' 'PS' 'RU' 'TW' 'TZ' 'UA' 'US' 'WN' 'XE' 'YV'
do
        sql="select min(yeard), max(yeard), Carrier, count(*) as cnt, sum(ArrDelayMinutes>30) as flights_delayed, round(sum(ArrDelayMinutes>30)/count(*),2) as rate FROM ontime WHERE DayOfWeek not in (6,7) and OriginState not in ('AK', 'HI', 'PR', 'VI') and DestState not in ('AK', 'HI', 'PR', 'VI') and flightdate < '2010-01-01' and carrier = '$c'"
        mysql -uroot -vvv ontime -e "$sql" &>par_sql_complex/$c.log &
done
wait
date

In this case we will also avoid creating a temporary table (as we have an index which starts with carrier).

Results: total time is 5 min 47 seconds (3x faster)

Start: 15:41:02 EST 2013
End:   15:46:49 EST 2013

Per query statistics:

par_sql_complex/9E.log:1 row in set (44.47 sec)
par_sql_complex/AA.log:1 row in set (5 min 41.13 sec)
par_sql_complex/AL.log:1 row in set (15.81 sec)
par_sql_complex/AQ.log:1 row in set (14.52 sec)
par_sql_complex/AS.log:1 row in set (2 min 43.01 sec)
par_sql_complex/B6.log:1 row in set (1 min 26.06 sec)
par_sql_complex/CO.log:1 row in set (3 min 58.07 sec)
par_sql_complex/DH.log:1 row in set (31.30 sec)
par_sql_complex/DL.log:1 row in set (5 min 47.07 sec)
par_sql_complex/EA.log:1 row in set (28.58 sec)
par_sql_complex/EV.log:1 row in set (2 min 6.87 sec)
par_sql_complex/F9.log:1 row in set (46.18 sec)
par_sql_complex/FL.log:1 row in set (1 min 30.83 sec)
par_sql_complex/HA.log:1 row in set (39.42 sec)
par_sql_complex/HP.log:1 row in set (2 min 45.57 sec)
par_sql_complex/ML.log:1 row in set (4.64 sec)
par_sql_complex/MQ.log:1 row in set (2 min 22.55 sec)
par_sql_complex/NW.log:1 row in set (4 min 26.67 sec)
par_sql_complex/OH.log:1 row in set (1 min 9.67 sec)
par_sql_complex/OO.log:1 row in set (2 min 14.97 sec)
par_sql_complex/PA.log:1 row in set (17.62 sec)
par_sql_complex/PI.log:1 row in set (14.52 sec)
par_sql_complex/PS.log:1 row in set (3.46 sec)
par_sql_complex/RU.log:1 row in set (40.14 sec)
par_sql_complex/TW.log:1 row in set (2 min 32.32 sec)
par_sql_complex/TZ.log:1 row in set (14.16 sec)
par_sql_complex/UA.log:1 row in set (4 min 55.18 sec)
par_sql_complex/US.log:1 row in set (4 min 38.08 sec)
par_sql_complex/WN.log:1 row in set (4 min 56.12 sec)
par_sql_complex/XE.log:1 row in set (24.21 sec)
par_sql_complex/YV.log:1 row in set (20.82 sec)

As we can see, there are large airlines (like AA, UA, US, DL, etc.) which took most of the time. In this case the load will not be distributed as evenly as in the previous example; however, by running the queries in parallel we got a 3x better response time on this server.
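The same fan-out/fan-in pattern as the shell loop above can be expressed in application code. Below is a minimal Python sketch using a thread pool; `run_carrier_query` is a hypothetical stand-in for the real per-carrier MySQL call (via a connector library), not part of the original post.

```python
from concurrent.futures import ThreadPoolExecutor

# Shortened carrier list for illustration; the real script uses all 31.
CARRIERS = ['9E', 'AA', 'AL', 'AQ', 'AS', 'B6', 'CO', 'DH', 'DL', 'EA']

def run_carrier_query(carrier):
    # Hypothetical stand-in: in practice this would open a connection and
    # run the per-carrier SELECT shown above, returning its single row.
    return {'carrier': carrier, 'cnt': 0}

# Fan out one query per carrier; map() collects results in submission order,
# which plays the role of the shell script's "wait".
with ThreadPoolExecutor(max_workers=len(CARRIERS)) as pool:
    results = list(pool.map(run_carrier_query, CARRIERS))
```

The thread pool bounds concurrency the same way backgrounding one `mysql` client per carrier does, without spawning a process per query.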

CPU utilization:

Cpu3  : 22.0%us, 1.2%sy, 0.0%ni, 74.4%id, 2.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4  : 16.0%us, 0.0%sy, 0.0%ni, 84.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5  : 39.0%us, 1.2%sy, 0.0%ni, 56.1%id, 3.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6  : 33.3%us, 0.0%sy, 0.0%ni, 51.9%id, 13.6%wa, 0.0%hi, 1.2%si, 0.0%st
Cpu7  : 33.3%us, 1.2%sy, 0.0%ni, 48.8%id, 16.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8  : 24.7%us, 0.0%sy, 0.0%ni, 60.5%id, 14.8%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9  : 24.4%us, 0.0%sy, 0.0%ni, 56.1%id, 19.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 40.7%us, 0.0%sy, 0.0%ni, 56.8%id, 2.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 19.5%us, 1.2%sy, 0.0%ni, 65.9%id, 12.2%wa, 0.0%hi, 1.2%si, 0.0%st
Cpu12 : 40.2%us, 1.2%sy, 0.0%ni, 56.1%id, 2.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 82.7%us, 0.0%sy, 0.0%ni, 17.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 55.4%us, 0.0%sy, 0.0%ni, 43.4%id, 1.2%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 86.6%us, 0.0%sy, 0.0%ni, 13.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 : 61.0%us, 1.2%sy, 0.0%ni, 37.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 : 29.3%us, 1.2%sy, 0.0%ni, 69.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 : 18.8%us, 0.0%sy, 0.0%ni, 52.5%id, 28.8%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 : 14.3%us, 1.2%sy, 0.0%ni, 57.1%id, 27.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 : 12.3%us, 0.0%sy, 0.0%ni, 59.3%id, 28.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 : 10.7%us, 0.0%sy, 0.0%ni, 76.2%id, 11.9%wa, 0.0%hi, 1.2%si, 0.0%st
Cpu22 :  0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 : 10.8%us, 2.4%sy, 0.0%ni, 71.1%id, 15.7%wa, 0.0%hi, 0.0%si, 0.0%st

Note that in the case of "ORDER BY" we will need to sort the results manually; however, sorting 10-100 rows will be fast.
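Merging the per-query rows and re-sorting them on the client is only a few lines of code. A minimal sketch (the row values below are made up purely for illustration):

```python
# Each parallel query returns a handful of aggregated rows; to honor an
# ORDER BY we concatenate the partial result sets and sort client-side.
partial_results = [
    [{'carrier': 'AA', 'rate': 0.08}, {'carrier': 'DL', 'rate': 0.07}],
    [{'carrier': 'UA', 'rate': 0.09}],
]

# Flatten the per-query lists into one result set.
merged = [row for rows in partial_results for row in rows]

# Equivalent of "ORDER BY rate DESC", done in the application.
merged.sort(key=lambda row: row['rate'], reverse=True)
```

Since each per-carrier query returns a single aggregated row, the client-side sort touches only a few dozen rows and its cost is negligible.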

Conclusion

Splitting a complex report into multiple queries and running it in parallel (asynchronously) can increase performance (3x to 10x in the above example) and will better utilize modern hardware. It is also possible to split the queries between multiple MySQL servers (i.e. MySQL slave servers) to further increase scalability (will require more coding).

The post Increasing slow query performance with the parallel query execution appeared first on MySQL Performance Blog.

The ARCHIVE Storage Engine – does it do what you expect?

January 6, 2014 - 12:00am

Sometimes there is a need for keeping large amounts of old, rarely used data without investing too much in expensive storage. Very often such data doesn't need to be updated anymore, or the intent is to leave it untouched. I sometimes wonder what I should really suggest to our Support customers.

For this purpose, the archive storage engine, added in MySQL 4.1.3, seems perfect as it provides excellent compression and the only DML statement it allows is INSERT. However, does it really work as you would expect?

First of all, it has some serious limitations. Apart from the lack of support for DELETE, REPLACE and UPDATE statements (which may be acceptable for some needs), another one is that it does not allow you to have indexes, although you can have an auto_increment column as either a unique or non-unique index. So a straightforward conversion of your tables to the archive engine will usually not be possible. See the list of features for reference.

But unfortunately, it does not always work as the manual says, even within its described limitations. See the following very simple examples.

Problem I

Does the archive storage engine really ensure uniqueness for a primary or unique key?

mysql> CREATE TABLE `b` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=ARCHIVE;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into b values (null),(null),(null),(null);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from b;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
+----+
4 rows in set (0.01 sec)

mysql> repair table b;
+--------+--------+----------+----------+
| Table  | Op     | Msg_type | Msg_text |
+--------+--------+----------+----------+
| test.b | repair | status   | OK       |
+--------+--------+----------+----------+
1 row in set (0.00 sec)

mysql> insert into b values (null),(null);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from b;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
|  1 |
|  2 |
+----+
6 rows in set (0.01 sec)

mysql> show indexes from b\G
*************************** 1. row ***************************
        Table: b
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id
    Collation: NULL
  Cardinality: NULL
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: NONE
      Comment:
Index_comment:
1 row in set (0.00 sec)

That is really bad – a column being a primary key effectively allows duplicates! And another case exposing the same problem:

mysql> CREATE TABLE `c` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   UNIQUE KEY (`id`)
    -> ) ENGINE=ARCHIVE;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into c values (null),(null),(null);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.01 sec)

mysql> optimize table c;
+--------+----------+----------+----------+
| Table  | Op       | Msg_type | Msg_text |
+--------+----------+----------+----------+
| test.c | optimize | status   | OK       |
+--------+----------+----------+----------+
1 row in set (0.01 sec)

mysql> insert into c values (null);
Query OK, 1 row affected (0.00 sec)

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  1 |
+----+
4 rows in set (0.01 sec)

So even a simple OPTIMIZE TABLE command breaks it completely. After we realize that such an operation made our data bad, we won't be able to easily go back to a different engine without sacrificing uniqueness first:

mysql> alter table c engine=innodb;
ERROR 1062 (23000): ALTER TABLE causes auto_increment resequencing, resulting in duplicate entry '1' for key 'id'

mysql> alter table c drop key id;
ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key

mysql> alter table c drop key id, add key(id);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> alter table c engine=innodb;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

There were already bug reports related to the auto_increment feature being broken, but I have filed a new, more specific bug report about this problem.
————–

Problem II

Are we always able to alter a table to use the archive storage engine, even if it theoretically uses a supported table definition? The auto_increment column issue again…

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.01 sec)

We have the same c table using archive. We can change its engine to something different:

mysql> alter table c engine=innodb;
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from c;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

But in some cases, we can’t set it back to archive!

mysql> alter table c engine=archive;
ERROR 1022 (23000): Can't write; duplicate key in table '#sql-1649_3'

There is an old bug report about that.
————–

Problem III

And yet another weirdness around auto_increment values. It seems normal that databases allow us to insert explicit values into auto_increment columns, even lower than the last inserted maximum, and all other engines (MyISAM, Memory and InnoDB) do that:

mysql> CREATE TABLE ai (a int auto_increment primary key) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into ai values (10);
Query OK, 1 row affected (0.00 sec)

mysql> insert into ai values (1);
Query OK, 1 row affected (0.00 sec)

mysql> select * from ai;
+----+
| a  |
+----+
|  1 |
| 10 |
+----+
2 rows in set (0.00 sec)

But it’s not the case for Archive engine:

mysql> CREATE TABLE aa (a int auto_increment primary key) ENGINE=Archive;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into aa values (10);
Query OK, 1 row affected (0.00 sec)

mysql> insert into aa values (1);
ERROR 1022 (23000): Can't write; duplicate key in table 'aa'

This undocumented behavior was reported here.

Summary

The archive storage engine provides very good compression and is available in all MySQL variants out of the box. However, it does have serious limitations, and in some cases it works unreliably and not as expected.

Related articles worth mentioning here:
http://www.mysqlperformanceblog.com/2006/11/12/trying-archive-storage-engine/
http://www.mysqlperformanceblog.com/2013/02/11/adventures-in-archiving/

In my next blog post, I am going to present simple research on (free) alternatives we do have that can replace the archive storage engine in terms of disk space effectiveness.

The post The ARCHIVE Storage Engine – does it do what you expect? appeared first on MySQL Performance Blog.

Multiple column index vs multiple indexes with MySQL 5.6

January 3, 2014 - 7:10am

A question often comes when talking about indexing: should we use multiple column indexes or multiple indexes on single columns? Peter Zaitsev wrote about it back in 2008 and the conclusion then was that a multiple column index is most often the best solution. But with all the recent optimizer improvements, is there anything different with MySQL 5.6?

Setup

For this test, we will use these 2 tables (same structure as in Peter’s post):

CREATE TABLE t1000merge (
  id int not null auto_increment primary key,
  i int(11) NOT NULL,
  j int(11) NOT NULL,
  val char(10) NOT NULL,
  KEY i (i),
  KEY j (j)
) ENGINE=InnoDB;

CREATE TABLE t1000idx2 (
  id int not null auto_increment primary key,
  i int(11) NOT NULL,
  j int(11) NOT NULL,
  val char(10) NOT NULL,
  KEY ij (i, j)
) ENGINE=InnoDB;

Tables were populated with 1M rows for this test, i and j have 1000 distinct values (independent of each other). The buffer pool is large enough to hold all data and indexes.

We will look at this query on MySQL 5.5.35 and MySQL 5.6.15:

SELECT sum(length(val)) FROM T WHERE j=2 AND i BETWEEN 100 and 200

Why this specific query? With MySQL 5.5, for t1000idx2, the optimizer estimates that the index on (i,j) is not selective enough and it falls back to a full table scan. While for t1000merge, the index on (j) is an obvious good candidate to filter efficiently.

Consequently this query has a better response time on t1000merge (0.01s) than on t1000idx2 (0.45s).

On MySQL 5.6, this query is a good candidate for index condition pushdown (ICP), so we can reasonably hope that response time for t1000idx2 will improve.

ICP: FORCE INDEX to the rescue

Unfortunately the optimizer still prefers the full table scan which gives us the same bad response time:

mysql5.6> EXPLAIN SELECT sum(length(val)) FROM t1000idx2 WHERE j=2 AND i BETWEEN 100 and 200;
+----+-------------+-----------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table     | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-----------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | t1000idx2 | ALL  | ij            | NULL | NULL    | NULL | 1000545 | Using where |
+----+-------------+-----------+------+---------------+------+---------+------+---------+-------------+

And what if we use FORCE INDEX?

mysql5.6> EXPLAIN SELECT sum(length(val)) FROM t1000idx2 FORCE INDEX(ij) WHERE j=2 AND i BETWEEN 100 and 200;
+----+-------------+-----------+-------+---------------+------+---------+------+--------+-----------------------+
| id | select_type | table     | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
+----+-------------+-----------+-------+---------------+------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | t1000idx2 | range | ij            | ij   | 8       | NULL | 188460 | Using index condition |
+----+-------------+-----------+-------+---------------+------+---------+------+--------+-----------------------+

This time ICP is used (see “Using index condition” in the Extra field)!

And the difference in response time is impressive:
- Without FORCE INDEX (full table scan): 0.45s
- With FORCE INDEX (multiple column index + index condition pushdown): 0.04s, a 10x improvement!

Additional thoughts

It is interesting to see that the optimizer fails to find the best execution plan for this simple query. The optimizer trace sheds some light:

mysql> SET optimizer_trace="enabled=on";
mysql> SELECT sum(length(val)) FROM T WHERE j=2 AND i BETWEEN 100 and 200;
mysql> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
[...]
"range_analysis": {
  "table_scan": {
    "rows": 1000545,
    "cost": 202835
  },

This is the estimated cost for a full table scan.
Now we will see how the optimizer estimates the cost of the range scan using the ij index:

[...]
"range_scan_alternatives": [
  {
    "index": "ij",
    "ranges": [
      "100 <= i <= 200"
    ],
    "index_dives_for_eq_ranges": true,
    "rowid_ordered": false,
    "using_mrr": false,
    "index_only": false,
    "rows": 188460,
    "cost": 226153,
    "chosen": false,
    "cause": "cost"
  }
]
[...]

At this stage the optimizer does not know if ICP can be used. This probably explains why the cost of the range scan is overestimated.
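In other words, the optimizer simply compares the two estimated costs and, without factoring in the savings ICP would bring, picks the smaller number. Plugging in the figures from the trace above:

```python
# Estimated costs taken from the optimizer trace shown above.
table_scan_cost = 202835
range_scan_cost = 226153  # overestimated: ICP savings are not factored in yet

# The optimizer keeps the cheaper plan, hence the full table scan
# ("chosen": false, "cause": "cost" for the range alternative).
chosen = 'table_scan' if table_scan_cost <= range_scan_cost else 'range_scan'
```

This is why FORCE INDEX helps here: it removes the full table scan from consideration, and ICP is then applied to the range scan during plan refinement.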

If we look at the optimizer trace for the query with the FORCE INDEX hint, ICP is only detected after the range scan is chosen:

[...]
"refine_plan": [
  {
    "table": "`t1000idx2` FORCE INDEX (`ij`)",
    "pushed_index_condition": "((`t1000idx2`.`j` = 2) and (`t1000idx2`.`i` between 100 and 200))",
    "table_condition_attached": null,
    "access_type": "range"
  }
[...]

Conclusion

Multiple column index vs multiple indexes? Having indexes on single columns often leads the optimizer to use the index_merge access type, which is typically not as good as accessing a single index on multiple columns. MySQL 5.6 makes multiple column indexes more efficient than before with index condition pushdown.

But don’t forget that the optimizer is not perfect: you may have to use index hints to benefit from this feature.

The post Multiple column index vs multiple indexes with MySQL 5.6 appeared first on MySQL Performance Blog.

How to recover table structure from .frm files with MySQL Utilities

January 2, 2014 - 12:00am

Table structures are stored in .frm files and in the InnoDB Data Dictionary. Sometimes, usually in data recovery issues, we need to recover those structures to be able to find the lost data or just to recreate the tables.

There are different ways to do it and we’ve already written about it in this blog. For example, we can use the data recovery tools to recover table structures from InnoDB Dictionary or from the .frm files using a MySQL Server. This blog post will be an update of that last one. I will show you how to easily recover the structure from a .frm file and in some cases even without using a MySQL server. This will make the process faster and easily scriptable.

MySQL Utilities and mysqlfrm

MySQL Utilities are a set of scripts released by Oracle that help us to do some usual DBA tasks in an easier way. They are written in Python and their only dependency is the Python Connector. From the large list of utilities we are going to use mysqlfrm, the tool that will help us to recover the structure.

As usual, an example is worth a thousand words. Let's recover some table structures:

This is the table we have:

CREATE TABLE `new_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `age` tinyint(4) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `name_idx` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

So, let’s try to recover that info from the .frm file and let’s see what we can get:

$ mysqlfrm --diagnostic /usr/local/mysql/data/test/new_table.frm
# WARNING: Cannot generate character set or collation names without the --server option.
[...]
CREATE TABLE `test`.`new_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `age` tinyint(4) NOT NULL,
PRIMARY KEY `PRIMARY` (`id`),
KEY `name_idx` (`name`)
) ENGINE=InnoDB;

A pretty good result.

It is important to mention that this tool has two different ways to do the recovery.

  • The first one spawns a new MySQL instance and runs the structure recovery there, pretty similar to the one PeterZ explained in his blog post. You would need to use --server or --basedir along with --port. It will shut the spawned instance down after the recovery is complete.
  • The second one, used with --diagnostic, reads the .frm file byte by byte to recover all the information possible, but without requiring a MySQL instance. Therefore, this method can be used to recover all possible information from damaged .frm files that even MySQL can't read.

As we can see in the warning of the last example, not all information can be recovered with the second method. For example, the character set or collation can't be recovered without the --server option (first method). Let's see how to use a spawned server to recover the .frm info:

$ mysqlfrm --server=root@127.0.0.1 --port 3307 ./new_table.frm
# Source on 127.0.0.1: ... connected.
# Starting the spawned server on port 3307 ... done.
# Reading .frm files
#
# Reading the new_table.frm file.
#
# CREATE statement for ./new_table.frm:
#
CREATE TABLE `new_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `age` tinyint(4) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `name_idx` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

The tool connects to the MySQL server, gets all the info it needs (basedir and so on) and spawns a new instance on port 3307. It then uses that new instance to recover the info from the .frm file. Fast and easy.

It is worth mentioning that not all the information we could need is stored in those .frm files. There are some things that we won't be able to recover, for example FK constraints and auto_increment sequences.

Conclusion

MySQL Utilities is a very useful set of tools. In this particular case mysqlfrm can be used to recover a large list of table structures from their .frm files, making the process fast and easy to script.

The post How to recover table structure from .frm files with MySQL Utilities appeared first on MySQL Performance Blog.

