In the “MySQL Query tuning 101” video, Alexander Rubin provides an excellent example of when to use a covered index. On slide 25, he takes the query select name from City where CountryCode = 'USA' and District = 'Alaska' and population > 10000 and adds the index cov1(CountryCode, District, population, name) on table City. With Alex’s query tuning experience, making the right index decision is simple, but what about us mere mortals? If a query is more complicated, or simply uses more than one table, how do we know what to do? Every additional index has to be maintained on writes and can slow down INSERT statements, so you need to choose carefully. The “used_columns” array in the EXPLAIN FORMAT=JSON output can help.
Let’s take a more complicated version of the query used in “MySQL Query tuning 101”:
select City.name as city, Country.name as country, group_concat(Language)
from City join CountryLanguage using(CountryCode)
join Country
where City.CountryCode=Country.Code and Continent = 'North America' and District='St George'
group by City.name, Country.Name;
Can we use a covered index here?
A traditional text-based EXPLAIN already shows that it is a pretty good plan:
mysql> explain select City.name as city, Country.name as country, group_concat(Language)
from City join CountryLanguage using(CountryCode)
join Country
where City.CountryCode=Country.Code and Continent = 'North America' and District='St George'
group by City.name, Country.Name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: Country
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
     filtered: 14.29
        Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
   partitions: NULL
         type: ref
possible_keys: CountryCode
          key: CountryCode
      key_len: 3
          ref: world.Country.Code
         rows: 18
     filtered: 10.00
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: CountryLanguage
   partitions: NULL
         type: ref
possible_keys: PRIMARY,CountryCode
          key: CountryCode
      key_len: 3
          ref: world.Country.Code
         rows: 4
     filtered: 100.00
        Extra: Using index
3 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `world`.`City`.`Name` AS `city`,`world`.`Country`.`Name` AS `country`,group_concat(`world`.`CountryLanguage`.`Language` separator ',') AS `group_concat(Language)` from `world`.`City` join `world`.`CountryLanguage` join `world`.`Country` where ((`world`.`City`.`District` = 'St George') and (`world`.`Country`.`Continent` = 'North America') and (`world`.`City`.`CountryCode` = `world`.`Country`.`Code`) and (`world`.`CountryLanguage`.`CountryCode` = `world`.`Country`.`Code`)) group by `world`.`City`.`Name`,`world`.`Country`.`Name`
Can we make it better? Since our topic is covered indexes, let’s consider this possibility.
EXPLAIN FORMAT=JSON will tell us which columns a covered index should include:
mysql> explain format=json select City.name as city, Country.name as country, group_concat(Language)
from City join CountryLanguage using(CountryCode)
join Country
where City.CountryCode=Country.Code and Continent = 'North America' and District='St George'
group by City.name, Country.Name\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "927.92"
    },
<I skipped output for grouping operation and other tables here>
        {
          "table": {
            "table_name": "City",
            "access_type": "ref",
            "possible_keys": [
              "CountryCode"
            ],
            "key": "CountryCode",
            "used_key_parts": [
              "CountryCode"
            ],
            "key_length": "3",
            "ref": [
              "world.Country.Code"
            ],
            "rows_examined_per_scan": 18,
            "rows_produced_per_join": 63,
            "filtered": "10.00",
            "cost_info": {
              "read_cost": "630.74",
              "eval_cost": "12.61",
              "prefix_cost": "810.68",
              "data_read_per_join": "4K"
            },
            "used_columns": [
              "ID",
              "Name",
              "CountryCode",
              "District"
            ],
            "attached_condition": "(`world`.`City`.`District` = 'St George')"
          }
        },
The answer is in the array “used_columns”. It lists the ID (primary key) and all columns which I used in the query:
"used_columns": [
  "ID",
  "Name",
  "CountryCode",
  "District"
],
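For plans with several tables, reading the JSON by eye gets tedious. The following is a minimal Python sketch that walks an EXPLAIN FORMAT=JSON document and collects the used_columns array for every table it finds. The sample document is a hand-trimmed stand-in modeled on the output above; the nesting under "grouping_operation" and "nested_loop" is an assumption, because that part of the real output was skipped, and the helper name collect_used_columns is hypothetical.

```python
import json

# Hand-trimmed sample modeled on the EXPLAIN FORMAT=JSON output above.
# Assumption: the nesting under "grouping_operation"/"nested_loop" is
# illustrative only; the article skipped that part of the real output.
SAMPLE = """
{
  "query_block": {
    "select_id": 1,
    "grouping_operation": {
      "nested_loop": [
        {
          "table": {
            "table_name": "City",
            "access_type": "ref",
            "key": "CountryCode",
            "used_columns": ["ID", "Name", "CountryCode", "District"]
          }
        }
      ]
    }
  }
}
"""


def collect_used_columns(node, found=None):
    """Recursively walk the plan and map table_name -> used_columns."""
    if found is None:
        found = {}
    if isinstance(node, dict):
        table = node.get("table")
        if isinstance(table, dict) and "used_columns" in table:
            found[table.get("table_name", "?")] = table["used_columns"]
        # Keep descending: table nodes can sit at any depth of the plan.
        for value in node.values():
            collect_used_columns(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_used_columns(item, found)
    return found


used = collect_used_columns(json.loads(SAMPLE))
for table_name, columns in used.items():
    print(table_name, columns)
```

Because the walker does not depend on a fixed path through the document, it should tolerate plans with or without grouping, ordering, or subquery wrappers, though that has only been reasoned about here, not checked against every plan shape.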
Now we can try adding a covered index:
mysql> alter table City add index cov(CountryCode, District, Name);
Query OK, 0 rows affected (2.74 sec)
Records: 0  Duplicates: 0  Warnings: 0
EXPLAIN confirms that index-only access (“using_index”: true) is now used:
mysql> explain format=json select City.name as city, Country.name as country, group_concat(Language)
from City join CountryLanguage using(CountryCode)
join Country
where City.CountryCode=Country.Code and Continent = 'North America' and District='St George'
group by City.name, Country.Name\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "296.28"
    },
<I skipped output for grouping operation and other tables here>
        {
          "table": {
            "table_name": "City",
            "access_type": "ref",
            "possible_keys": [
              "CountryCode",
              "cov"
            ],
            "key": "cov",
            "used_key_parts": [
              "CountryCode",
              "District"
            ],
            "key_length": "23",
            "ref": [
              "world.Country.Code",
              "const"
            ],
            "rows_examined_per_scan": 2,
            "rows_produced_per_join": 100,
            "filtered": "100.00",
            "using_index": true,
            "cost_info": {
              "read_cost": "34.65",
              "eval_cost": "20.19",
              "prefix_cost": "108.64",
              "data_read_per_join": "7K"
            },
            "used_columns": [
              "ID",
              "Name",
              "CountryCode",
              "District"
            ]
          }
        },
It also provides such metrics as:
- query_cost – 296.28 for the indexed table against 927.92 (smaller is better)
- rows_examined_per_scan – 2 versus 18 (smaller is better)
- filtered – 100 versus 10 (bigger is better)
- cost_info – read_cost and prefix_cost for the indexed table are smaller than when not indexed, which is better. However, eval_cost and data_read_per_join are bigger. But since we read nine times fewer rows per scan, the overall cost is still lower.
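To put the comparison in numbers, here is a small Python sketch that computes before/after ratios from the values in the two EXPLAIN outputs above. A ratio above 1 means the value dropped once the index was added. The dictionary layout and the function name improvement are illustrative choices, not anything MySQL provides.

```python
# Values copied from the two EXPLAIN FORMAT=JSON outputs for table City.
before = {
    "query_cost": 927.92,
    "rows_examined_per_scan": 18,
    "read_cost": 630.74,
    "prefix_cost": 810.68,
    "eval_cost": 12.61,
}
after = {
    "query_cost": 296.28,
    "rows_examined_per_scan": 2,
    "read_cost": 34.65,
    "prefix_cost": 108.64,
    "eval_cost": 20.19,
}


def improvement(before, after):
    """Return before/after ratios; values above 1 mean the metric shrank."""
    return {name: round(before[name] / after[name], 2) for name in before}


ratios = improvement(before, after)
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

The rows_examined_per_scan ratio is the "nine times" mentioned above, while eval_cost is the one metric here whose ratio falls below 1.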
Conclusion: if the number of columns in the used_columns array is reasonably small, you can use it as a guide for creating a covered index.
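The step from used_columns to the index definition cov(CountryCode, District, Name) can be sketched as follows. This is a simplification under stated assumptions: the table is InnoDB, where secondary indexes already carry the primary key (which would explain why ID is not listed in cov); the caller supplies the equality-filtered columns by reading the WHERE and join conditions; and range conditions, selectivity, and sort order are ignored. The function name suggest_index_columns is hypothetical.

```python
def suggest_index_columns(used_columns, equality_columns, primary_key):
    """Order candidate columns for a covering index.

    Equality-filtered columns go first (in the order given), the remaining
    used columns follow, and primary key columns are left out on the
    assumption that the storage engine appends them to secondary indexes.
    """
    pk = set(primary_key)
    leading = [c for c in equality_columns if c in used_columns and c not in pk]
    trailing = [c for c in used_columns if c not in pk and c not in leading]
    return leading + trailing


# used_columns from the EXPLAIN output; the equality columns come from
# City.CountryCode = Country.Code and District = 'St George'.
columns = suggest_index_columns(
    used_columns=["ID", "Name", "CountryCode", "District"],
    equality_columns=["CountryCode", "District"],
    primary_key=["ID"],
)
print(columns)  # ['CountryCode', 'District', 'Name']
```

Treat the result as a starting point to verify with EXPLAIN, not as a rule that always produces the best index.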
Comments (2)
"used_columns": [
  "ID",
  "Name",
  "CountryCode",
  "District"
],
alter table City add index cov(CountryCode, District, Name);
Does the order of the columns matter when creating a covered index from the used_columns output?
The columns in the “used_columns” member are not listed in an order suited to the index (in this example they simply follow the table definition), so you still need to check your WHERE clauses (also check the query the original was converted to, as shown by SHOW WARNINGS) and follow the leftmost-prefix rule: INDEX(a,b) works for WHERE a=X AND b=Y and for WHERE a=X alone, but not for WHERE b=Y alone.
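The leftmost-prefix behavior of a multi-column index can be illustrated with a short Python sketch. It models equality conditions only and is not how the optimizer is implemented; the function name usable_key_parts is hypothetical. Given the index column order and the set of columns filtered by equality, it returns the leading index columns that a lookup could use.

```python
def usable_key_parts(index_columns, equality_columns):
    """Return the leading index columns matched by equality filters.

    Matching stops at the first index column without a filter, because
    later key parts cannot be used for the lookup once a gap appears.
    """
    filtered = set(equality_columns)
    parts = []
    for column in index_columns:
        if column not in filtered:
            break
        parts.append(column)
    return parts


cov = ["CountryCode", "District", "Name"]

# The order in which conditions are written in WHERE does not matter.
print(usable_key_parts(cov, ["District", "CountryCode"]))

# Filtering on District alone leaves a gap at CountryCode.
print(usable_key_parts(cov, ["District"]))
```

The first call returns the same two columns that appear in used_key_parts in the EXPLAIN output for the cov index; the second returns an empty list.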