Ready for another post in the EXPLAIN FORMAT=JSON is Cool series! Great! This post will discuss how to see all the information that is contained in optimized queries with
UNION using the
union_result and
query_specifications commands.
When optimizing complicated queries with UNION, it is easy to get lost in the regular EXPLAIN output trying to identify which part of the output belongs to each part of the UNION.
Let’s consider the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 |
mysql> explain -> select emp_no, last_name, 'low_salary' from employees -> where emp_no in (select emp_no from salaries -> where salary < (select avg(salary) from salaries)) -> union -> select emp_no, last_name, 'high salary' from employees -> where emp_no in (select emp_no from salaries -> where salary >= (select avg(salary) from salaries))G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: employees partitions: NULL type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 299778 filtered: 100.00 Extra: NULL *************************** 2. row *************************** id: 1 select_type: PRIMARY table: salaries partitions: NULL type: ref possible_keys: PRIMARY,emp_no key: PRIMARY key_len: 4 ref: employees.employees.emp_no rows: 9 filtered: 33.33 Extra: Using where; FirstMatch(employees) *************************** 3. row *************************** id: 3 select_type: SUBQUERY table: salaries partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 2557022 filtered: 100.00 Extra: NULL *************************** 4. row *************************** id: 4 select_type: UNION table: employees partitions: NULL type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 299778 filtered: 100.00 Extra: NULL *************************** 5. row *************************** id: 4 select_type: UNION table: salaries partitions: NULL type: ref possible_keys: PRIMARY,emp_no key: PRIMARY key_len: 4 ref: employees.employees.emp_no rows: 9 filtered: 33.33 Extra: Using where; FirstMatch(employees) *************************** 6. row *************************** id: 6 select_type: SUBQUERY table: salaries partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 2557022 filtered: 100.00 Extra: NULL *************************** 7. row *************************** id: NULL select_type: UNION RESULT table: <union1,4> partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL filtered: NULL Extra: Using temporary 7 rows in set, 1 warning (0.00 sec) Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'low_salary' AS `low_salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) union /* select#4 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'high salary' AS `high salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) |
While we can guess that subquery 3 belongs to the first query of the union, and subquery 6 belongs to the second (which has number 4 for some reason), we have to be very careful (especially in our case) when queries use the same tables in both parts of the UNION.
The main issue with the regular EXPLAIN for UNION is that it has to re-present the hierarchical structure as a table. The same issue occurs when you want to store objects created in programming language, such as Java, in the database.
EXPLAIN FORMAT=JSON, on the other hand, has hierarchical structure and more clearly displays how UNION was optimized:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 |
mysql> explain format=json select emp_no, last_name, 'low_salary' from employees where emp_no in (select emp_no from salaries where salary < (select avg(salary) from salaries)) union select emp_no, last_name, 'high salary' from employees where emp_no in (select emp_no from salaries where salary >= (select avg(salary) from salaries))G *************************** 1. row *************************** EXPLAIN: { "query_block": { "union_result": { "using_temporary_table": true, "table_name": "<union1,4>", "access_type": "ALL", "query_specifications": [ { "dependent": false, "cacheable": true, "query_block": { "select_id": 1, "cost_info": { "query_cost": "921684.48" }, "nested_loop": [ { "table": { "table_name": "employees", "access_type": "ALL", "possible_keys": [ "PRIMARY" ], "rows_examined_per_scan": 299778, "rows_produced_per_join": 299778, "filtered": "100.00", "cost_info": { "read_cost": "929.00", "eval_cost": "59955.60", "prefix_cost": "60884.60", "data_read_per_join": "13M" }, "used_columns": [ "emp_no", "last_name" ] } }, { "table": { "table_name": "salaries", "access_type": "ref", "possible_keys": [ "PRIMARY", "emp_no" ], "key": "PRIMARY", "used_key_parts": [ "emp_no" ], "key_length": "4", "ref": [ "employees.employees.emp_no" ], "rows_examined_per_scan": 9, "rows_produced_per_join": 299778, "filtered": "33.33", "first_match": "employees", "cost_info": { "read_cost": "302445.97", "eval_cost": "59955.60", "prefix_cost": "921684.48", "data_read_per_join": "4M" }, "used_columns": [ "emp_no", "salary" ], "attached_condition": "(`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))", "attached_subqueries": [ { "dependent": false, "cacheable": true, "query_block": { "select_id": 3, "cost_info": { "query_cost": "516948.40" }, "table": { "table_name": "salaries", "access_type": "ALL", "rows_examined_per_scan": 2557022, "rows_produced_per_join": 2557022, "filtered": "100.00", "cost_info": { "read_cost": "5544.00", "eval_cost": "511404.40", "prefix_cost": "516948.40", "data_read_per_join": "39M" }, "used_columns": [ "salary" ] } } } ] } } ] } }, { "dependent": false, "cacheable": true, "query_block": { "select_id": 4, "cost_info": { "query_cost": "921684.48" }, "nested_loop": [ { "table": { "table_name": "employees", "access_type": "ALL", "possible_keys": [ "PRIMARY" ], "rows_examined_per_scan": 299778, "rows_produced_per_join": 299778, "filtered": "100.00", "cost_info": { "read_cost": "929.00", "eval_cost": "59955.60", "prefix_cost": "60884.60", "data_read_per_join": "13M" }, "used_columns": [ "emp_no", "last_name" ] } }, { "table": { "table_name": "salaries", "access_type": "ref", "possible_keys": [ "PRIMARY", "emp_no" ], "key": "PRIMARY", "used_key_parts": [ "emp_no" ], "key_length": "4", "ref": [ "employees.employees.emp_no" ], "rows_examined_per_scan": 9, "rows_produced_per_join": 299778, "filtered": "33.33", "first_match": "employees", "cost_info": { "read_cost": "302445.97", "eval_cost": "59955.60", "prefix_cost": "921684.48", "data_read_per_join": "4M" }, "used_columns": [ "emp_no", "salary" ], "attached_condition": "(`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))", "attached_subqueries": [ { "dependent": false, "cacheable": true, "query_block": { "select_id": 6, "cost_info": { "query_cost": "516948.40" }, "table": { "table_name": "salaries", "access_type": "ALL", "rows_examined_per_scan": 2557022, "rows_produced_per_join": 2557022, "filtered": "100.00", "cost_info": { "read_cost": "5544.00", "eval_cost": "511404.40", "prefix_cost": "516948.40", "data_read_per_join": "39M" }, "used_columns": [ "salary" ] } } } ] } } ] } } ] } } } 1 row in set, 1 warning (0.00 sec) Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'low_salary' AS `low_salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) union /* select#4 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'high salary' AS `high salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) |
First it puts member union_result in the query_block at the very top level:
1 2 3 |
EXPLAIN: { "query_block": { "union_result": { |
The union_result object contains information about how the result set of the UNION was processed:
1 2 3 |
"using_temporary_table": true, "table_name": "<union1,4>", "access_type": "ALL", |
And also contains the query_specifications array which also contains all the details about queries in the UNION:
1 2 3 4 5 6 7 8 9 10 11 12 |
"query_specifications": [ { "dependent": false, "cacheable": true, "query_block": { "select_id": 1, <skipped> { "dependent": false, "cacheable": true, "query_block": { "select_id": 4, |
This representation is much more clear, and also contains all the details which the regular EXPLAIN misses for regular queries.
Conclusion: EXPLAIN FORMAT=JSON not only contains additional optimization information for each query in the UNION, but also has a hierarchical structure that is more suitable for the hierarchical nature of the UNION.