Ready for another post in the EXPLAIN FORMAT=JSON is Cool series! Great! This post will discuss how to see all the information that is contained in optimized queries with
UNION using the
union_result and
query_specifications commands.
When optimizing complicated queries with UNION, it is easy to get lost in the regular EXPLAIN output trying to identify which part of the output belongs to each part of the UNION.
Let’s consider the following example:
|
1 |
mysql> explain <br> -> select emp_no, last_name, 'low_salary' from employees <br> -> where emp_no in (select emp_no from salaries <br> -> where salary < (select avg(salary) from salaries)) <br> -> union <br> -> select emp_no, last_name, 'high salary' from employees <br> -> where emp_no in (select emp_no from salaries <br> -> where salary >= (select avg(salary) from salaries))G<br>*************************** 1. row ***************************<br> id: 1<br> select_type: PRIMARY<br> table: employees<br> partitions: NULL<br> type: ALL<br>possible_keys: PRIMARY<br> key: NULL<br> key_len: NULL<br> ref: NULL<br> rows: 299778<br> filtered: 100.00<br> Extra: NULL<br>*************************** 2. row ***************************<br> id: 1<br> select_type: PRIMARY<br> table: salaries<br> partitions: NULL<br> type: ref<br>possible_keys: PRIMARY,emp_no<br> key: PRIMARY<br> key_len: 4<br> ref: employees.employees.emp_no<br> rows: 9<br> filtered: 33.33<br> Extra: Using where; FirstMatch(employees)<br>*************************** 3. row ***************************<br> id: 3<br> select_type: SUBQUERY<br> table: salaries<br> partitions: NULL<br> type: ALL<br>possible_keys: NULL<br> key: NULL<br> key_len: NULL<br> ref: NULL<br> rows: 2557022<br> filtered: 100.00<br> Extra: NULL<br>*************************** 4. row ***************************<br> id: 4<br> select_type: UNION<br> table: employees<br> partitions: NULL<br> type: ALL<br>possible_keys: PRIMARY<br> key: NULL<br> key_len: NULL<br> ref: NULL<br> rows: 299778<br> filtered: 100.00<br> Extra: NULL<br>*************************** 5. row ***************************<br> id: 4<br> select_type: UNION<br> table: salaries<br> partitions: NULL<br> type: ref<br>possible_keys: PRIMARY,emp_no<br> key: PRIMARY<br> key_len: 4<br> ref: employees.employees.emp_no<br> rows: 9<br> filtered: 33.33<br> Extra: Using where; FirstMatch(employees)<br>*************************** 6. row ***************************<br> id: 6<br> select_type: SUBQUERY<br> table: salaries<br> partitions: NULL<br> type: ALL<br>possible_keys: NULL<br> key: NULL<br> key_len: NULL<br> ref: NULL<br> rows: 2557022<br> filtered: 100.00<br> Extra: NULL<br>*************************** 7. row ***************************<br> id: NULL<br> select_type: UNION RESULT<br> table: <union1,4><br> partitions: NULL<br> type: ALL<br>possible_keys: NULL<br> key: NULL<br> key_len: NULL<br> ref: NULL<br> rows: NULL<br> filtered: NULL<br> Extra: Using temporary<br>7 rows in set, 1 warning (0.00 sec)<br><br>Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'low_salary' AS `low_salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) union /* select#4 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'high salary' AS `high salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) |
While we can guess that subquery 3 belongs to the first query of the union, and subquery 6 belongs to the second (which has number 4 for some reason), we have to be very careful (especially in our case) when queries use the same tables in both parts of the UNION.
The main issue with the regular EXPLAIN for UNION is that it has to re-present the hierarchical structure as a table. The same issue occurs when you want to store objects created in programming language, such as Java, in the database.
EXPLAIN FORMAT=JSON, on the other hand, has hierarchical structure and more clearly displays how UNION was optimized:
|
1 |
mysql> explain format=json select emp_no, last_name, 'low_salary' from employees where emp_no in (select emp_no from salaries where salary < (select avg(salary) from salaries)) union select emp_no, last_name, 'high salary' from employees where emp_no in (select emp_no from salaries where salary >= (select avg(salary) from salaries))G<br>*************************** 1. row ***************************<br>EXPLAIN: {<br> "query_block": {<br> "union_result": {<br> "using_temporary_table": true,<br> "table_name": "<union1,4>",<br> "access_type": "ALL",<br> "query_specifications": [<br> {<br> "dependent": false,<br> "cacheable": true,<br> "query_block": {<br> "select_id": 1,<br> "cost_info": {<br> "query_cost": "921684.48"<br> },<br> "nested_loop": [<br> {<br> "table": {<br> "table_name": "employees",<br> "access_type": "ALL",<br> "possible_keys": [<br> "PRIMARY"<br> ],<br> "rows_examined_per_scan": 299778,<br> "rows_produced_per_join": 299778,<br> "filtered": "100.00",<br> "cost_info": {<br> "read_cost": "929.00",<br> "eval_cost": "59955.60",<br> "prefix_cost": "60884.60",<br> "data_read_per_join": "13M"<br> },<br> "used_columns": [<br> "emp_no",<br> "last_name"<br> ]<br> }<br> },<br> {<br> "table": {<br> "table_name": "salaries",<br> "access_type": "ref",<br> "possible_keys": [<br> "PRIMARY",<br> "emp_no"<br> ],<br> "key": "PRIMARY",<br> "used_key_parts": [<br> "emp_no"<br> ],<br> "key_length": "4",<br> "ref": [<br> "employees.employees.emp_no"<br> ],<br> "rows_examined_per_scan": 9,<br> "rows_produced_per_join": 299778,<br> "filtered": "33.33",<br> "first_match": "employees",<br> "cost_info": {<br> "read_cost": "302445.97",<br> "eval_cost": "59955.60",<br> "prefix_cost": "921684.48",<br> "data_read_per_join": "4M"<br> },<br> "used_columns": [<br> "emp_no",<br> "salary"<br> ],<br> "attached_condition": "(`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",<br> "attached_subqueries": [<br> {<br> "dependent": false,<br> "cacheable": true,<br> "query_block": {<br> "select_id": 3,<br> "cost_info": {<br> "query_cost": "516948.40"<br> },<br> "table": {<br> "table_name": "salaries",<br> "access_type": "ALL",<br> "rows_examined_per_scan": 2557022,<br> "rows_produced_per_join": 2557022,<br> "filtered": "100.00",<br> "cost_info": {<br> "read_cost": "5544.00",<br> "eval_cost": "511404.40",<br> "prefix_cost": "516948.40",<br> "data_read_per_join": "39M"<br> },<br> "used_columns": [<br> "salary"<br> ]<br> }<br> }<br> }<br> ]<br> }<br> }<br> ]<br> }<br> },<br> {<br> "dependent": false,<br> "cacheable": true,<br> "query_block": {<br> "select_id": 4,<br> "cost_info": {<br> "query_cost": "921684.48"<br> },<br> "nested_loop": [<br> {<br> "table": {<br> "table_name": "employees",<br> "access_type": "ALL",<br> "possible_keys": [<br> "PRIMARY"<br> ],<br> "rows_examined_per_scan": 299778,<br> "rows_produced_per_join": 299778,<br> "filtered": "100.00",<br> "cost_info": {<br> "read_cost": "929.00",<br> "eval_cost": "59955.60",<br> "prefix_cost": "60884.60",<br> "data_read_per_join": "13M"<br> },<br> "used_columns": [<br> "emp_no",<br> "last_name"<br> ]<br> }<br> },<br> {<br> "table": {<br> "table_name": "salaries",<br> "access_type": "ref",<br> "possible_keys": [<br> "PRIMARY",<br> "emp_no"<br> ],<br> "key": "PRIMARY",<br> "used_key_parts": [<br> "emp_no"<br> ],<br> "key_length": "4",<br> "ref": [<br> "employees.employees.emp_no"<br> ],<br> "rows_examined_per_scan": 9,<br> "rows_produced_per_join": 299778,<br> "filtered": "33.33",<br> "first_match": "employees",<br> "cost_info": {<br> "read_cost": "302445.97",<br> "eval_cost": "59955.60",<br> "prefix_cost": "921684.48",<br> "data_read_per_join": "4M"<br> },<br> "used_columns": [<br> "emp_no",<br> "salary"<br> ],<br> "attached_condition": "(`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",<br> "attached_subqueries": [<br> {<br> "dependent": false,<br> "cacheable": true,<br> "query_block": {<br> "select_id": 6,<br> "cost_info": {<br> "query_cost": "516948.40"<br> },<br> "table": {<br> "table_name": "salaries",<br> "access_type": "ALL",<br> "rows_examined_per_scan": 2557022,<br> "rows_produced_per_join": 2557022,<br> "filtered": "100.00",<br> "cost_info": {<br> "read_cost": "5544.00",<br> "eval_cost": "511404.40",<br> "prefix_cost": "516948.40",<br> "data_read_per_join": "39M"<br> },<br> "used_columns": [<br> "salary"<br> ]<br> }<br> }<br> }<br> ]<br> }<br> }<br> ]<br> }<br> }<br> ]<br> }<br> }<br>}<br>1 row in set, 1 warning (0.00 sec)<br><br>Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'low_salary' AS `low_salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) union /* select#4 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'high salary' AS `high salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) |
First it puts member union_result in the query_block at the very top level:
|
1 |
EXPLAIN: {<br> "query_block": {<br> "union_result": { |
The union_result object contains information about how the result set of the UNION was processed:
|
1 |
"using_temporary_table": true,<br> "table_name": "<union1,4>",<br> "access_type": "ALL", |
And also contains the query_specifications array which also contains all the details about queries in the UNION:
|
1 |
"query_specifications": [<br> {<br> "dependent": false,<br> "cacheable": true,<br> "query_block": {<br> "select_id": 1,<br><skipped><br> {<br> "dependent": false,<br> "cacheable": true,<br> "query_block": {<br> "select_id": 4, |
This representation is much more clear, and also contains all the details which the regular EXPLAIN misses for regular queries.
Conclusion: EXPLAIN FORMAT=JSON not only contains additional optimization information for each query in the UNION, but also has a hierarchical structure that is more suitable for the hierarchical nature of the UNION.
Resources
RELATED POSTS