Use EXPLAIN FORMAT=JSON to Debug Complex MySQL Queries

EXPLAIN FORMAT=JSON

The regular EXPLAIN command already provides some information about subquery optimization. For example, you can find out if the subquery is dependent or not, and (since version 5.6) if it was materialized:

mysql> explain select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: departments
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: dept_name
      key_len: 42
          ref: NULL
         rows: 9
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: <subquery2>
   partitions: NULL
         type: eq_ref
possible_keys: <auto_key>
          key: <auto_key>
      key_len: 4
          ref: employees.departments.dept_no
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 3. row ***************************
           id: 2
  select_type: MATERIALIZED
        table: dept_manager
   partitions: NULL
         type: ALL
possible_keys: dept_no
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 24
     filtered: 90.00
        Extra: Using where
3 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

mysql> explain select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: departments

partitions: NULL

type: index

possible_keys: PRIMARY

key: dept_name

key_len: 42

ref: NULL

rows: 9

filtered: 100.00

Extra: Using where; Using index

*************************** 2. row ***************************

id: 1

select_type: SIMPLE

table: <subquery2>

partitions: NULL

type: eq_ref

possible_keys: <auto_key>

key: <auto_key>

key_len: 4

ref: employees.departments.dept_no

rows: 1

filtered: 100.00

Extra: NULL

*************************** 3. row ***************************

id: 2

select_type: MATERIALIZED

table: dept_manager

partitions: NULL

type: ALL

possible_keys: dept_no

key: NULL

key_len: NULL

ref: NULL

rows: 24

filtered: 90.00

Extra: Using where

3 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

However, you can’t find details on exactly how this subquery was materialized. To find out more, use EXPLAIN FORMAT=JSON:

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "16.72"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "departments",
          <skipped>
      },
      {
        "table": {
          "table_name": "<subquery2>",
          "access_type": "eq_ref",
          "key": "<auto_key>",
          "key_length": "4",
          "ref": [
            "employees.departments.dept_no"
          ],
          "rows_examined_per_scan": 1,
          "materialized_from_subquery": {
            "using_temporary_table": true,
            "query_block": {
              "table": {
                "table_name": "dept_manager",
                "access_type": "ALL",
                "possible_keys": [
                  "dept_no"
                ],
                "rows_examined_per_scan": 24,
                "rows_produced_per_join": 21,
                "filtered": "90.00",
                "cost_info": {
                  "read_cost": "1.48",
                  "eval_cost": "4.32",
                  "prefix_cost": "5.80",
                  "data_read_per_join": "345"
                },
                "used_columns": [
                  "dept_no",
                  "to_date"
                ],
                "attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"
              }
            }
          }
        }
      }
    ]
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G

*************************** 1. row ***************************

EXPLAIN: {

"query_block": {

"select_id": 1,

"cost_info": {

"query_cost": "16.72"

"nested_loop": [

{

"table": {

"table_name": "departments",

{

"table": {

"table_name": "<subquery2>",

"access_type": "eq_ref",

"key": "<auto_key>",

"key_length": "4",

"ref": [

"employees.departments.dept_no"

"rows_examined_per_scan": 1,

"materialized_from_subquery": {

"using_temporary_table": true,

"query_block": {

"table": {

"table_name": "dept_manager",

"access_type": "ALL",

"possible_keys": [

"dept_no"

"rows_examined_per_scan": 24,

"rows_produced_per_join": 21,

"filtered": "90.00",

"cost_info": {

"read_cost": "1.48",

"eval_cost": "4.32",

"prefix_cost": "5.80",

"data_read_per_join": "345"

"used_columns": [

"dept_no",

"to_date"

"attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"

}

]

}

1 row in set, 1 warning (0.00 sec)

We can see here that the new temporary table ( "using_temporary_table": true) with internal name <subquery2> was materilized from the subquery ( materialized_from_subquery). We also see how this new temporary table was accessed.

We also can see that the member that contains information about the materialized subquery is part of the nested_loop array. Nested loop optimization applies to a regular JOIN command. In this case, it means that the MySQL Server would join the temporary table with an outer query.

Another interesting case is optimized_away_subqueries: this command shows subqueries that were executed only once and were replaced by their result. A regular EXPLAIN command provides no information about such optimization:

mysql> explain select emp_no, salary from salaries order by (select max(salary) from salaries)G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838525
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838525
     filtered: 100.00
        Extra: NULL
2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no`,`employees`.`salaries`.`salary` AS `salary` from `employees`.`salaries` order by (/* select#2 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)

mysql> explain select emp_no, salary from salaries order by (select max(salary) from salaries)G

*************************** 1. row ***************************

id: 1

select_type: PRIMARY

table: salaries

partitions: NULL

type: ALL

possible_keys: NULL

key: NULL

key_len: NULL

ref: NULL

rows: 2838525

filtered: 100.00

Extra: NULL

*************************** 2. row ***************************

id: 2

select_type: SUBQUERY

table: salaries

partitions: NULL

type: ALL

possible_keys: NULL

key: NULL

key_len: NULL

ref: NULL

rows: 2838525

filtered: 100.00

Extra: NULL

2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no`,`employees`.`salaries`.`salary` AS `salary` from `employees`.`salaries` order by (/* select#2 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)

We can see that in select_type: SUBQUERY, the output of SHOW WARNINGS doesn’t give us much insight either.

However, EXPLAIN FORMAT=JSON tells us how this subquery was optimized:

mysql> explain format=json select emp_no, salary from salaries order by (select max(salary) from salaries)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "573505.00"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "salaries",
        
        <skipped>
        
      "optimized_away_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "573505.00"
            },
            "table": {
              "table_name": "salaries",
              "access_type": "ALL",
              "rows_examined_per_scan": 2838525,
              "rows_produced_per_join": 2838525,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "5800.00",
                "eval_cost": "567705.00",
                "prefix_cost": "573505.00",
                "data_read_per_join": "43M"
              },
              "used_columns": [
                "salary"
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

mysql> explain format=json select emp_no, salary from salaries order by (select max(salary) from salaries)G

*************************** 1. row ***************************

EXPLAIN: {

"query_block": {

"select_id": 1,

"cost_info": {

"query_cost": "573505.00"

"ordering_operation": {

"using_filesort": false,

"table": {

"table_name": "salaries",

"optimized_away_subqueries": [

{

"dependent": false,

"cacheable": true,

"query_block": {

"select_id": 2,

"cost_info": {

"query_cost": "573505.00"

"table": {

"table_name": "salaries",

"access_type": "ALL",

"rows_examined_per_scan": 2838525,

"rows_produced_per_join": 2838525,

"filtered": "100.00",

"cost_info": {

"read_cost": "5800.00",

"eval_cost": "567705.00",

"prefix_cost": "573505.00",

"data_read_per_join": "43M"

"used_columns": [

"salary"

]

}

]

}

1 row in set, 1 warning (0.00 sec)

After seeing optimized_away_subqueries and "cacheable": true, we know that the subquery was executed only once, and the result was cached and used to resolve the outer query.

Another subquery type that I want to mention in this post is attached_subqueries – which covers subqueries that are not converted to JOIN.

mysql> explain format=json select emp_no from salaries where salary > (select avg(salary) from salaries)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "250601.60"
    },
    "table": {
      "table_name": "salaries",
      "access_type": "ALL",
      "rows_examined_per_scan": 1240668,
      "rows_produced_per_join": 413514,
      "filtered": "33.33",
      "cost_info": {
        "read_cost": "167898.67",
        "eval_cost": "82702.92",
        "prefix_cost": "250601.60",
        "data_read_per_join": "6M"
      },
      "used_columns": [
        "emp_no",
        "salary"
      ],
      "attached_condition": "(`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",
      "attached_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "250601.80"
            },
            "table": {
              "table_name": "salaries",
              "access_type": "ALL",
              "rows_examined_per_scan": 1240669,
              "rows_produced_per_join": 1240669,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "2468.00",
                "eval_cost": "248133.80",
                "prefix_cost": "250601.80",
                "data_read_per_join": "18M"
              },
              "used_columns": [
                "salary"
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no` from `employees`.`salaries` where (`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))

mysql> explain format=json select emp_no from salaries where salary > (select avg(salary) from salaries)G

*************************** 1. row ***************************

EXPLAIN: {

"query_block": {

"select_id": 1,

"cost_info": {

"query_cost": "250601.60"

"table": {

"table_name": "salaries",

"access_type": "ALL",

"rows_examined_per_scan": 1240668,

"rows_produced_per_join": 413514,

"filtered": "33.33",

"cost_info": {

"read_cost": "167898.67",

"eval_cost": "82702.92",

"prefix_cost": "250601.60",

"data_read_per_join": "6M"

"used_columns": [

"emp_no",

"salary"

"attached_condition": "(`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",

"attached_subqueries": [

{

"dependent": false,

"cacheable": true,

"query_block": {

"select_id": 2,

"cost_info": {

"query_cost": "250601.80"

"table": {

"table_name": "salaries",

"access_type": "ALL",

"rows_examined_per_scan": 1240669,

"rows_produced_per_join": 1240669,

"filtered": "100.00",

"cost_info": {

"read_cost": "2468.00",

"eval_cost": "248133.80",

"prefix_cost": "250601.80",

"data_read_per_join": "18M"

"used_columns": [

"salary"

]

}

]

}

1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no` from `employees`.`salaries` where (`employees`.`salaries`.`salary` > (/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))

For these subqueries, using this option shows if they are dependent and/or cacheable. There is also a query_block member inside the attached_subqueries object which provides all the details on how the query was optimized.

Conclusion: EXPLAIN FORMAT=JSON provides some great details on subqueries optimization.

2 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Leonid Sokolovskiy

7 years ago

Sveta, could you point to a location where I can read a bit more about json format.
Other question – are there tools which helps to visualise json plan. I came from SQL Server background and there are number of great free tools to visualise and provides stats about execution plan.
One more – probably very ignorant question :). Am I right that produced plan is actually estimated and not the actual plan?
In SQL Server you can get both but actual is much more useful because you get the real and not estimated numbers

Lorraine Pocklington, Community Manager

Reply to Leonid Sokolovskiy

7 years ago

— I’m answering on behalf of Sveta while we sort out commenting on these older posts.
— Bear with us!
Hi Leonid,

> Sveta, could you point to a location where I can read a bit more about json format.

JSON format is defined at json.org

> Other question – are there tools which helps to visualise json plan. I came from SQL Server background and there are number of great free tools to visualise and provides stats about execution plan.

MySQL Workbench (https://www.mysql.com/products/workbench/) has visual EXPLAIN feature and uses EXPLAIN FORMAT=JSON as underlying source for it.

> One more – probably very ignorant question :). Am I right that produced plan is actually estimated and not the actual plan?
> In SQL Server you can get both but actual is much more useful because you get the real and not estimated numbers

Exactly: only estimates. If you want to find information about actual produced plan you have few options:

– Use table optimizer_trace in the information_schema: https://dev.mysql.com/doc/refman/5.7/en/optimizer-trace-table.html
– Use EXPLAIN FOR CONNECTION: https://dev.mysql.com/doc/refman/5.7/en/explain.html While this is still “normal” EXPLAIN once applied to the running query it may show actually executed plan
– Use ANALYZE in MariaDB: https://mariadb.com/kb/en/library/analyze-statement/

You can also examine values of status variables those names are starting with ‘Handler_’.

You may find some information in this webinar: https://www.percona.com/blog/2016/05/20/introduction-troubleshooting-performance-troubleshooting-slow-queries-webinar-q/

MySQL 5.7
Support

Compare Percona to Leading Database Solutions

Software
Downloads

Valkey Contribution

Product Documentation

Resource Hub

Why Percona for MongoDB?

Why Percona for PostgreSQL?

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

EXPLAIN FORMAT=JSON: everything about attached_subqueries, optimized_away_subqueries, materialized_from_subquery

EXPLAIN FORMAT=JSON

Related Blog Articles

RECOMMENDED ARTICLES

Rebuilding a Replica with MyDumper

Databases, Data Lakes, And Encryption

Automatic “Multi-Source” Async Replication Failover Using PXC Replication Manager

MOST POPULAR ARTICLES

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL Performance Tuning: Maximizing Database Efficiency and Speed

The Ultimate Guide to Open Source Databases

MySQL 5.7 Support

Compare Percona to Leading Database Solutions

Software Downloads

Valkey Contribution

Product Documentation

Resource Hub

Why Percona for MongoDB?

Why Percona for PostgreSQL?

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

EXPLAIN FORMAT=JSON: everything about attached_subqueries, optimized_away_subqueries, materialized_from_subquery

EXPLAIN FORMAT=JSON

About the Author

Share This Post!

Stay up to date with the Percona Blog

Related Blog Articles

RECOMMENDED ARTICLES

Rebuilding a Replica with MyDumper

Databases, Data Lakes, And Encryption

Automatic “Multi-Source” Async Replication Failover Using PXC Replication Manager

MOST POPULAR ARTICLES

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL Performance Tuning: Maximizing Database Efficiency and Speed

The Ultimate Guide to Open Source Databases

MySQL 5.7
Support

Software
Downloads