Buy Percona ServicesBuy Now!

grouping_operation, duplicates_removal: EXPLAIN FORMAT=JSON has all details about GROUP BY

 | January 4, 2016 |  Posted In: EXPLAIN, EXPLAIN FORMAT=JSON is cool!, JSON, MySQL


EXPLAIN FORMAT=JSONIn the previous EXPLAIN FORMAT=JSON is Cool! series blog post, we discussed the   group_by_subqueries  member (which is child of grouping_operation). Let’s now focus on the  grouping_operation  and other details of  GROUP BY  processing.

grouping_operation simply shows the details of what happens when the  GROUP BY clause is run:

In the listing above, you can see which table was accessed by the  GROUP BY operation, the access type, and if an index for GROUP BY was used.

In case of a simple JOIN  of two tables,  grouping_operation is usually a parent for the  nested_loop  object (which provides details on how the  JOIN  proceeded):

Surprisingly, while many DISTINCT queries can be converted into equivalent queries with the  GROUP BY clause, there is separate member ( duplicates_removal) for processing it. Let’s see how it works with a simple query that performs the same job as the first one in this blog post:

You can see that the plan is almost same, but parent element for the plan is duplicates_removal.

The reason there are differences between these members can be seen if we change the second, more complicated query to use DISTINCT in place of GROUP BY:

This example shows that  DISTINCT is not exactly same as GROUP BY, and can be used together  if we want to count the number of managers in each department (grouped by the year when the manager started working in the department). In this case, however, we are interested only in unique pairs of such dates and don’t want to see duplicates. Duplicates will appear if one person managed same department more than two years.

In this case, the member grouping_operation is a child of duplicates_removal and the temporary table used to store the result of GROUP BY  before removing the duplicates. A temporary table was also used to perform a filesort for the grouping operation itself.

Compare this with regular EXPLAIN output. EXPLAIN only shows that a temporary table was used, but does not provide insights on the operations for which it was used:

Conclusion:  EXPLAIN FORMAT=JSON contains all the details about the  GROUP BY and DISTINCT  optimizations.

Sveta Smirnova

Sveta joined Percona in 2015. Her main professional interests are problem solving, working with tricky issues, bugs, finding patterns that can solve typical issues quicker and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group in MySQL AB-Sun-Oracle. She is the author of the book "MySQL Troubleshooting" and JSON UDF functions for MySQL.

Leave a Reply