Introduction to MySQL 8.0 Common Table Expressions (Part 1)

MySQL Common Table ExpressionsThis blog is the first part of a two-articles series. In this article, I’m going to introduce the Common Table Expression (CTE), a new feature available on MySQL 8.0, as well as Percona Server for MySQL 8.

What is a Common Table Expression?

We can define a CTE as an alternative to a derived table. In a small way, CTE simplifies complex joins and subqueries, improving the readability of the queries. CTE is part of ANSI SQL 99 and was introduced in MySQL 8.0.1. The same feature is available even on Percona Server for MySQL 8.0.

The main reasons for using CTE are:

  • Better readability of the queries
  • Can be referenced multiple times in the same query
  • Improved performance
  • A valid alternative to a VIEW, if your user cannot create VIEWs
  • Easier chaining of multiple CTE
  • Possibility to create recursive queries: this can be really useful when dealing with hierarchical data

SELECT, UPDATE, and DELETE statements can reference the CTE.

Note: for the examples in this article, I’ll use the world database you can download from the MySQL site.

 

How to Create and Use a CTE

We have said that a CTE is like a derived table when using a subquery, but the declaration is moved before the main query. A new dedicated clause is needed: WITH.

Let’s start with a subquery with a derived table to find the most populated countries in Europe:

Let’s rewrite it using the CTE instead:

The syntax is quite simple. Before your query, using WITH, you can define the CTE or even multiple CTEs. After that, you can reference in the query all the CTEs as many times as you need. You can think about a CTE as a sort pre-materialized query available as a temporary table for the scope of your main query.

We can also specify the column names if a parenthesized list of names follows the CTE name:

CTE can also be used as the data source for updating other tables as in the following examples:

CTE can also be used for INSERT … SELECT queries like the following:

Let’s think about a CTE like a temporary table pre-calculated or materialized before the main query. Then you can reference that temporary table as many times you need in your query.

Also, you can create multiple CTEs, and all of them can be used in the main query. The syntax is like the following:

 

Scope of CTE

A CTE can be used even in subqueries, but in such a case, be aware of the scope. The CTE exists for the scope of a single statement.

Consider the following valid queries:

To avoid any trouble with the scope, the best way to use CTEs is by creating all of them at the beginning of the top query. This way, all CTEs can be used wherever you need, multiple times.

 

Chaining multiple CTEs

When creating multiple CTEs for a query, another interesting feature is the chaining. It is possible to define any CTE containing one or more references to previous CTEs in the chain.

The following example shows how the chaining can be used. We would like to find out the countries with the highest and lowest population density in the world. We create a chain of three CTEs. The last two contain a reference to the first one.

 

Let’s think now how you can rewrite the same query using derived tables instead. You would need to copy (several times) the definition of density_by_country. The final query would be really very large and probably less readable.

 

Use CTE Instead of VIEW

It could happen that your database user doesn’t have the right to create a VIEW. A CTE doesn’t require specific grants apart from the capability to read table and columns, the same for any regular SELECT.

Then you can use CTE instead of a VIEW. Apart from the grants options, a CTE has generally better performance than a VIEW, as we’ll show later.

Let’s create a view and run a query using it.

 

Let’s try now to rewrite the same query using the CTE.

 

Taking a look at the two execution plans, we can notice that with the query, there are DERIVED stages. The materialization of the VIEW is needed more times, anytime the view is referenced.

The sample database is small, but we can enable the profiling to compare the execution time of the two queries.

The execution time of the query with the view is around 0.0097 seconds, while with CTE, it is around 0.0054 seconds. So, the CTE is faster. Using larger tables and having more references of the view means the difference between the queries will be more relevant.

Using CTE instead of a VIEW is more efficient because only a single materialization is needed, and the temporary table created can be referenced many times in the main query.

Conclusion

We have introduced the new Common Table Expression feature available on MySQL 8.0. Using CTE, you can simplify, in most cases, the readability of the queries, but you can also use CTE instead of VIEWs to improve the overall performance.

Using CTE, it is also possible to create a recursive query. In the next article of this series, we’ll see examples about how to use recursive CTE to generate series or to query hierarchies.

Share this post

Comments (4)

  • Adnan Reply

    So can we take CTE as an alternative to View ? and what about memory efficiency in CTE ? is it efficient than View or standard sub queries ? if so, then it will give a big benefit on performance optimization.

    February 10, 2020 at 3:08 pm
    • Corrado Pandiani Reply

      Hi Adnan. Basically the execution plan of a view and a CTE is the same if the derived query is referenced only once. The CTE will be significantly better than a view if the derived query is referenced more times. The CTE materialize the query at the beginning anche than can be referenced as many times you need. For the view is different. Any sinle reference of a derived table requires materialization. This involves more memiry usage and worse performance. I hope this helps.

      February 11, 2020 at 9:49 am
  • Øystein Grøvlen Reply

    Note that a CTE is not always materialized. Like a derived table, it may be merged into the main query if it does not contain aggregation, limit, union etc.

    February 13, 2020 at 1:46 pm
    • Corrado Pandiani Reply

      Hi, you’re right. It’s the same as the view. Even for CTE the optimizer can merge the CTE code or materialize it. The aim of the article was to show the usage and the benefit of CTE by the application perspective. Anyway it is worth considering to write anotehr article discussing about internals and the optimizer optimization when using CTE. Thanks

      February 13, 2020 at 2:57 pm

Leave a Reply