Introduction to MySQL 8.0 Recursive Common Table Expression (Part 2)

MySQL 8.0 Recursive Common Table ExpressionsThis is the second part of a two-articles series. In the first part, we introduced the Common Table Expression (CTE), a new feature available on MySQL 8.0 as well as Percona Server for MySQL 8.0. In this article, we’ll present the Recursive Common Table Expression. SQL is generally poor at recursive structures, but it is now possible on MySQL to write recursive queries. Before MySQL 8.0, recursion was possible only by creating stored routines.

What is a Recursive Common Table Expression?

A recursive CTE is one having a subquery that refers to its own name. It is particularly useful in the following cases:

  • To generate series
  • Hierarchical or tree-structured data traversal

Let’s see the main components of a recursive CTE. The following is the syntax to create it:

First of all, the clause RECURSIVE is mandatory, and then there are two mandatory components. The seed member is the initial query, the one that will be executed at the first iteration. The recursive member is the query containing the reference to the same CTE name. This second component will generate all the remaining items of the main query.

The process stops when an iteration does not generate any rows. Be aware of that in order to avoid generating a lot of iterations that can exhaust the memory.

It is important for recursive CTEs that the recursive member includes a condition to terminate the recursion. As a development technique you can force termination by placing a limit on execution time:

  • The cte_max_recursion_depth system variable enforces a limit on the number of recursion levels for CTEs. The server terminates the execution of any CTE that recurses more levels than the value of this variable. The default value is 1000.
  • The max_execution_time system variable enforces an execution timeout for SELECT statements executed within the current session.
  • The MAX_EXECUTION_TIME optimizer hint enforces a per-query execution timeout for the SELECT statement in which it appears.

 

Generate Series

Let’s see now some simple usage of Recursive CTE to generate series.

One-Level Sequence

First, create a simple series of integer numbers from 1 to 10. This a one-level sequence because the N+1 value is a function of the previous one N only.

Another typical example is calculating the factorial.

 

Two-Level Sequence

In this case, we would like to create a two-level sequence where the N+2 value is a function of the two previous values N+1 and N.

The typical example here is the Fibonacci Series; each number is the sum of the two preceding ones, starting from 0 and 1.  Let’s calculate the first 20 items of the Fibonacci series.

 

Date Sequence

Let’s consider having a simple table containing our shop’s sales such as the following:

Notice, however, that our sales report has missing dates: Feb 3rd and Feb 5th. We would like to generate a report including even the dates with no sales.

A recursive CTE can help.

 

Hierarchical Data Traversal

Let’s take a look now at some other use cases for recursive CTE: a simple tree for an Org Chart, a more complex tree for family genealogy and a graph for train paths, of the following picture.

A Simple Tree: Org Chart

 

 

Let’s run some queries using recursive CTE to traverse this kind of hierarchy.

Please note the usage of the CAST function on the “seed” member of the CTE. This was done on purpose. Let’s look what happens in case you don’t use the CAST function:

Why an error? The query is, in theory, correct, but the problem is that the type of column path is determined from the non-recursive SELECT only, and so it is CHAR(7) (Matthew length). On the recursive part of the CTE it would cause a character truncation, so: error!

Let’s look at a query to traverse the tree and calculate the level of the employees in the Org Chart.

 

A More Complex Tree: Genealogy

Creating a table to represent the following genealogy with grandparents, parents, and sons.

Let’s find all of James’s ancestors and the relationship:

Using the same query but changing the initial condition we can find out the ancestors of anyone in the hierarchy, for example, Jennifer:

 

A Graph: Train Routes

Let’s create a graph representing train routes in Italy for the more important cities, from the image below:

 

Be aware of uni-directional and bi-directional connections. Each connection also has a distance in km.

Returning all the train destinations with Milan as the origin:

Basically starting from any city, you can go wherever you want in Italy, but there are different paths. So let’s run a query to find out all the possible paths, and the total length of each, starting from Milan and Naples.

 

It’s quite easy now to find out which is the shortest path from one origin to any final destination. You just need to filter and order the main query. Here are some examples:

 

Limitations

Apart from the limitations we have already seen for limiting the execution time and the number of iterations, there are other built-in limitations you should be aware of.

The recursive SELECT must not contain the following constructs:

  • An aggregate function such as SUM()
  • GROUP BY
  • ORDER BY
  • DISTINCT
  • Window functions

These limitations are not valid for non-recursive CTE. Also, the recursive SELECT part must reference the CTE only once and only in its FROM clause, not in any subquery.

Conclusion

Recursive common table expression is a new interesting feature to implement queries for your applications using MySQL 8.0. Recursion was already possible in the past by creating stored routines but now it’s simpler. Furthermore, you don’t need special and additional grants to create a recursive query.

Generally, recursive CTE is quite simple, but compared to non-recursive CTE, it is a little more complicated. Recursive CTE is more tricky because of recursion, obviously. It’s not a matter of syntax, of course, it’s only a matter of “thinking recursively”.

Share this post

Comments (2)

Leave a Reply