Stored Functions and Temporary Tables are Not a Good Fit

Stored FunctionsIn this blog post, I am going to show why we have to be careful with stored functions in select list, as a single query can cause thousands of queries in the background if we aren’t cautious.

For this example, I am only going to use the SLEEP function to demonstrate the issue, but you could use any other stored functions.

Here is the test schema and data:

And the first query:

The query takes one second, which means the SLEEP was executed after/during the LIMIT step.

The second query creates a temp table:

It takes 5 seconds, meaning the SLEEP was executed for every row before LIMIT, and the only differences between these two queries is the second one uses temp tables. (Yes, the schema and query could be improved to avoid temp tables, but this is only a test case and you can’t avoid temp tables all the time.)

MySQL uses a different order of steps here and does the select list evaluation before/during creating the temp tables.

As you can see, there are functions in the select list, and MySQL will create temp tables, and that could cause thousands of extra queries in the background. I recently had a case where running a single query with functions in the select list resulted in 333,906 queries in the background! In the query, there was a Limit 0,10.

There is a very good blog post describing the order of the operations in SQL, which should be the following:

Based on this, if MySQL strictly follows the SQL order, the LIMIT should be the last, and the first query should take 5s because the Select list evaluation should happen before that for every row. But I guess this is a performance optimization in MySQL, to only run the evaluation for the limited number of rows.

When temp tables are involved, MySQL will use a different logic and do the evaluation before/during the group by when it creates the temp tables.

When MySQL creates a temp table, it has to materialize every single row, and that includes the functions as well. If your query is going to create a huge temp table, MySQL will execute the function for all those rows.

MySQL does not have generated columns in internal temp tables, but what might do the trick here is to just point to a function that needs to be executed when MySQL reads the data out from the temp table.

I have created a ticket, but we are still debating if there is any room for improvement here.

Using DML queries in these functions

If you are using DML (insert/update/delete) queries in stored functions, and you are calling them in the select list, you have to be careful because if MySQL creates temp tables it will call those queries for all the rows in the temp table as well. So you might just want to insert only one row in another table, as you will end up with thousands if not.

How can we avoid this?

If you are facing this issue, you have to analyze your queries. You might only need an extra index to avoid temp tables, or you could rewrite the query.

Where can I see if this is happening?

Unfortunately, in this case, the slow query log cannot help us, because these function calls and the queries from the functions are not logged. This could also be a feature request because it would make debugging much easier. (I have already created a feature request in Percona Server to log queries from stored functions.)

However, in the slow query log for the original query, the Rows_examined would be very high if you are using stored functions and that number is way higher than it should be and might cause problems. Also, the general log can be helpful for investigating the problem here, because in the general log we can see all these function calls as well.

Conclusion

If you are using stored functions in Select list, you have to make sure the query does not use temp tables, otherwise, it could cause many extra queries in the background.

Share this post

Comments (2)

  • Balázs Csaba Reply

    This is a normal behavior, because ordering executed on temp table (by special order algorithm differs from memory qsort 🙂 ) and all rows required before sorting. Limit operation can be run after sorting of course.
    So this is a special sose effect, thx for info.

    PS: explain can show the temp file usage?

    June 26, 2019 at 11:53 am
    • Tibor Korocz Reply

      Szia Balazs,

      Yeah, I know and understand why does this happening, but maybe somehow it could be improved. In this special cases when you have functions in the select list can be very expensive to run the functions for each row.

      Here is the explain for the query which uses temp tables:

      mysql [localhost] {msandbox} (test) > explain SELECT fcol, SLEEP(1) FROM t1 WHERE 1 GROUP BY fcol LIMIT 1\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: t1
      partitions: NULL
      type: ALL
      possible_keys: NULL
      key: NULL
      key_len: NULL
      ref: NULL
      rows: 5
      filtered: 100.00
      Extra: Using temporary
      1 row in set, 1 warning (0.00 sec)

      And yes in Extra it says it is going to use a temp table.

      June 26, 2019 at 12:07 pm

Leave a Reply