Ever wondered how fast stored routines are in MySQL? I just ran a quick micro-benchmark to compare the speed of a stored function against a “roughly equivalent” subquery. The idea — and there may be shortcomings that are poisoning the results here, your comments welcome — is to see how fast the SQL procedure code is at doing basically the same thing the subquery code does natively (so to speak).

Before we go further, I want to make sure you know that the queries I’m writing here are deliberately mis-optimized to force a bad execution plan. You should never use IN() subqueries the way I do, at least not in MySQL 5.1 and earlier.

I loaded the World sample database and cooked up this query:

This pretty consistently runs in just about 1/4th of a second. If you look at the abridged explain plan below, you’ll see the query is doing a table scan against the first query, and then executing the subquery for each row:

Now I took the subquery and basically rewrote it as a stored function.

Now the query can be rewritten as this:

If we explain it, we get output similar to the first table shown above, but the further two rows are not shown. The query can’t be optimized to use indexes, and the stored function is opaque to the optimizer. This is why I purposefully wrote the subquery badly in the first query! (If you think of a better way to compare apples and uhm, apples… please comment).

The poorly-optimized-subquery portion of the query essentially happens inside that function now.

And it’s four times slower, consistently, and that’s all I wanted to show here. Thanks for reading.

7 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Morgan Tocker

– I tried testing this 1 year ago with fibonacci:
http://mtocker.livejournal.com/45222.html

My test wasn’t apples to apples either, but I found that writing stuff in PHP was about ten times faster. If you add in that MySQL stored procedures don’t have a debugger, it’s a tough sell to write pages and pages of business logic in them.

Josh Davis

The stored routine does a COUNT(), unlike the original query. I think you’d be closer to the way the original query works with something like:

DELIMITER //
DROP FUNCTION IF EXISTS speaks_english//
CREATE FUNCTION speaks_english(c char(3)) returns integer deterministic
begin
RETURN (SELECT 1 FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’ AND co.Code = c LIMIT 1);
end//
DELIMITER ;

On my machine it’s still 10 times slower than the subquery, though.

Mark Leith

Well, let’s try a proper stored procedure.. 🙂

I’ll actually mimic what the statement was trying to do – find the population of countries that speak a given language, use a cursor to get the rows for the countries that speak the language (the inner query), and then loop over it:

CREATE PROCEDURE speaks_language (IN in_language VARCHAR(30))
BEGIN

DECLARE v_countryCode CHAR(3);
DECLARE v_countrySpeaksEnglish BIGINT DEFAULT 0;
DECLARE v_sumSpeaksEnglish BIGINT DEFAULT 0;
DECLARE done INT DEFAULT 0;

DECLARE countries CURSOR FOR
SELECT DISTINCT co.Code FROM Country AS co
INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code
WHERE cl.Language = in_language;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

OPEN countries;

REPEAT
FETCH countries INTO v_countryCode;
IF NOT done THEN
SELECT sql_no_cache sum(ci.Population) INTO v_countrySpeaksEnglish
FROM City AS ci WHERE CountryCode = v_countryCode;
SET v_sumSpeaksEnglish = v_sumSpeaksEnglish + IFNULL(v_countrySpeaksEnglish, 0);
END IF;
UNTIL done END REPEAT;

CLOSE countries;

SELECT v_sumSpeaksEnglish;

END//

Here’s the results on my system, including the original query:

mysql> CALL speaks_language(‘English’)//
+——————–+
| v_sumSpeaksEnglish |
+——————–+
| 237134840 |
+——————–+
1 row in set (0.04 sec)

Query OK, 0 rows affected (0.04 sec)

mysql> CALL speaks_language(‘English’)//
+——————–+
| v_sumSpeaksEnglish |
+——————–+
| 237134840 |
+——————–+
1 row in set (0.04 sec)

Query OK, 0 rows affected (0.04 sec)

mysql> CALL speaks_language(‘English’)//
+——————–+
| v_sumSpeaksEnglish |
+——————–+
| 237134840 |
+——————–+
1 row in set (0.05 sec)

Query OK, 0 rows affected (0.05 sec)

mysql> SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE CountryCode IN ( SELECT DISTINCT co.Code FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’);//+——————–+| sum(ci.Population) |+——————–+
| 237134840 |
+——————–+
1 row in set (0.05 sec)

mysql> SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE CountryCode IN ( SELECT DISTINCT co.Code FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’);//
+——————–+
| sum(ci.Population) |
+——————–+
| 237134840 |
+——————–+
1 row in set (0.06 sec)

mysql> SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE CountryCode IN ( SELECT DISTINCT co.Code FROM Country AS co INNER JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code WHERE lower(cl.LANGUAGE) = ‘English’);//
+——————–+
| sum(ci.Population) |
+——————–+
| 237134840 |
+——————–+
1 row in set (0.05 sec)

Mark Leith

I should note of course, that I understand that this flips how the statements are run, but it *is* an apples to apples comparison, as it achieves exactly what the statement wanted to achieve, yet does it with different logic (as SPs are intended). 🙂

Tim Little

It occurs to me (and everyone please feel free to correct me without mercy…

But doesn’t the other_table retrieval occur just once per query :
SELECT some_stuff FROM some_table
WHERE a_field IN
( SELECT other_stuff FROM other_table WHERE logic_goes_here )

While my contrasting concern is that the WHERE speaks_english is issued once per row, isn’t it? as :
SELECT sql_no_cache sum(ci.Population) FROM City AS ci WHERE speaks_english(CountryCode)> 0;

Am I missing something, am I correct?