November 24, 2014

Derived Tables and Views Performance

Starting with MySQL 4.1, MySQL has supported what are called derived tables or inline views: basically, subselects in the FROM clause.
In MySQL 5.0 support for views was added.

These features are closely related, but how do they compare in terms of performance?

Derived tables in MySQL 5.0 seem to have a different implementation from views, even though I would expect the code base to be merged, as it is much the same task in terms of query optimization.

Derived tables are still handled by materializing them in a temporary table, and what is more, a temporary table with no indexes (so you really do not want to join two derived tables, for example).

One more thing to watch for is that a derived table is materialized even to execute an EXPLAIN statement. So if you have made a mistake in the select in the FROM clause, e.g. forgotten a join condition, you might have EXPLAIN running forever.
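
To make the terms concrete, here is a minimal sketch of a derived table and of the EXPLAIN caveat. The `customers` and `orders` tables and their columns are made up for illustration; they are not the tables from my tests.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE IF NOT EXISTS customers (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);

CREATE TABLE IF NOT EXISTS orders (
  id          INT NOT NULL PRIMARY KEY,
  customer_id INT NOT NULL,
  total       DECIMAL(10,2) NOT NULL,
  KEY (customer_id)
);

-- A derived table: the subselect in the FROM clause, aliased as t.
-- In MySQL 5.0 the inner SELECT is first written to a temporary table
-- without indexes, and only then is the outer query run against it.
SELECT c.name, t.order_count
FROM customers c
JOIN (SELECT customer_id, COUNT(*) AS order_count
      FROM orders
      GROUP BY customer_id) t ON t.customer_id = c.id;

-- EXPLAIN has to materialize t as well, so in MySQL 5.0 this statement
-- already pays the cost of running the inner SELECT.
EXPLAIN
SELECT c.name, t.order_count
FROM customers c
JOIN (SELECT customer_id, COUNT(*) AS order_count
      FROM orders
      GROUP BY customer_id) t ON t.customer_id = c.id;
```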

Views, on the other hand, do not have to be materialized and are normally executed by rewriting the query. A view is only materialized if query merge is impossible or if requested by the view creator.
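
As a rough sketch of the two execution methods, the ALGORITHM clause of CREATE VIEW lets the view creator pick one explicitly. The `orders` table and the view names below are hypothetical, and the "executed roughly as" comment shows the idea of the rewrite, not the exact internal form.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE IF NOT EXISTS orders (
  id          INT NOT NULL PRIMARY KEY,
  customer_id INT NOT NULL,
  total       DECIMAL(10,2) NOT NULL,
  KEY (customer_id)
);

-- A simple view like this can be merged: the view definition is folded
-- into the outer query, so the index on customer_id remains usable.
CREATE ALGORITHM = MERGE VIEW big_orders AS
  SELECT id, customer_id, total FROM orders WHERE total > 100;

-- Executed roughly as:
--   SELECT id, customer_id, total FROM orders
--   WHERE total > 100 AND customer_id = 42;
SELECT * FROM big_orders WHERE customer_id = 42;

-- The view creator can also request materialization explicitly.
CREATE ALGORITHM = TEMPTABLE VIEW big_orders_tmp AS
  SELECT id, customer_id, total FROM orders WHERE total > 100;

-- Here the whole view result is built in a temporary table first, and
-- the customer_id filter is applied to that table afterwards.
SELECT * FROM big_orders_tmp WHERE customer_id = 42;

DROP VIEW big_orders, big_orders_tmp;
```

Running EXPLAIN on the two SELECT statements should show the difference: the first one reads `orders` directly, the second one reads a derived table.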

What does it mean in terms of performance:

So what does it mean in practice:

Avoid derived tables: If there is another way to write the query, it will be faster in most cases. In many cases even a separate temporary table will be faster, as you can add proper indexes to the table in this case.

Consider using temporary views instead of derived tables: If you really need to use a subselect in the FROM clause, consider creating a view, using it in the query, and dropping it after the query has been executed.
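
Both workarounds can be sketched as follows. The `customers` and `orders` tables, the temporary table name and the view name are all hypothetical; treat this as an outline of the pattern rather than a tuned solution.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE IF NOT EXISTS customers (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);

CREATE TABLE IF NOT EXISTS orders (
  id          INT NOT NULL PRIMARY KEY,
  customer_id INT NOT NULL,
  total       DECIMAL(10,2) NOT NULL,
  KEY (customer_id)
);

-- Option 1: an explicit temporary table, which you can index yourself.
CREATE TEMPORARY TABLE order_counts
  SELECT customer_id, COUNT(*) AS order_count
  FROM orders
  GROUP BY customer_id;

ALTER TABLE order_counts ADD INDEX (customer_id);

-- The join can now use the index on order_counts.customer_id.
SELECT c.name, oc.order_count
FROM customers c
JOIN order_counts oc ON oc.customer_id = c.id;

DROP TEMPORARY TABLE order_counts;

-- Option 2: a short-lived view in place of the derived table. This only
-- helps when the view is simple enough to be executed by query merge.
CREATE VIEW big_orders_v AS
  SELECT id, customer_id, total FROM orders WHERE total > 100;

SELECT c.name, b.total
FROM customers c
JOIN big_orders_v b ON b.customer_id = c.id
WHERE c.id = 42;

DROP VIEW big_orders_v;
```

The view in option 2 is deliberately a plain filter with no aggregation, because only simple views are executed by rewriting the query.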

In any case it is a pretty annoying gotcha which I hope MySQL will fix in future versions: the fact that the queries in this example behave differently is illogical and counterintuitive.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. pabloj says:

    Well probably v 5.2 will be used to make all the new features stable and well integrated ;-)

  2. Raven says:

    Based on your EXPLAIN output, it looks like Views inherit indices from the underlying tables (which would explain the speed difference). Probably the execution of the subselect is faster/less of a memory hog than creating the view, so I can see that you might want the two options to maximize performance in specific situations.

    Do you get similar results in less contrived (more realistic) scenarios?

  3. peter says:

    I would not say that views inherit indexes, because views are not physical: they have no real data or indexes; they only describe a way to access data.

    The difference is not inherited indexes but, as I wrote, a different method of execution: views are executed (in this case) by query rewriting, so effectively the query becomes the same as one on the base table. Inline views/derived tables, however, can’t do that; they always have to materialize the table.

    When creating a view you can also force it to use TEMPTABLE for query execution. See

    http://dev.mysql.com/doc/refman/5.0/en/create-view.html

    Regarding behavior in real cases: this is a simplification based on real production cases.

    Note: This applies to subselects in the FROM clause only. Other kinds of subselects are a very different story.

  4. I wonder what happens if you use a view that cannot be easily rewritten.

    eg: GROUP BY or have a SUM in it.

    Now, do a “select * from view where summed_column = 50” or something.

    Will this be re-written? And if so, what IS the final result of the re-written query?

    Besides, why DO derived tables get materialized anyway? The optimizer is free to re-write queries including these as well, right?


    Martijn Tonies
    Upscene Productions

  5. peter says:

    Martijn,

    First, you’re right. There is no reason for derived tables to be materialized. It is a design deficiency in MySQL 5.0.

    Speaking of views with GROUP BY: good question. MySQL is currently able to execute only simple views via query merge; more complicated views, such as ones with GROUP BY, require a temporary table even if it would be possible to avoid it. Hopefully it will be improved in the future:

    mysql> create view v1 as select count(*) cnt, k from test group by k;
    Query OK, 0 rows affected (0.03 sec)

    mysql> explain select * from v1 where k=7 limit 5;
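    +----+-------------+------------+-------+---------------+------+---------+------+---------+-------------+
    | id | select_type | table      | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
    +----+-------------+------------+-------+---------------+------+---------+------+---------+-------------+
    |  1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL | NULL    | NULL | 1638400 | Using where |
    |  2 | DERIVED     | test       | index | NULL          | k    | 772     | NULL | 1638400 | Using index |
    +----+-------------+------------+-------+---------------+------+---------+------+---------+-------------+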
    2 rows in set (6.88 sec)

    As you can see, a full temporary table is used; there is an index on k, but it only allows an index scan instead of a full table scan.

  6. Interesting, thanks for trying.


    Martijn Tonies
    Upscene Productions

  7. ari says:

    We are currently using MySQL 5.037(27).
    It looks like when you create a view on a table and then use this view in a query with a join or a WHERE clause against another table, the indexes from the view (the original table) are not used. Run the same query with the view replaced by the table it is based on, and the result comes back several times faster.

  8. javatopia says:

    We ran into this problem just recently on a real world query. In this case the query was of this form:

    select * from (big inner query) foo order by a, b, c

    The choice to use this query syntax was for programming brevity – it’s easier to rename the output columns from the big inner query using the outer query instead. Alas, though, the nasty version of this query took over 2.5 minutes to run on 53,000 record scans. The version without the outer select ran in 9/10th of a second on the same record space.

    The EXPLAIN clearly showed that the outer select was doing a full table scan on the results instead of using the indices.

    Unbelievable. I ran the exact same nasty query in Microsoft SQL Server 2000 and it performed as expected: lightning-fast execution. The execution plan showed the effective use of table indices and much smarter insertion of the sub-query’s grouping clauses.

    MySQL needs quite a bit of real-world tuning to make it a believable enterprise database. You get what you pay for, right?

  10. John Larsen says:

    Ari: Did you try immediately running it again with a different WHERE? This is where views shine. Yes, unless specifically made not to, a view will run the exact select stored in the view. But until the table it is based upon changes relative to that view, a temporary table will be stored containing the full view. As the view temp table won’t need recreating, a second run with WHERE would be faster than the second run of a regular select, because you have fewer columns to deal with. A view will generally retain its parent’s indexes.

  11. John, that is incorrect. Views do not store their data. You are mixing the TEMPTABLE algorithm (which creates an internal temp table for the duration of the query) and the query cache (which stores result sets until underlying tables change or other conditions require discarding them). And views only permit usage of the underlying table’s indexes if the MERGE algorithm is used.

  12. John Larsen says:

    Baron: You know what, you’re right. Somehow in the past I was able to get a view to do this; I think it was for an aggregated value (the view was an aggregate), which was then trimmed down in queries to the view to be pieces of the aggregate. I was able to get better performance at the time on multiple queries in a row (the first query of a connection was slow, later queries were instant) than with a straight query to the table, as the view was holding the aggregate data used for selection of its parts. I can’t seem to figure out how I did it now, as I didn’t specify any special notation for it (like TEMPTABLE).

  13. Vincent VAN HOLLEBEKE says:

    Hi,

    Does the problem still exist in current versions of MySQL, or is it solved?

    Thank you.

  14. Clayton Stanley says:

    I experienced this problem on OS X 10.6.8 with the most recent version of MySQL (5.5.27). It still exists. Derived tables (at least for me) are not indexed. I’m no longer using derived tables and am using views (which are indexed) instead.

  15. rikki says:

    I would appreciate your comments on this script, taken from:
    http://phptechnicalgroups.blogspot.co.il/2013/05/simple-mysql-and-php-prodcuts-and-cart.html
    create dynamic main menu and sub menu using php and mysql
    CREATE TABLE menu (
      id int(11) NOT NULL auto_increment,
      label varchar(50) NOT NULL default '',
      link varchar(100) NOT NULL default '#',
      parent int(11) NOT NULL default '0',
      sort int(11) default NULL,
      PRIMARY KEY (id));
    ----------------------------------------
    <?php
    $mysql = mysql_connect('127.0.0.1', 'root', '');
    mysql_select_db('test', $mysql);
    function display_menu($parent, $level) {
        $result = mysql_query("SELECT a.id, a.label, a.link, Deriv1.Count FROM menu a LEFT OUTER JOIN (SELECT parent, COUNT(*) AS Count FROM menu GROUP BY parent) Deriv1 ON a.id = Deriv1.parent WHERE a.parent=" . $parent);
        echo "";
        while ($row = mysql_fetch_assoc($result)) {
            if ($row['Count'] > 0) {
                echo "" . $row['label'] . "";
                display_menu($row['id'], $level + 1);
                echo "";
            } elseif ($row['Count'] == 0) {
                echo "" . $row['label'] . "";
            } else;
        }
        echo "";
    }
    display_menu(0, 1);
    ?>
    Posted by bikash ranajan nayak at 10:30 AM
