Engine-independent persistent statistics with histograms in MariaDB

23 April 2:20pm - 3:10pm @ Ballroom D

This talk covers the module that introduces engine-independent
statistical tables into MariaDB and the new opportunities it gives the
optimizer to improve query execution plans across storage engines. The
module itself, and the optimizer additions that make use of extended
statistical data when searching for optimal query execution plans, are
part of the MariaDB 10.0 branch.

Plan stability is one of the major prerequisites for the robust
development of many database applications. Persistent tables with
non-volatile statistical data that can be modified independently of
the database content make plan stability easy to achieve.
Many experimental engines cannot afford the luxury of having their own
handler functions to provide sound statistical data. As a result, the
optimizer treats them in a rather unfriendly manner, generating poor
execution plans.
More sophisticated algorithms for choosing fast execution plans
require advanced statistical data, such as histograms of the
distribution of column values. This data is essentially
content-dependent, and it does not make sense to require every storage
engine to maintain it. Advanced engines, such as InnoDB and TokuDB,
will greatly benefit from these extended statistics.
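
To illustrate why a histogram on column values helps an optimizer, here is a minimal Python sketch of an equi-height (height-balanced) histogram and a selectivity estimate for a predicate of the form col <= x. It is an illustration only and not MariaDB's implementation; the function names and the bucket count are hypothetical.

```python
from bisect import bisect_right


def build_equi_height_histogram(values, num_buckets):
    """Return the upper bound of each bucket (hypothetical helper).

    Every bucket covers roughly the same number of rows, so skewed data
    produces narrow buckets where values are dense.
    """
    if not values:
        raise ValueError("cannot build a histogram from an empty column")
    ordered = sorted(values)
    n = len(ordered)
    num_buckets = min(num_buckets, n)  # never more buckets than rows
    bounds = []
    for i in range(1, num_buckets + 1):
        last_row_in_bucket = (i * n) // num_buckets - 1
        bounds.append(ordered[last_row_in_bucket])
    return bounds


def estimate_selectivity_le(bounds, x):
    """Estimate the fraction of rows with value <= x.

    Counts only the buckets whose upper bound is <= x, so the result is a
    coarse estimate with a granularity of 1 / len(bounds).
    """
    full_buckets = bisect_right(bounds, x)
    return full_buckets / len(bounds)


if __name__ == "__main__":
    # Skewed column: 90 rows hold the value 1, the rest are spread over 2..11.
    column = [1] * 90 + list(range(2, 12))
    bounds = build_equi_height_histogram(column, 4)
    print(bounds)
    print(estimate_selectivity_le(bounds, 1))
```

With only the minimum and maximum of such a skewed column, an estimator that assumes uniformly distributed values would judge col <= 1 to match almost no rows; the bucket boundaries show that most rows sit at the low end.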


Igor Babaev
Principal Developer, MariaDB Services, Inc
Igor started working on MySQL code in 2002, when he joined MySQL AB as a core developer concentrating on optimization problems. After Sun's acquisition of MySQL, Igor realized that the spirit of big corporations was not for him. That is why he eagerly joined Monty Program AB in July 2009, where he still works with his old colleagues from MySQL AB on MariaDB releases. He now focuses mainly on improvements to the optimizer cost model. Igor has a Ph.D. in Computer Science. He operates from Seattle, where he has been living with his family since 1995.