Sveta Smirnova (Percona) delivers the talk, "Billion Goods in Few Categories: How Histograms Save a Life?", on DAY 2 of the Percona Live Open Source Database Conference 2019, 5/30, at Austin, TX.
We store data with an intention to use it: search, retrieve, group, sort... To do it effectively, the MySQL Optimizer uses index statistics when it compiles the query execution plan. This approach works excellently unless your data distribution is not even.
Last year I worked on several support tickets where data follows the same pattern: millions of popular products fit into a couple of categories and the rest used the rest. We had a hard time finding a solution for retrieving goods fast. We offered workarounds for version 5.7. However, a new MariaDB and MySQL 8.0 feature - histograms - would work better, cleaner and faster. The idea of the talk was born.
Of course, histograms are not a panacea and do not help in all situations.
I will discuss
- how index statistics physically stored by the storage engine
- which data exchanged with the Optimizer
- why it is not enough to make correct index choice
- when histograms can help and when they cannot
- differences between MySQL and MariaDB histograms