Billion Goods in Few Categories: How Histograms Save a Life?

We store data with the intention of using it: to search, retrieve, group, and sort it. To perform these actions effectively, MySQL storage engines index the data and pass statistics to the Optimizer when it compiles a query execution plan. This approach works well unless the data distribution is uneven.

Last year I worked on several tickets where the data followed the same pattern: millions of popular products fit into a couple of categories, and the remaining products were spread across all the other categories. We had a hard time finding a way to retrieve goods quickly. Workarounds for version 5.7 were offered. However, we learned that a new MySQL 8.0 feature, histograms, would work better, cleaner, and faster. Thus, the idea for this talk was born.

In this webinar, we will discuss:

– How index statistics are physically stored
– Which data is exchanged with the Optimizer
– Why it is not enough to make a correct index choice

In the end, I will explain which issues histograms resolve and why index statistics alone are insufficient for fast retrieval of unevenly distributed data.
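To make the problem concrete, here is a minimal Python sketch of why a single cardinality figure misleads on skewed data while per-value frequencies do not. It is a toy model, not MySQL's implementation: the category names, row counts, and both estimator functions are hypothetical, chosen only for illustration. In MySQL 8.0 itself, a histogram is built with `ANALYZE TABLE ... UPDATE HISTOGRAM ON ... WITH N BUCKETS`.

```python
from collections import Counter

# Hypothetical skewed data set: 1,000 goods, two popular categories
# and eight categories that hold a single product each.
categories = ["toys"] * 600 + ["books"] * 392 + [f"rare_{i}" for i in range(8)]

total_rows = len(categories)
distinct_values = len(set(categories))


def estimate_from_cardinality(value):
    """Uniform assumption: every distinct value matches the same number of rows."""
    return total_rows / distinct_values


# Toy singleton-style histogram: one bucket per distinct value,
# holding the fraction of rows that carry that value.
histogram = {
    value: count / total_rows for value, count in Counter(categories).items()
}


def estimate_from_histogram(value):
    """Estimate matching rows from the per-value frequency (0 if the value is absent)."""
    return histogram.get(value, 0.0) * total_rows


if __name__ == "__main__":
    for value in ("toys", "rare_0"):
        print(
            value,
            "cardinality-based:", estimate_from_cardinality(value),
            "histogram-based:", round(estimate_from_histogram(value)),
        )
```

Under the uniform assumption both the popular and the rare category receive the same estimate, so a planner cannot tell a query touching most of the table from one touching a single row. The frequency-based estimate separates the two cases, which is the kind of information a histogram adds.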


MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved