Clustering indexes vs. Covering indexes

May 29, 2009
Author
Zardosht.Kasheff
Share this Post:

Yesterday, I (Zardosht) posted an entry introducing clustering indexes. Here, I elaborate on three differences between a clustering index and a covering index:

  1. Clustering indexes can create indexes that would otherwise bounce up against the limits on the maximum length and maximum number of columns in a MySQL index.
  2. Clustering indexes simplify syntax making them easier and more intuitive to use.
  3. Clustering indexes have smaller key sizes leading to better performance.

Expanding MySQL’s Limits

MySQL allows at most 16 columns in an index and at most 3072 bytes per index. For tables that have more than 16 columns or a row size of greater than 3072 bytes, one cannot create a covering index that includes all of the fields. Suppose one had the following table with 26 columns:

In standard MySQL, one cannot create a covering index for the query:

However, one can create the following clustering index to speed up the query as:

Syntactic simplification

The intent of a clustering index definition is easier to understand because it only includes columns necessary to achieve the desired sort order. Columns needed to satisfy the select clause can be left out. As users add more indexes to speed up more queries, simplifying schema definitions becomes increasingly important.

Consider the table:

Now suppose you want to speed up this query:

To create a covering index for this query you do:

To create a clustering index for this you do:

Another example of the simplification shows up when want to add a column. With clustering indexes you can write

and you don’t need to change the index definition. With a covering index, you need to:

If you don’t drop and re-add the covering index, it won’t cover the new column. Covering indexes have similar problems if you drop a column.

Smaller key sizes

The key length of a clustering index is much smaller than that of a covering index. Smaller key lengths have performance benefits. Inserting into a clustering index is often faster than inserting into an equivalent covering index. Some other (non-MySQL) databases provide syntax to include nonkey columns in an index. One can think of clustering indexes as an extreme case, in which all the columns are added as nonkey columns. As Kristian Nielsen phrased it the comments of the earlier posting, the extra fields are stored only in the leaves of a B-Tree, not in the internal nodes. That can be useful if the row size is much larger than the key size. (Although Fractal Tree® indexes aren’t exactly organized like B-trees, this insight is essentially correct.)

Tradeoffs for TokuDB vs. other storage engines

Maintaining clustering indexes has a similar cost to maintaining covering indexes. Specifically, data gets inserted into multiple places. This is the tradeoff for many storage engines: to use clustering indexes to get faster queries, one must use more I/O operations and extra disk storage.

For TokuDB these tradeoffs are a lot easier to make. TokuDB can insert data an order of magnitude faster than other storage engines, while maintaining indexes. Percona measured our insertion rate at 10.6x faster than InnoDB’s, and measured our database size at 5x to 6x smaller than InnoDB and MyISAM. Thus, the tradeoff is more manageable for TokuDB than it is for other storage engines. This was the motivation for implementing the clustering index feature.

Conclusion

Clustering indexes provide the benefits of covering indexes in an easy-to-use manner. Clustering indexes offer immediate performance gains that are difficult or impossible to achieve with covering indexes. TokuDB can maintain indexes while inserting data an order of magnitude faster than other storage engines, and compresses the data, making clustering keys really valuable.

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved