How simple answer are you looking for ?

I’ve got an interesting comment the other day saying “I’ve heard Jay Pipes saying indexes with cardinality lower than 30% are worthless, true?”

That is interesting question and it has different answers depending on who is asking. A lot of people want to hear simple answers to the questions because they would be overwhelmed by complete answer, it is like children being taught math first taught simple division skills like you can divide 10 by 2 but not by 3. As we grow up we learn the truth is much more complex than that and if you Major in math you will find there is a lot of stuff out where far beyond basic stuff we use in every day life.

Answering such question you can say yes it is true and for clueless person it will be helpful answer because it will allow him to avoid creation indexes on the column containing just one value which is good idea.

If you start to explain it in details, a lot of variables add in – covering indexes, skewed distribution vs lookup frequency, IO bound or CPU bound workload which may be far too complicated.

It is also interesting the broad answer would not be helpful – there are workloads when even with 1% matching rows you’re better of doing full table scans, and there are ones with 99% same values in the column having it indexed helps.

The other questions which people like to get quick answers for are for example “Which MySQL Storage Engine to Use” or “How Much Memory Compared to Your Database Size Should you Have” , “What Key Cache Hit Ratio is Good One”

This often gives me head aches – I like correctness in the answers but it is often too much to deal with, so I’m trying to keep the answers tailored towards customer needs omitting details which should not be important and making sure all relevant stuff is covered.

This is one of the reasons of hiring MySQL Expert – you typically can find same information in the books, Internet, and even in this blog but selecting what is relevant for your case might be complicated.

4 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Jay Pipes

18 years ago

Uh, I would never say “indexes with cardinality lower than 30% are worthless”. 😉 What I have said in the past is that if you have a situation where the value you are searching for is contained in 30% of an index’s records, it is highly unlikely the index will be used, as the efficiency of a sequential scan will outweigh the numerous bookmark lookup operations needed with the index seek operations. Of course, having a covering index can influence the optimizer’s decision of course, as will having multiple levels of cardinality (multiple columns in the index). Bottom line: I *never* say there is a simple answer or a single answer to anything. Just take a look at my latest set of slides… on slide 2, I say “The answer to every question is… it depends.”

Cheers, and see you soon at the conference, Peter!

Peter Zaitsev

Author

18 years ago

Jay,

I’m not saying you’re saying that. I’m simply quoting the guy posted comment on this blog 🙂

I just thought that would be good way to start the post being nice teaser.

The point is not what you’re giving people simple answers to complex questions but what a lot of people are looking for simple answers because it makes them comfortable.

Bob Bankay

18 years ago

We have a large table (10M row) with about 12 indices. One of then indicates that state of the row in the table and there are about 30 states. The states change from 0 to 30 in about 14 days.

Would it be appropriate to create views for state with the highest population. Would it be faster than a lookup for the specific state wiht LIMIT n?

Peter Zaitsev

Author

18 years ago

Bob,

Please ask questions which are not related to the topic in our forums instead.

The answer to your question would be – creating view is not going to help as view does not carry any of its own indexes or other special structures.

MySQL 5.7
Support

Compare Percona to Leading Database Solutions

Software
Downloads

Valkey Contribution

Product Documentation

Resource Hub

Why Percona for MongoDB?

Why Percona for PostgreSQL?

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

How simple answer are you looking for ?

Related Blog Articles

RECOMMENDED ARTICLES

Semantic Caching for LLM Apps: Reduce Costs by 40-80% and Speed up by 250x

Importance of Tuning Checkpoint in PostgreSQL

Databases, Data Lakes, And Encryption

MOST POPULAR ARTICLES

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL Performance Tuning: Maximizing Database Efficiency and Speed

The Ultimate Guide to Open Source Databases

MySQL 5.7 Support

Compare Percona to Leading Database Solutions

Software Downloads

Valkey Contribution

Product Documentation

Resource Hub

Why Percona for MongoDB?

Why Percona for PostgreSQL?

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

How simple answer are you looking for ?

About the Author

Share This Post!

Stay up to date with the Percona Blog

Related Blog Articles

RECOMMENDED ARTICLES

Semantic Caching for LLM Apps: Reduce Costs by 40-80% and Speed up by 250x

Importance of Tuning Checkpoint in PostgreSQL

Databases, Data Lakes, And Encryption

MOST POPULAR ARTICLES

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL Performance Tuning: Maximizing Database Efficiency and Speed

The Ultimate Guide to Open Source Databases

MySQL 5.7
Support

Software
Downloads