I’ve got an interesting comment the other day saying “I’ve heard Jay Pipes saying indexes with cardinality lower than 30% are worthless, true?”
That is interesting question and it has different answers depending on who is asking. A lot of people want to hear simple answers to the questions because they would be overwhelmed by complete answer, it is like children being taught math first taught simple division skills like you can divide 10 by 2 but not by 3. As we grow up we learn the truth is much more complex than that and if you Major in math you will find there is a lot of stuff out where far beyond basic stuff we use in every day life.
Answering such question you can say yes it is true and for clueless person it will be helpful answer because it will allow him to avoid creation indexes on the column containing just one value which is good idea.
If you start to explain it in details, a lot of variables add in – covering indexes, skewed distribution vs lookup frequency, IO bound or CPU bound workload which may be far too complicated.
It is also interesting the broad answer would not be helpful – there are workloads when even with 1% matching rows you’re better of doing full table scans, and there are ones with 99% same values in the column having it indexed helps.
The other questions which people like to get quick answers for are for example “Which MySQL Storage Engine to Use” or “How Much Memory Compared to Your Database Size Should you Have” , “What Key Cache Hit Ratio is Good One”
This often gives me head aches – I like correctness in the answers but it is often too much to deal with, so I’m trying to keep the answers tailored towards customer needs omitting details which should not be important and making sure all relevant stuff is covered.
This is one of the reasons of hiring MySQL Expert – you typically can find same information in the books, Internet, and even in this blog but selecting what is relevant for your case might be complicated.