October 25, 2014

How simple answer are you looking for ?

I’ve got an interesting comment the other day saying “I’ve heard Jay Pipes saying indexes with cardinality lower than 30% are worthless, true?”

That is interesting question and it has different answers depending on who is asking. A lot of people want to hear simple answers to the questions because they would be overwhelmed by complete answer, it is like children being taught math first taught simple division skills like you can divide 10 by 2 but not by 3. As we grow up we learn the truth is much more complex than that and if you Major in math you will find there is a lot of stuff out where far beyond basic stuff we use in every day life.

Answering such question you can say yes it is true and for clueless person it will be helpful answer because it will allow him to avoid creation indexes on the column containing just one value which is good idea.

If you start to explain it in details, a lot of variables add in – covering indexes, skewed distribution vs lookup frequency, IO bound or CPU bound workload which may be far too complicated.

It is also interesting the broad answer would not be helpful – there are workloads when even with 1% matching rows you’re better of doing full table scans, and there are ones with 99% same values in the column having it indexed helps.

The other questions which people like to get quick answers for are for example “Which MySQL Storage Engine to Use” or “How Much Memory Compared to Your Database Size Should you Have” , “What Key Cache Hit Ratio is Good One”

This often gives me head aches – I like correctness in the answers but it is often too much to deal with, so I’m trying to keep the answers tailored towards customer needs omitting details which should not be important and making sure all relevant stuff is covered.

This is one of the reasons of hiring MySQL Expert – you typically can find same information in the books, Internet, and even in this blog but selecting what is relevant for your case might be complicated.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Jay Pipes says:

    Uh, I would never say “indexes with cardinality lower than 30% are worthless”. ;) What I have said in the past is that if you have a situation where the value you are searching for is contained in 30% of an index’s records, it is highly unlikely the index will be used, as the efficiency of a sequential scan will outweigh the numerous bookmark lookup operations needed with the index seek operations. Of course, having a covering index can influence the optimizer’s decision of course, as will having multiple levels of cardinality (multiple columns in the index). Bottom line: I *never* say there is a simple answer or a single answer to anything. Just take a look at my latest set of slides… on slide 2, I say “The answer to every question is… it depends.”

    Cheers, and see you soon at the conference, Peter!

  2. peter says:

    Jay,

    I’m not saying you’re saying that. I’m simply quoting the guy posted comment on this blog :)

    I just thought that would be good way to start the post being nice teaser.

    The point is not what you’re giving people simple answers to complex questions but what a lot of people are looking for simple answers because it makes them comfortable.

  3. Bob Bankay says:

    We have a large table (10M row) with about 12 indices. One of then indicates that state of the row in the table and there are about 30 states. The states change from 0 to 30 in about 14 days.

    Would it be appropriate to create views for state with the highest population. Would it be faster than a lookup for the specific state wiht LIMIT n?

  4. peter says:

    Bob,

    Please ask questions which are not related to the topic in our forums instead.

    The answer to your question would be – creating view is not going to help as view does not carry any of its own indexes or other special structures.

Speak Your Mind

*