Another look at improving TPC-H-like queries – Q17

Another look at improving TPC-H-like queries – Q17


Summary: An alternate approach, offered in response to our original post, provides excellent improvements for smaller databases, but clustered indexes offer better performance as database size increases. (This posting is by Dave.)

Jay Pipes suggested an alternate approach to improving MySQL performance of Query 17 on a TPC-H-like database.

  1. Add the index (l_partkey, l_quantity) to the lineitem table.
  2. Re-write the query as:

I ran this suggestion on MySQL with Tokutek’s TokuDB storage engine (as before) against a 10G Scale Factor database. After waiting several hours for the EXPLAIN to complete, I punted and moved to a smaller, 1G Scale Factor database. This database nearly fits in memory, and allowed me to examine the query plan

Create the index:


Notice that although the planner sees the li_pkey_quan_idx, it does not use it.

Run the Query:

The resulting query time (66.73s) is an improvement on the default query (155.83s). Nonetheless, when you run this query against the larger, 10G Scale Factor (SF) database, it does not complete after several hours. I observed that the intermediate tables can be held in memory for SF = 1G, but are written to disk for SF = 10G. This is a well-known peril of intermediate tables.

Undeterred, I tried forcing the query planner to make use of the li_pkey_quan_idx. I found that telling the planner to use the index in both the 3rd and 4th line of plan, I could get substantial speed-ups:



Now we’re talking. 12.38s vs. 155.83s (12x improvement). I’m sure this is closer to what Jay was hoping for. Emboldened, I ran the modified query against the SF = 10G database:

So Jay’s ideas offer a solution that does not involve clustering keys and produces a result in a manageable timeframe. However, relative to clustering keys, it is still 4028 / 101 = 40x slower.


Share this post

Comments (4)

  • Roland Bouman Reply

    Hi ppl,

    “An alternate approach, offered in response to our original post,”

    Do you mean this:


    June 24, 2009 at 6:19 am
  • Dave Reply


    Yes – that’s what I meant.


    June 24, 2009 at 12:52 pm
  • Jay Pipes Reply

    Hi Dave!

    Great post! Yes, that’s around the speedup I expected. It’s a shame that you have to nudge the optimizer into using the compound index, but I’m glad you persevered and added the index hint 🙂

    Cheers, and thanks for the post!


    June 24, 2009 at 4:09 pm
  • Jimi Sanchez Reply

    I know this is an old blog post, just letting you know there is an ‘AND’ missing in the first query.

    I really enjoy your blog posts btw!

    November 16, 2015 at 2:15 pm

Leave a Reply