Introduction to Troubleshooting Performance – Troubleshooting Slow Queries webinar: Q & A

PREVIOUS POST
NEXT POST

Troubleshooting Slow QueriesIn this blog, I will provide answers to the Q & A for the Troubleshooting Slow Queries webinar.

First, I want to thank you for attending the April 28 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: I’ve heard that is a bad idea to use select *; what do you recommend?

A: When I used SELECT * in my slides, I wanted to underline the idea that sometimes you need to select all columns from the table. There is nothing bad about it if you need them. SELECT * is bad when you need only a few columns from the table. In this case, you retrieve more data than needed, which affects performance. Another issue that   SELECT * can cause is if you hard-code the statement into your application, then change table definition; the application could start retrieving columns in wrong order and output (e.g., email instead of billing address). Or even worse, it will try to access a non-existent index in the result set array. The best practice is to explicitly enumerate all columns that your application needs.

Q: I heard that using  index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as indexing, what happens?

A: I assume you are asking about the ability to create an index with lengths smaller than the column length? They work as follows:

Assume you have a  TEXT  field which contains these user questions:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using  index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as indexing, what happens?
  3. ….

Since this is a  TEXT  field you cannot create and index on it without specifying its length, so you need to make the index as minimal as possible to uniquely identify questions. If you create an index with length 10 it will contain:

  1. I’ve heard
  2. I heard th

You will index only those parts of questions that are not very distinct from each other, and do not contain useful information about what the question is. You can create index of length 255:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as index

In this case, the index includes the whole first question and almost all the second question. This makes the index too large and requires us to use more disk space (which causes more IO). Also, information from the second question is probably too much.

If make index of length 75, we will have:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle du

This is more than enough for the first question and gives a good idea of what is in the second question. It also potentially will have enough unique entries to make its cardinality look more like the cardinality of real data distribution.

To conclude: choosing the correct index length is something that requires practice and analysis of your actual data. Try to make them as short as possible, but long enough so that the number of unique entries in the index will be similar to a number of unique entries in the table.

Q: Which view can we query to see stats?

A: Do you mean index statistics? SHOW INDEX FROM table_name will do it.

Q: We have an InnoDB table with 47 fields (mostly text); some are ft-indexed. I tried to do an alter table, and it ran for 24 hours. What is the best way to run an alter table to add one extra field? The table has 1.9 M rows and 47 columns with many indexes.

A: Adding a column requires a table copy. Therefore, the speed of this operation depends on the table size and speed of your disk. If you are using version 5.6 and later, adding a column would not block parallel queries (and therefore is not a big deal). If you are using an older version, you can always use the pt-online-schema-change utility from Percona Toolkit. However, it will run even more slowly than the regular ALTER TABLE. Unfortunately, you cannot speed up the execution of ALTER TABLE much. The only thing that you can do is to use a faster disk (with options, tuned to explore speed of the disk).

However, if you do not want to have this increased IO affect the production server, you can alter the table on the separate instance, then copy tablespace to production and then apply all changes to the original table from the binary logs. The steps will be something like:

  1. Ensure you use option innodb_file_per_table  and the big table has individual tablespace
  2. Ensure that binary log is enabled
  3. Start a new server (you can also use an existent stand-by slave).
  4. Disable writes to the table
  5. Record the binary log position
  6. Copy the tablespace to the new server as described here.
  7. Enable writes on the production server
  8. Run ALTER TABLE on the new server you created in step 2 (it will still take 24 hours)
  9. Stop writes to the table on the production server
  10. Copy the tablespace, altered in step 7
  11. Apply all writes to this table, which are in the binary logs after position, recorded in step 4.
  12. Enable writes to the table

This scenario will take even more time overall, but will have minimal impact on the production server

Q: If there is a compound index like index1(emp_id,date), will the following query be able to use index? “select * from table1 where emp_id = 10”

A: Yes. At least it should.

Q: Are  filesort and temporary in extended info for explain not good?

A: Regarding filesort: it depends. For example, you will always have the word filesort” for tables which perform  ORDER BY and cannot use an index for ORDER BY. This is not always bad. For example, in this query:

the primary key used to resolve rows and filesort were necessary and not avoidable. You can read about different filesort algorithms here.

Regarding Using temporary: this means what during query execution temporary table will be created. This is can be not good, especially if the temporary table is large and cannot fit into memory. In this case, it would be written to disk and slow down operations. But, again, sometimes creating temporary tables in not avoidable, for example, if you have both GROUP BY and ORDER BY clauses which list columns differently as stated in the user manual.

Q: Is key_len length more of a good thing for query execution?

A: key_len field is not NULL for all queries that use and index, and just shows the length of the key part used. It is not good or bad, it is just for information. You can use this information, for example, to identify which part of combined index is used to resolve the query.

Q: Does an alter query go for an optimizer check?

A: No. You can check it either by enabling optimizer trace, running ALTER and find what trace is empty. Or by enabling the debug option and searching the resulting trace for optimize.

Q: A query involves four columns that are all individually covered by an index. The optimizer didn’t merge indexes because of cost, and even didn’t choose the composite index I created.

A: This depends on the table definition and query you used. I cannot provide a more detailed answer based only on this information.

Q cont.: Finally, only certain composite indexes were suitable, the column order in the complex index mattered a lot. Why couldn’t the optimizer merge the four individual single column indexes, and why did the order of the columns in the composite index matter?

A: Regarding why the optimizer could not merge four indexes, I need to see how the table is defined and which data is in these indexed columns. Regarding why the order of the columns in the composite index matters, it depends on the query. Why the optimizer can use an index, say, on (col1, col2) where the conditions col1=X AND col2=Y and col2=Y AND col2=X for the case when you use OR, the order is important. For example, for the condition col1=X OR col2=Y, where the part col1=X is always executed and the part col2=Y  is executed only when col1=X is false. The same logic applies to queries like SELECT col1 WHERE col2=Y ORDER BY col3. See the user manual for details.

Q: When I try to obtain the optimizer trace on the console, the result is cut off. Even if I redirect the output to a file, how to overcome that?

A: Which version of MySQL Server do you use? The  TRACE column is defined as longtext NOT NULL, and should not cause such issues. If it does with a newer version, report a bug at http://bugs.mysql.com/.

Q: Are there any free graphical visualizers for either EXPLAIN or the optimizer TRACE output?

A: There is graphical visualizer for EXPLAIN in MySQL Workbench. But it works with online data only: you cannot run it on EXPLAIN output, saved into a file. I don’t know about any visualizer for the optimizer TRACE output. However, since it is JSON you can simply save it to file and open in web browser. It will allow a better view than if opened in simple text editor.

Q: When do you use force index instead of use index hints?

A: According to user manual “ USE INDEX (index_list) hint tells MySQL to use only one of the named indexes to find rows in the table” and “ FORCE INDEX  hint acts like USE INDEX (index_list), with the addition that a table scan is assumed to be very expensive . . . a table scan is used only if there is no way to use one of the named indexes to find rows in the table.” This means that when you use USE INDEX, you are giving a hint for the optimizer to prefer a particular index to others, but not enforcing index usage if the optimizer prefers a table scan, while FORCE INDEX requires using the index. I myself use only FORCE and IGNORE  index hints.

Q: Very informative session. I missed the beginning part. Are you going to distribute the recoded session later?

A: Yes. As usual slides and recording available here.

PREVIOUS POST
NEXT POST

Share this post

Comments (3)

  • Paul Reply

    I love these Q&A articles, they’re really informative and make great casual reading (to stay up to speed without a specific task in mind)

    I guess it takes a bit of time to format the Webinar chats into a nice article, so thankyou for spending time publishing them 🙂

    May 20, 2016 at 9:12 pm
  • Øystein Grøvlen Reply

    Q: When I try to obtain the optimizer trace on the console, the result is cut off. Even if I redirect the output to a file, how to overcome that?

    There is a configurable maximum size for the memory buffer used to record the trace. The default is pretty low, and you will often have to increase the size to capture the entire trace. If the trace is cut off, you should increase the setting of the variable optimizer_trace_max_mem_size. See http://oysteing.blogspot.no/2016/01/how-to-get-optimizer-trace-for-query.html for more details.

    May 21, 2016 at 12:44 am
  • Lynea Reply

    Are there any articles that you might recommend to someone new to MySQL tuning? My background is in T-SQL (Microsoft Development), and it works really by rules. MySQL almost seems passive-aggressive when it comes to choosing when to use an index and when to bypass it and do an “ALL” type, full-table scan. MySQL says, “I see you are asking to use this index on this join, but I prefer to do it this way.” Yes, there are situations where that will happen in MS SQL, but those are obvious things (e.g.: differing datatypes, etc.) MySQL has it’s own hidden rules, I’ve noticed. I’ve attempted to put together a comprehensive list of things that might cause this to happen, but it is far from complete.
    Thanks, Sveta!

    January 30, 2017 at 4:05 pm

Leave a Reply