October 31, 2014

Slow Query Log analyzes tools

(There is an updated version of this post here). MySQL has simple but quite handy feature – slow query log, which allows you to log all queries which took over define number of seconds to execute. There is also an option to enable logging queries which do not use indexes even if they take less time (–log-queries-not-using-indexes)

Slow query log is great to spot really slow queries which are often good candidates for optimization but it has few serious problems which limits extent to which it is helpful. First – it only allows you to set slow query time in seconds, having 1 second minimum value. For most of interactive applications this is way too large – if you’re developing Web application you probably want whole page to be generated less in 1 second, which issues many queries during generation. Second – if you enable option to log queries which do not use indexes it well can be flooded with fast and efficient queries, which just happen to do full table scans – for example if you would be having drop down list of states in your application and use SELECT * FROM STATES for that it would trigger and log the query.

Taking other Approach

For our clients we often need to find a queries which impact application the most. It does not always have to be slowest queries – query taking 10ms and run 1.000 times per second puts more load on server than 10 seconds query running once per second. We of course want to get rid of really slow queries but to really optimize application throughput queries which generate most of the load need to be investigated

Patching Slow Query Logging – First thing we did is created a Patch which allows you to specify slow query time in microseconds rather than seconds and allows you to log all queries in slow query log by setting long_query_time=0 This patch is adapted version of patch by Georg Richter which was made to run with recent MySQL version. Now why do not we use general log instead ? Unfortunately general logs queries before queries are executed (and even parsed) so it can’t contain query execution information such as execution and lock times and number of rows examined.

After this patch is applied your slow query log starts to look like this:

Filtering Slow Query Log – Especially after the changes to log all queries slow query log may be growing too rapidly to follow, so we implemented slow query log filter (based on parse_mysql_slow_log by Nathanial Hendler) which allows you to filter out only queries which took more than certain amount of time or examined more than certain amount of rows. This is great as allows multiple passes across same slow query log first to fix worse queries and then come to find more optimization candidates. So “tail -f mysql-slow.log | mysql_slow_log_filter -T 0.5 -R 1000” will look at queries as they come and will print out queries taking more than 0.5 seconds to execute or having more than 1000 rows examined.

Aggregating slow query log As I already mentioned besides finding slowest queries it is important to find queries which cause largest load on the server, which is with certain level of accuracy queries which take most time to execute combined. There is a tool mysqldumpslow in MySQL distribution which kind of does the thing – unfortunately being run on slow query log it does not give us information we’re looking for because only slow queries will be looked at. The other problem with this tool is – it replaces all real values with “N”, “S” etc placeholders, which means you can’t simply copy-paste query to run EXPLAIN for it. Using this tool normally require you to keep the other window open and find query sample with real constants which matches query with placeholders to work with it.

So we came op with slow query log parser tool which works with adjusted slow query log format and which gives samples of queries after aggregation. Here is how its output looks like:

As you can see it also prints minimum and maximum execution times so you will be able to see if only in certain cases query takes long time to execute, for example if plan is different based on constants.

How to use this tool set ?

First be aware this patch to MySQL is not official and should be used with caution. We think it is pretty safe but it surely did not get as much battle testing as rest of MySQL Server. Good thing is – you do not have to run patched version all the time. You can just start it for a few hours to generate you query log and get back to unpatched version.

It is best if you generate this log for all your queries with long_query_time=0 so if serious portion of you load comes from very simple queries you would not lose this kind of info. Yes this will reduce your performance a bit and will require plenty of disk space which is another reason you might not wish to run it in this mode all the time. Happily you can change long_query_time without restarting server so it is easy to get sample of all queries for some period of time and then get back to logging only very slow queries.

Once you have created full log – parse it and check queries using EXPLAIN starting from most impacted onces. After you’ve implemented changes – repeat. Changes may help to one queries but hurt others, for example adding indexes often help SELECT queries but slow down INSERT/UPDATE ones.

Final Note: You do not have to have patched MySQL for these utilities to work. they are designed to handle standard slow query log format as well.

Update 2009-03-04: most of us at Percona now use mk-query-digest from Maatkit for slow query log analysis.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Dekorasyon says:

    Tahnk you eleman

  2. Dekorasyon says:

    Hi,

    I did not test with 5.0.18 – but you can try.

  3. Another way to analyse your slow queries is to generate statistics which indexes are used, just like mysql_explain_log from the standard distribution does for the general query log.

    You can get mysql_explain_slow_log on willamowius.de.

  4. Rene Leonhardt says:

    Dear Peter,

    Are you going to port the patch to MySQL 5.0.26?
    That would be really great!

    Keep up the good work,
    René

  5. Vadim says:
  6. Vadim says:
  7. BQ says:

    Firstly, thanks for your great work!

    I used slow query log parser to analyze a slow-query-log yesterday. I seems that all of the result looking good, but I see a very-long query at last. Maybe the script has some bugs on numberic-formating? Could you help me to fix it?

    OUTPUT:

    ### 2846 Queries
    ### Total time: 18446744073826.7, Average time: 6481638817.22652
    ### Taking 0.020004 to 18446744073709.074210.000159 seconds to complete
    ### Rows analyzed 0 – 762

  8. Vadim says:

    Yes, it looks like bug with numbers handling.
    Could you send us part of slow-log which cause such problem ?

  9. Misha Dar says:

    Peter, Vadim

    I’m using MySQL 5.0.18. It’s not quite clear to me – does your patch fits that version.
    Also, can You place somewhere/how more detailed instruction how to apply the patch.

    Thanks,
    M.D.

  10. Vadim says:

    Hi,

    I did not test with 5.0.18 – but you can try.
    You need to place patch into source directory and execute
    patch -p1

  11. Rene Leonhardt says:

    Dear Vadim,

    Thank you very much for the actualised patch file!
    Do you have already one for MySQL 5.0.27?
    It should not be very different, because only one bug has been fixed.
    http://dev.mysql.com/downloads/mysql/5.0.html

    Best regards,
    René

  12. Vadim says:

    Rene,

    I think patch for 5.0.26 should work with 5.0.27 without any problem. I did not test though.
    Will check this week or so.

  13. camka says:

    For Your information, I’ve resently written the similar tool for regular query log to fetch the top of most popular queries.
    You can check it out at http://sourceforge.net/projects/myprofi or at http://myprofi.sourceforge.net for more details.

    This tool wouldn’t give you the execution time based statistics, but outputs an overview of the most frequently run queries, so you could speedup the the overall db performance by optimizing the most popular queries.

    In addition I would like to mention, that Your parser does not take into account all possible string escaping ways, like doubeling quotes, escaping with slash etc. Also would be more useful to group structures like “IN (N,N,N,N,N)” into “IN (N)”.

  14. peter says:

    Thank you Camka,

    It would be great if you make your tool to support slow query log including our format with execution time and show number of examined rows and sent rows.

    Query execution time matters. Query executed most frequently may not be bottleneck.

  15. Rene L. says:

    Dear Peter,

    I reimplemented the mysql_slow_log_parser in PHP in order to provide an even more filtered result.
    If you want me to send it, just comment this entry, because there is no attachment function here.

    # Apply filter to log file once
    php mysql_slow_log_parser.php mysql-slow-queries.log

    Options:
    -T=min_query_time Include only queries which took as long as min_query_time seconds or longer [default: 1]
    -R=min_rows_examined Include only queries which examined min_rows_examined rows or more

    -iu=include_user Include only queries which contain include_user in the user field [multiple]
    -eu=exclude_user Exclude all queries which contain exclude_user in the user field [multiple]
    -iq=include_query Include only queries which contain the string include_query (i.e. database or table name) [multiple]

    –filter-duplicates Output only unique query strings with additional statistics: max_query_time, max_rows_examined, execution count [default sorting: max_query_time, max_rows_examined]

    [multiple] options can be passed more than once to set multiple values.

    Best regards,
    René

  16. peter says:

    Rene,

    You can publish it somewhere and post a link here. If you have a trouble finding place you can send it to us and we can publish it but it is worse as we’ll need to take care of it if you decide to update it etc.

  17. camka says:

    > 16. peter
    Hei, peter,

    I have finally succeeded with impementing the support for sorting by max/average/total execution time of slow queries into MyProfi. But unfortunately i have no chance to try it on patched mysql version, so i tested it just with regular slow query log. You could try the latest version of MyProfi by taking it from project’s download page. Hope it can be useful tool for measuring the database performance on query level.

  18. Rene L. says:

    Hello Peter,

    I created a Google Code project for the MySQL Slow Query Log filter:
    http://code.google.com/p/mysql-log-filter/

    You can access the initial PHP5 script here (requires the PHP extension BCMath):
    http://mysql-log-filter.googlecode.com/svn/trunk/mysql_filter_slow_log.php

    I am planning to provide a Python version, too.

    Please submit any problems or feature requests to the issue tracker on the project page.

  19. Rene L. says:

    Dear Peter,

    I finished the Python version, which is usually 3-5 times faster than the PHP5 version according to my personal tests, depending on the log size.

    Usage Examples:

    # Filter slow queries executed from other users than root for at least 3 seconds, remove duplicates and save result to file
    php mysql_filter_slow_log.php -T=3 -eu=root –no-duplicates mysql-slow-queries.log

    # Start permanent filtering of all slow queries from now on: at least 3 seconds or examining 10000 rows, exclude users root and test
    tail -f -n 0 linux-slow.log | python mysql_filter_slow_log.py -T=3 -R=10000 -eu=root -eu=test &
    # (-n 0 outputs only lines generated after start of tail)

    # Stop permanent filtering
    kill ps auxww | grep 'tail -f -n 0 linux-slow.log' | egrep -v grep | awk '{print 2}'

  20. Rene L. says:

    Unfortunately the comments here do not translate the “greater than” and “less than” symbols on submit.

    Therefore I will use { and } as a replacement for the example above:
    php mysql_filter_slow_log.php -T=3 -eu=root –no-duplicates { linux-slow.log } mysql-slow-queries.log

  21. peter says:

    Thank you Rene,

    I do not think log parser performance is that critical but I expect some people will like python parser more, especially if someone is looking to do some hacking with it.

  22. Nilnandan says:

    Hello,

    I have set long_query_time=5 in my.cnf in mysyl server.
    but although i got those queries in slow-query-log which has query_time=0 second.
    means i got that query which has taken execution time is below 5 seconds.
    How can i solve this issue? Pls help me…its urgent.

    Thanking you,
    Nilnandan Joshi
    DBA
    INDIA

  23. rs says:

    -s=WORD
    what to sort by (t, at, l, al, r, ar etc)

    what’s difference between these option above?

  24. Daniel says:

    Is it possible to trace every SQL statement executed inside a procedure that is called by a client:

    CREATE PROCEDURE foo()
    NOT DETERMINISTIC
    MODIFIES SQL DATA
    COMMENT ‘Generate the required number of random battles’
    BEGIN

    END;

    The problem is that the mysql-slow.log file only contains a trace of the call to the procedure; it does not provide any trace of every subsequent statement executed by the procedure itself:

    # Time: 070607 1:23:04
    # User@Host: dbo[dbo] @ localhost []
    # Query_time: 0 Lock_time: 0 Rows_sent: 0 Rows_examined: 2 SET
    last_insert_id=27274; CALL foo();

    Is there any way to configure mysqld to trace every statement executed by MySQL?

  25. Andrew says:

    Unless I’m missing something, it seems that there is a bug in all of these patches. I’m pretty sure the line in my_time.cc that reads:
    newtime/= (frequency * 1000000);
    instead should be:
    newtime/= (frequency / 1000000);

    Please confirm!

  26. Andrew Schwartz says:

    Another issue:
    In the patch at:
    http://www.mysqlperformanceblog.com/files/patches/patch.slow-micro.5.0.41.diff

    line 2192 should read:
    if ((ulong) (thd->start_timer – thd->timer_after_lock) >

    not:
    if ((ulong) (thd->start_time – thd->time_after_lock) >

  27. Thomas Bühren says:

    Did you think of using a mySQL Proxy script for creating the log file instead of a patch for the mySQL server?

    This would make it much easier to use:
    1. No need to compile the mySQL server.
    2. Logging (use of mySQL Proxy) can be switched on and off by redirecting the port with iptables: http://forge.mysql.com/snippets/view.php?id=82

    The mySQL Proxy script could write a logfile in the slow-log format so that your tool can use it.

  28. peter says:

    MySQL Proxy is good but it can’t give you some information – for example number of rows examined which is quite hanly to have.

    Plus for serious web sites I think minor patch is less intrusive then putting Proxy infront to handle your queries which has unknown overhead and potential side effects.

  29. Mads says:

    Hello,

    Just tried applying the latest patch
    http://www.mysqlperformanceblog.com/files/patches/patch.slow-micro.5.0.41.diff
    to the appropriate sources
    http://downloads.mysql.com/archives.php?p=mysql-5.0&v=5.0.41
    and got
    Error C2143: syntax error : missing ‘)’ before ‘;’ @ c:\mysql-5.0.41-slowmicro\sql-common\my_time.c line 1244
    The fix is – suprisingly :) – to add a ‘)’ before ‘;’ at line 1244 @ my_time.c line 1244, so it is not really a big issue. Still it would be nice if you could update the patch

    Thanks,

    Mads

  30. Vadim says:

    Andrew,

    Thank you for fixes, I applied them in patches!

  31. Vadim says:

    Mads,

    Thanks, I fixed that in the patches.

  32. raptor says:

    hi,

    # User@Host: root[root] @ localhost []
    what does this — ‘[root]‘ signify in the above statement in slow query log?
    Is root@localhost is not enough?

    Thanks and Regards,
    Raptor.

  33. Code4Gold says:

    Thank you so very much for this wonderful article. I have been having an incedible amount of trouble with searches on my SMF forum timing out in Apache because mysql was taking too long to run the search query. After reading this article, I was able to find my problem was that the search query was creating a temp table and populating it with topic ID’s and then building the results off that. After analyzing the long query log, I was able to write a more much more efficient SQL statement to get the job done.

    Again, thank you, thank you, thank you for posting such advanced and useful information.

  34. huma says:

    hi,
    Would you please help me to apply patch.slow-micro.5.0.45.diff?
    Shall I change mysql path in the patch?
    my mysql is here : /usr/local/mysql-5.0.45-linux-i686

    tanx

  35. peter says:

    huma,

    The patch is for source you need to apply it to MySQL 5.0.45 sources and recompile.

  36. Dan says:

    This is a much-needed feature – thanks for sharing it.

    Just so that I understand completely, the patch makes mySQL interpret
    the long_query_time specified to be in microseconds rather than in
    seconds? So if I want to specify 1ms it would be 1000, and not 0.001?

    Thanks
    Dan

  37. peter says:

    Right. It also allows it to be 0 to log all queries with their times.

  38. iddaa says:

    Hello,
    What should be the long_query_time for best performance?
    Regards

  39. GBA says:

    Here’s how i used this patch to benchmark our production server:

    First, I enabled the general query log in my production server, and collected some data. I let it run for 60 hours (2 1/2 days), which captured peak times, and all typical usage.
    I used a trick to turn off the general query log without restarting mysql – i moved the log file aside, touched a replacement, but didn’t give the mysql user write permision to it. I then used mysqladmin to flush all the logs, and when it couldn’t open a handle to the new general query log file, it simply stopped logging.

    Now, get an identical piece of hardware as your production server – preferably built by clonig the HDD so you have identical OS and filesystem layout.
    Recompile the same version of mysql, with this patch applied (we compile mysql from source ourselves anyway, so this is still a fair performance test). Get the database on it current up to the point you turned on the general query log.
    To do this, i used a backup from a few weeks earlier, and used mysqlbinlog to extract the rest of the data from the binary logs, up until the start point. I could actually see the last line in the catchup sql matching one of the first in the general query log, so rewinded one second, and knew i was spot on.
    You now have an identical server, with data in a state ready to be benchmarked.
    It’s worth taking a database dump here, in case you want to run the benchmark again.

    Now pump your query log acquired earlier through the below perl script. It’s very quick, very dirty, and probably very buggy, but seemed to work acceptably well in our case.
    It rebuilds the timing, connections, and queries the server saw, based on the contents of query log – it creates a seperate thread for each connection, and runs each query at (about) the right time, relative to the start of the run.
    That means if you have 60 hours worth of dump data, the script will run for 60 hours replaying it. This was important for us, because we have short bursts of very high load, long periods of quite low load, and even the odd moment of practically zero load. This recreates all of them, just like they were in real life.
    It’s worth running this script on a seperate box with lot’s of cores – it’s very thread-hungry.

    The script (replay_query_log.pl):

    #!perl

    use Time::Local;
    use DBI();

    # Setup who your DB host host, and a hash of copnnection passwords.
    # You need one password entry for each user that connects
    # in the logs you want to replay – you can get by grepping
    # for ‘Connect’, and using cut/sed/sort/uniq liberally.
    #
    my $DBHOST=”slush”;
    my %passwords = ( “user” => “pass”,
    “reporting” => “reporting”,
    “user-no-pw” => “”,
    “etc” => “etc”
    );

    # skip the first 3 lines
    ();
    ();
    ();

    # Process the query log, and build a couple of hashes from it:
    #
    # Hash 1: %queries – keyed by connection id and datetime,
    # – is an array of all queries by that connection at that time.
    # Hash 2: %starttime – keyed by connection id
    # – contains the time the first query run by this conenction started
    #
    # The first hash is used to actually runt he queries, the second is used to
    # avoid fork bombing the box, by only forking the processes when needed.

    while ()
    {
    # datetime line
    if(s/^(0[0-9])([01][0-9])([0-3][0-9])\s+(1?[0-9]):([0-5][0-9]):([0-5][0-9])//)
    {
    # Convert to unix epoch – makes date arithmetic dead easy.
    $datetime = Time::Local::timelocal($6, $5, $4, $3, $2-1, “20”.$1);
    $dt_min = $datetime if (($datetime < $dt_min) || !(defined $dt_min));
    }
    # query type line
    if(s/^\s+([0-9]+) ((Query)|(Quit)|(Connect)|(Statistics)|(Binlog Dump Log))//)
    {
    # If we enter this loop, it means a new query/type has started.
    # Save the old one (if it exists), and reinitialise the vars
    # for the next
    push @{$queries{$connection}{$datetime}}, $query if length($query);
    $connection = $1;
    $starttime{$connection} = $datetime if (($datetime $_)
    {
    sleep $now – $_;
    }
    # Now that the threads are due to start, we fork, and connect
    foreach (@{$startconn{$_}})
    {
    my $conn = $_;
    my $DBUSER = $dbun{$conn}[0];
    my $DBNAME = $dbun{$conn}[1];
    my $DBPASS = $passwords{$DBUSER};
    my $pid = fork();
    if($pid == 0)
    {
    ###print “forked\n”;
    my $dbh;
    # Each connection has multiple queries, so we
    # iterate the hash, waiting for the correct time
    # to run each one.
    # I use while(){ sleep 1 } rather than sleep n to
    # avoid timing issues – this garauntees the query will
    # run within +/- 0.5s of when it’s due, which is good
    # enough for our purposes (usually).
    foreach (sort keys %{$queries{$conn}})
    {
    my $dt = $_;
    while((time() – $offset) connect(“DBI:mysql:database=$DBNAME;host=$DBHOST”,
    “$DBUSER”, “$DBPASS”)
    || die “Couldn’t create dbh”;
    }
    foreach (@{$queries{$conn}{$_}})
    {
    my $sth = $dbh->prepare($_);
    $sth->execute();
    $sth->finish();
    }
    }
    $dbh->disconnect();
    ###print “died\n”;
    exit; # very important.
    } else
    {
    # Don’t do this.
    #waitpid($pid,0);
    }
    }
    }

    # done!!!

    I got some quite interesting numbers our of it, and have learned a lot about how our database behaves.

    Now – this is not a well tested script – it’s probably rife with bugs, so use with caution. Any bugfixes, enhancements, or comments – please post them back here.

    Hopefully this will help some-one else out.

  40. Ilkka says:

    Hi,

    Is it possible to update the patch to 5.0.61? Thanks.

  41. peter says:

    There is patch for 5.0.62
    http://www.mysqlperformanceblog.com/mysql-patches/

    We also have patch in work for 5.0.67

    If you need patch for 5.0.61 or any other version we can do backport on consulting basics.

  42. ligtv izle says:

    Hi,

    Is it possible to update the patch to 5.0.61? Thanks

    ^^

  43. peter says:

    Please check the current releases and patches for them at:
    http://www.percona.com/percona-lab.html

    If you need patch for particular version we can do backport on consulting basics.

  44. Eric says:

    Hello,
    The function of mysql slow log filter is more attractive,so I just installed Fedora 9 and MySQL 5.1.29. Unfortunately, I am newbie in Linux and Perl. Although I have installed Perl5.10 by yum install,when I download the mysql_slow_log_filter file and issue the command “tail -f mysql-slow.log | mysql_slow_log_filter -T 0.5 -R 1000″, the result is always “bash: mysql_slow_log_filter: command not found”. How can I install it? Thanks.

  45. Vadim says:

    Eric,

    you may want to make mysql_slow_log_filter executable:

    chmod a+x mysql_slow_log_filter

  46. Thanks for the slow query log parser tool,
    I have easily found the expensive queries and created index accordingly, nice and easy!
    I did not patch the server anyway(production)

    The problem I have is to create an index on a 7gb InnoDB table , single .ibd file, that locks the table WRITE for
    a hour that is not applicable for a production server, is there any trick?
    I have an idea that can change the lock from WRITE to READ.

    In the lowest load window (2:00-4:00 am)

    [1] Lock the table READ to prevent updates and make the new index consistent
    [2] create like a new empty table and add the new index
    [3] select into the new table from the big table
    [4] lock write tables
    [5] swap tables

    do you think is a good strategy?

    Thank You!

    Claudio Nanni

  47. Hello,

    I try this but ı wonder.

    What should be the long_query_time for best performance?

    Thanks

  48. peter says:

    Hi,

    We set it to Zero but only enable logging when we need to analyze the data. If you’re setting it to different number you may be missing some very frequent fast queries which may be often cause of large portion of system load.

  49. Ragor says:

    I dont understand that. So sometimes I can reach any data for that codes. We can argue this with together.

  50. hello

    it s possıble 5.0.65 ty

  51. Dekorasyon says:

    Hi,

    I did not test with 5.0.18 – but you can try.

  52. Dekorasyon says:

    Tahnk you eleman

  53. jayaram says:

    # Query_time: 18446744073709.550781 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0

    What is the query_time actually, Is this in sec.

  54. Yes,
    given that you have a query that ran for 584942 years :)
    It must be some sort of overflow, a bug anyway.

    Claudio

  55. jayaram says:

    Thank you very much !
    But how can I found the actual slow queries, In my slow.log file all Query_time values are like above only.

  56. Diego says:

    jayaram: Is your time set correctly? Have you changed since mysql is running?

  57. @jayaram something is definitely wrong. More info is needed, you can also post this topic to the mysql mailing list: mysql@lists.mysql.com if you want.

  58. That’s a bug in an old version of Percona Server. The fix is to upgrade.

  59. Sandeep says:

    I am using AWS RDS, have set long_query_time to integer 5000.

    Table slow_log has all queries with query_time having values less than 5 seconds.
    Due to this slow_log table is growing like hell.

    Please help ….

  60. jayaram says:

    Which version of MySQL-Server is best and more efficient for Master-Master and Master-Slave Replication.
    Present I’m using 5.0.77-log.

  61. Bill Gates says:

    Is there anyway to display entries from the slow log in a particular date and time range? I have tried using grep, but have been unsuccessful.

  62. mikeytown2 says:

    Latest Slow Query Analyzer: http://www.percona.com/doc/percona-toolkit/2.2/pt-query-digest.html

    wget percona.com/get/pt-query-digest
    chmod +x pt-query-digest

Speak Your Mind

*