Announcement Module
No announcement yet.

yet another query performance issue (maybe related to subselect?)

Page Title Module
Move Remove Collapse
Conversation Detail Module
  • Filter
  • Time
  • Show
Clear All
new posts

  • yet another query performance issue (maybe related to subselect?)

    Hello everyone, I'm having a lot of problem with this query that runs much too slow.

    The table contains about 120 000 records and the query is returning 150 results.
    Removing the distinct does not really help increase the speed and without it the query returns 1500 results.

    The query takes about 10 seconds to execute.
    Strange thing is that increasing the available ram (the mysql server is running on a vmware server) from 1Gb to 3Gb the query was even more slow (taking about 25 seconds).

    I have tried many different index without good results.

    At this point I'm not sure if the problem is with the query or the MySQL server configuration. Mysql version is 4.1.20.

    Any help is greatly appreciated,

    ------------------------------------------------------------ ----
    select distinct q1.timestamp as timestamp,
    q1.agent_channel as agent_channel,
    q1.event as event, q2.event_parameter1 as agent
    from queue_logs q1, queue_logs q2
    where (q1.timestamp >= '1184413563' and
    q1.timestamp < '1184499963' and
    q2.timestamp >= '1184413563' and
    q2.timestamp < '1184499963') and
    (q1.event = 'PAUSE' or
    q1.event = 'UNPAUSE' or
    q1.event = 'AGENTLOGOFF') and
    q1.agent_channel = q2.agent_channel and
    q2.event = 'AGENTLOGIN' and
    q2.timestamp =

    select max(q3.timestamp)
    from queue_logs q3
    where q3.timestamp >= '1184413563' and
    q3.timestamp < '1184499963' and
    q3.timestamp < q1.timestamp and
    q3.event = 'AGENTLOGIN' and
    q3.agent_channel = q1.agent_channel
    or (q1.event = 'AGENTLOGIN' and =

    ------------------------------------------------------------ ----

    id 1
    select_type PRIMARY
    table q1
    type range
    possible_keys PRIMARY,ql_t_index,ql_eat_index,ql_e_index,ql_ea_i ndex
    key ql_t_index
    key_len 4
    ref NULL
    rows 1996
    Extra Using where; Using temporary

    id 1
    select_type PRIMARY
    table q2
    type range
    possible_keys PRIMARY,ql_t_index,ql_eat_index,ql_e_index,ql_ea_i ndex ql_t_index
    key ql_t_index
    key_len 4
    ref NULL
    rows 1996
    Extra Using where

    id 2
    select_type DEPENDENT SUBQUERY
    table q3
    type range
    possible_keys ql_t_index,ql_eat_index,ql_e_index,ql_ea_index ql_e_index
    key ql_e_index
    key_len 56
    ref NULL
    rows 810
    Extra Using where

    ------------------------------------------------------------ ----------------------------------------------

    CREATE TABLE `queue_logs` (
    `id` int(11) NOT NULL auto_increment,
    `timestamp` int(10) unsigned NOT NULL default '0',
    `call_identifier` tinytext,
    `queue_name` tinytext,
    `agent_channel` tinytext,
    `event` text NOT NULL,
    `event_parameter1` tinytext,
    `event_parameter2` tinytext,
    `event_parameter3` tinytext,
    PRIMARY KEY (`id`),
    KEY `ql_t_index` (`timestamp`),
    KEY `ql_eat_index` (`event`(18),`agent_channel`(12),`timestamp`),
    KEY `ql_e_index` (`event`(18)),
    KEY `ql_ea_index` (`event`(18),`agent_channel`(12))

  • #2
    Your problem is definitely the query.

    I'm not exactly sure what your query are returning but whenever you have a query that uses DISTINCT and it returns 10 times as many rows when you remove it it is not good and can usually be written in a much better way.

    Right now it looks like you are performing a cross join between q1 and q2 since you only have the two timestamp scopes as conditions.

    My suggestions is that:
    Rewrite the query using the better JOIN syntax:

    ...FROM tableA[INNER|LEFT] -- Whichever you need JOIN tableB

    Because it makes it so much easier to read.

    Rethink what your purpose with this query is and rewrite it.
    Because I don't think that you will be able to get any decent speed out of this one.


    • #3
      Thanks for the hint,

      I'll see how if can rewrite the query using joins and go from there. The query is pretty easy to understand, just hard to get it done with my actual sql knowledge )

      the idea behind this was that the table contains that kind of data :

      id","timestamp","agent_channel","event","event_par ameter1 "

      and I'd like the query to replace all NULL event_parameter1 values with the event_parameter1 value of the first AGENTLOGIN event for each_channel found :

      "id","timestamp","agent_channel","event","event_pa rameter1 "



      • #4
        OK here's a my suggestion:

        I changed a bit in your table layout and indexes (but it should be changed a bit more, read below):

        CREATE TABLE `queue_logs` (`id` int(11) NOT NULL auto_increment,`timestamp` int(10) unsigned NOT NULL default '0',`call_identifier` tinytext,`queue_name` tinytext,`agent_channel` VARCHAR(20),`event` VARCHAR(20) NOT NULL,`event_parameter1` tinytext,`event_parameter2` tinytext,`event_parameter3` tinytext,PRIMARY KEY (`id`),KEY `ql_ix_timestamp` (`timestamp`),KEY `ql_ix_ev_ac_ts_id` (`event`,`agent_channel`,`timestamp`, id),KEY `ql_e_index` (id, `event`)) ENGINE=MyISAM;

        Reason is that "text" columns and BLOB columns are treated differently internally compared to CHAR or VARCHAR columns.

        Then my suggestion for your query:

        SELECT q1.timestamp as timestamp, q1.agent_channel as agent_channel, q1.event as event, IF( q1.event_parameter1 IS NULL OR q1.event_parameter1 = '', ( SELECT q2.event_parameter1 FROM queue_logs q2 WHERE = ( SELECT MAX( FROM queue_logs q3 WHERE q3.agent_channel = q1.agent_channel AND q3.event = 'AGENTLOGIN' AND q3.timestamp < q1.timestamp ) ), q1.event_parameter1 ) AS event_parameter1FROM queue_logs q1WHERE q1.timestamp >= 1195650303 AND q1.timestamp < 1195657500ORDER BY `timestamp`

        This query should solve your speed problem.

        BTW 1: You should avoid quotes around INT values like the timestamp column. This to avoid unnecessary implicit type conversionts. A numeric value with quotes around it is actually a string.

        BTW 2: you should avoid using column names like timestamp since it is a reserved word in mysql.


        • #5

          thank you very much for the suggestions, that is very much appreciated. I'll try that ASAP. I was able to come with a more elegant and faster query using joins (with some help) but yours might be even better. The one I have right now could still be faster when querying for data for date intervals greater than 2 weeks.

          As for having a column named timestamp, I know this is a bad move, I'll get around to changing that when the application is ready, I just couldnt think of a better named for that yet

          Thank you very much,