Announcement

Announcement Module
Collapse
No announcement yet.

JOIN & GROUP BY Optimization problem

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • JOIN & GROUP BY Optimization problem

    We have three tables(datacount, resourcetable, daterange) with one of them(count) having million+ rows in it. Resourcetable will have around 100 rows and daterange arnd 10.

    We want to get the sum(cnt) from count table grouped by values in resource and/or daterange;

    The structure is as follows :

    create table resourcetable(resourceid char(36), resourcename varchar(255)) engine=InnoDB;
    create table datacount( cnt integer, resourceid char(36), cntdate timestamp) engine=InnoDB;
    create table daterange(datemin date, datemax date) engine=InnoDB;


    We tried to optimize the performance by creating index on resourceid, cntdate and both of them but it doesn't work.

    All the queries are taking more than 30 seconds :

    1. select sum(cnt), datemin, datemax from datacount, resourcetable, daterange where date(cntdate) between datemin and datemax and resourcename like '% 1%' group by datemin, datemax;

    2. select sum(cnt), datemin, datemax, resourcetable.resourceid from datacount, resourcetable, daterange where date(cntdate) between datemin and datemax and resourcename like '% 1%'
    group by datemin, datemax, resourcetable.resourceid;

    3. select sum(cnt), resourcetable.resourceid from datacount, resourcetable, daterange where date(cntdate) between datemin and datemax and resourcename like '% 1%'
    group by resourcetable.resourceid;


    Scripts to insert data in tables :
    (You can change the rowfactor in the INSERTCOUNT procedure to insert more rows. As of now it will insert around 200K rows)

    insert into daterange values('2008-04-01','2008-04-05');
    insert into daterange values('2008-04-06','2008-04-10');
    insert into daterange values('2008-04-11','2008-04-15');
    insert into daterange values('2008-04-16','2008-04-20');
    insert into daterange values('2008-04-21','2008-04-25');
    insert into daterange values('2008-04-26','2008-04-30');

    delimiter //
    CREATE PROCEDURE INSERTRESOURCE()
    BEGIN
    DECLARE itr INT DEFAULT 10;

    WHILE itr < 99 DO
    INSERT INTO resourcetable VALUES(concat("00000000-0000-0000-0000-0000000000",itr),concat( "Resource ",itr) );
    SET itr = itr + 1;
    END WHILE;
    END;
    //


    CREATE PROCEDURE INSERTCOUNT()
    BEGIN
    DECLARE itr INT DEFAULT 10;
    DECLARE citr INT DEFAULT 0;
    DECLARE rowfactor INT DEFAULT 2000;
    DECLARE intvl INT DEFAULT 2592000/rowfactor;
    DECLARE stts TIMESTAMP DEFAULT '2008-04-01 00:00:00';

    SET AUTOCOMMIT=0;

    OL : WHILE citr < rowfactor DO
    SET itr = 10;
    IL: WHILE itr < 99 DO
    INSERT INTO datacount VALUES(3, concat("00000000-0000-0000-0000-0000000000",itr), stts );
    SET itr = itr + 1;
    END WHILE IL;
    COMMIT;
    SET citr = citr + 1;
    SET stts = TIMESTAMPADD(SECOND, intvl, stts);
    END WHILE OL;
    END;
    //

    delimiter ;

    call INSERTRESOURCE();

    call INSERTCOUNT();


    create index resource_index on datacount(resourceid(36));
    create index time_index on datacount(cntdate);



    Any help would be really appreciated.

  • #2
    Can you please post EXPLAIN for each of queries which are slow?

    Comment


    • #3
      Here are the results for the EXPLAIN stmt.

      mysql> explain select sum(cnt), datemin, datemax from datacount, resourcetable,
      daterange where date(cntdate) between datemin and datemax and resourcename like
      '% 1%'
      -> group by datemin, datemax;
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      | id | select_type | table | type | possible_keys | key | key_len | ref
      | rows | Extra |
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      | 1 | SIMPLE | daterange | ALL | NULL | NULL | NULL | NUL
      L | 6 | Using temporary; Using filesort |
      | 1 | SIMPLE | resourcetable | ALL | NULL | NULL | NULL | NUL
      L | 89 | Using where |
      | 1 | SIMPLE | datacount | ALL | NULL | NULL | NULL | NUL
      L | 178275 | Using where |
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      3 rows in set (0.70 sec)


      mysql> explain select sum(cnt), datemin, datemax, resourcetable.resourceid from
      datacount, resourcetable, daterange where date(cntdate) between datemin and date
      max and resourcename like '% 1%'
      -> group by datemin, datemax, resourcetable.resourceid;
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      | id | select_type | table | type | possible_keys | key | key_len | ref
      | rows | Extra |
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      | 1 | SIMPLE | daterange | ALL | NULL | NULL | NULL | NUL
      L | 6 | Using temporary; Using filesort |
      | 1 | SIMPLE | resourcetable | ALL | NULL | NULL | NULL | NUL
      L | 89 | Using where |
      | 1 | SIMPLE | datacount | ALL | NULL | NULL | NULL | NUL
      L | 178275 | Using where |
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      3 rows in set (0.00 sec)


      mysql> explain select sum(cnt), resourcetable.resourceid from datacount, resourc
      etable, daterange where date(cntdate) between datemin and datemax and resourcena
      me like '% 1%'
      -> group by resourcetable.resourceid;
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      | id | select_type | table | type | possible_keys | key | key_len | ref
      | rows | Extra |
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      | 1 | SIMPLE | daterange | ALL | NULL | NULL | NULL | NUL
      L | 6 | Using temporary; Using filesort |
      | 1 | SIMPLE | resourcetable | ALL | NULL | NULL | NULL | NUL
      L | 89 | Using where |
      | 1 | SIMPLE | datacount | ALL | NULL | NULL | NULL | NUL
      L | 178275 | Using where |
      +----+-------------+---------------+------+---------------+- -----+---------+----
      --+--------+---------------------------------+
      3 rows in set (0.00 sec)

      Comment

      Working...
      X