GET 24/7 LIVE HELP NOW

Announcement

Announcement Module
Collapse
No announcement yet.

Query optimisation help please

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Query optimisation help please

    Hi,
    I have a slow query that I could really do with some help optimising please.

    It involves two tables:

    CREATE TABLE IF NOT EXISTS `data` (
    `id` bigint(cool: NOT NULL auto_increment,
    `brand_id` bigint(cool: NOT NULL default '0',
    `pub_date` date NOT NULL default '0000-00-00',
    `concept_id` bigint(cool: NOT NULL default '0',
    PRIMARY KEY (`id`),
    KEY `brand_id` (`brand_id`),
    KEY `pub_date_2` (`pub_date`),
    KEY `brand_id_2` (`brand_id`,`pub_date`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=8579258 ;

    CREATE TABLE IF NOT EXISTS `concept` (
    `id` bigint(cool: NOT NULL default '0',
    `concept` varchar(255) NOT NULL default '',
    KEY `concept` (`concept`),
    KEY `id` (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;


    Select c.concept,count(concept) concept1 from concept c,data d where d.pub_date >='2008-12-01' AND d.pub_date <= '2008-12-31' and d.brand_id =32 and d.id = c.id GROUP BY concept ORDER BY concept1 DESC

    So what we are trying to do is find the most frequently occuring concepts from the concept table, for the rows selected in the data table then order and group them.

    The concept table can have many different rows where the ID links back to a single row in the data tables ID.

    So if there is 30000 rows in the data table it could link to 300,000 rows in the concept table. It appears to the group by/order by that is slowing things down.

    These tables are going to continue to grow so I need to find a more scalable option to this.

    Any advice or query/schema changes to speed this up and help the application scale would be great.

    Thanks very much

  • #2
    First thing: KEY `brand_id_2` (`brand_id`,`pub_date`), so where should be:
    Quote:


    where d.brand_id = 32 and (d.pub_date >='2008-12-01' AND d.pub_date <= '2008-12-31')


    I could be wrong, so You could check it with explain
    Quote:


    explain Select c.concept,count(concept) concept1 from concept c,data d where d.pub_date >='2008-12-01' AND d.pub_date <= '2008-12-31' and d.brand_id =32 and d.id = c.id GROUP BY concept ORDER BY concept1 DESC

    explain Select c.concept,count(concept) concept1 from concept c,data d where d.brand_id =32 and d.pub_date >='2008-12-01' AND d.pub_date <= '2008-12-31' and d.id = c.id GROUP BY concept ORDER BY concept1 DESC

    (please, post both results)

    edit:
    You could add table "counter" with columns: year, month, concept and couter.

    Comment


    • #3
      Thanks for your feedback here are the results:

      explain Select c.concept,count(concept) concept1 from concept c,data d where d.pub_date >='2009-01-20' AND d.pub_date <= '2009-01-27' and d.brand_id =273 and c.concept !='".$brand."' and d.id = c.id GROUP BY concept ORDER BY concept1 DESC

      1 SIMPLE d range PRIMARY,brand_id,pub_date_2,brand_id_2 brand_id_2 11 NULL 17561 Using where; Using temporary; Using filesort
      1 SIMPLE c ref concept,id id 8 sentimen_sentimen.d.id 19 Using where

      explain Select c.concept,count(concept) concept1 from concept c,data d where d.brand_id =273 and d.pub_date >='2009-01-20' AND d.pub_date <= '2009-01-27' and c.concept !='".$brand."' and d.id = c.id GROUP BY concept ORDER BY concept1 DESC


      1 SIMPLE d range PRIMARY,brand_id,pub_date_2,brand_id_2 brand_id_2 11 NULL 17561 Using where; Using temporary; Using filesort
      1 SIMPLE c ref concept,id id 8 sentimen_sentimen.d.id 19 Using where


      Any ideas how to better optimise this?

      Thanks in advance

      Comment


      • #4
        hmmm ...
        Explain shows that it will fetch 17561 rows.
        Quote:


        Select count(*) from concept c,data d where d.pub_date >='2009-01-20' AND d.pub_date <= '2009-01-27' and d.brand_id =273 and c.concept !='".$brand."' and d.id = c.id


        Does this query return the same value ?

        Comment


        • #5
          Yes it does return the same amount of rows.

          What I am looking at doing now is just getting the data in mysql
          using

          SELECT c.concept FROM concept c, data d WHERE d.brand_id =273 AND d.pub_date >= '2008-10-20' AND d.pub_date <= '2009-01-27' AND d.id = c.id

          which takes 0.0008ms

          and then using php to format the data using:

          $data = array_count_values($results);
          $arsort($data);

          Seems to be much faster and won't hog mysql resources.

          Any idea if using php in the application to group is generally a better idea than group by in mysql?

          Thanks

          Comment

          Working...
          X