GET 24/7 LIVE HELP NOW

Announcement

Announcement Module
Collapse
No announcement yet.

How to speed up Group BY?

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to speed up Group BY?

    Hello,
    I'm a new mysql developer, just started working on performance.
    I have large table as 50 mln rec for now ( still growing).

    This is a fact table.

    I execute the statement in my stored proc :
    create temporary table tt as
    SELECT SrcUserGroupID, SrcReportObjectID,
    DestReportObjectID, FirewallRuleID,
    RootCauseID, ServiceID,
    Action, SUM(Counter) as Counter
    FROM mytable
    where DataSourceID=2 and Datex between '2009-02-23 20:00:00' and '2009-02-24 19:00:00'
    group by SrcUserGroupID,SrcReportObjectID,DestReportObjectI D, FirewallRuleID,RootCauseID,ServiceID, Action

    I'm not sure how to tune Group By,
    I even created an index on all Group By fields and made the order in select the same, but this index is not used.

    from status - I'm not using disk temp tables.
    what else could I check?

    I do have have an index on Datex,DataSourceID.

    Please advice!

    thank you.
    Helen

  • #2
    A good place to start for tuning any query is the EXPLAIN command. If you simply prefix your query with EXPLAIN (like EXPLAIN SELECT SrcUserGroupID...) MySQL will give you a high level query execution plan. In particular it will tell you which indexes are being used and roughly how many rows will need to be examined.

    Chances are you are probably just reading more rows than necessary. What exactly are the indexes on this table? You can get this by using SHOW CREATE TABLE mytable or SHOW INDEXES FROM mytable.

    Comment


    • #3
      I checked that Explain many times,
      I have index on each ID and one compound index for Datex,SourceID . Explain shows me that index but in extra it has " temporary,filesort " - that is becouse of a group by I think.

      Why do you think I read more rows then I need?
      I have only 1 filter - on Datex and SourceID.
      I found a way how to make use of a partition pruning on that table ( my partition function is dayofmonth), that speed up my query 2.5 times, but I still need bettr speed.

      do you know what is "filesort " ?
      Thank you very much for the fast reply!

      ----------+
      | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
      +----+-------------+-------+------+------------------+------ +---------+------+----------+------------------------------- ---------------+
      | 1 | SIMPLE | f | ALL | ev_hrly_date_src | NULL | NULL | NULL | 28847656 | Using where; Using temporary; Using filesort |
      +----+-------------+-------+------+------------------+------ +---------+------+----------+------------------------------- ---------------+

      Comment


      • #4
        Ok, so the explain plan is telling you a few important things. First, it tells you the indexes MySQL could use to help evaluate your query (possible_keys). In your case, that's only ev_hrly_date_src. It isn't even considering using anything else. The key column tells you which index it decided to use, if any. In your case, key is NULL, which means MySQL has decided its better to use a table scan than to use the index. So its always going to read 28 million rows.

        The reason for this is probably the ordering of your composite index. Composite indexes work best with identity lookups (columnA = 1 AND columnB =1). When you are trying to lookup a range, it cannot use any part of the composite index which comes after (to the right of) your range column. So asking an index on (Datex, SourceID) about Datex BETWEEN x AND y AND SourceID = 1 is no different from asking about Datex BETWEEN x AND y.

        If your index were instead (SourceID, Datex), then you would be able to use the entire composite index, which would be much more selective.

        Does that make sense?

        Comment


        • #5
          To tell you the truth, I was sceptical about that.
          And I created that index.
          And the procedure runs about 10 times faster !!!!!! And web application does not hang on that page! eek:

          Thank you!

          Happy Helen.

          Comment

          Working...
          X