Announcement

Announcement Module
Collapse
No announcement yet.

performance issue

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • performance issue

    Hi,

    The following query is taking 3 mins to execute and table has 30 million rows..


    SELECT host_ip "Users", SUM(events) "Connections", TRUNCATE((SUM(bytes) * 1.0000), 2) "Cost (INR)",
    SUM(bytes) "MBytes" FROM vp_top_summary
    WHERE firewall_report_id IN (1118,1142) AND host_id = -1 GROUP BY host_ip ORDER BY "MBytes" DESC LIMIT 20;

    index information is as follows
    mysql> show index from vp_top_summary;
    +----------------+------------+----------+--------------+--- -----------------+--
    ---------+-------------+----------+--------+------+--------- ---+---------+
    | Table | Non_unique | Key_name | Seq_in_index | Column_name | C
    ollation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
    +----------------+------------+----------+--------------+--- -----------------+--
    ---------+-------------+----------+--------+------+--------- ---+---------+
    | vp_top_summary | 1 | vpts_ihf | 1 | host_ip | A
    | 52290909 | NULL | NULL | YES | BTREE | |
    | vp_top_summary | 1 | vpts_ihf | 2 | host_name | A
    | 52290909 | NULL | NULL | YES | BTREE | |
    | vp_top_summary | 1 | vpts_ihf | 3 | firewall_report_id | A
    | 52290909 | NULL | NULL | YES | BTREE | |
    | vp_top_summary | 1 | vpts_f | 1 | firewall_report_id | A
    | 17 | NULL | NULL | YES | BTREE | |
    | vp_top_summary | 1 | vpts_i | 1 | id | A
    | 52290909 | NULL | NULL | | BTREE | |
    +----------------+------------+----------+--------------+--- -----------------+--
    ---------+-------------+----------+--------+------+--------- ---+---------+
    5 rows in set (0.45 sec)



    any suggestions or ideas will be appreciated..

    thanks in advance..

  • #2
    You could check what explain select ... say.
    Maybe You should add index to the host_id column ?

    Comment


    • #3
      Hi..

      Thank you for your ideas.

      explain of the query is as follows

      +----+-------------+----------------+-------+--------------- +--------+---------+
      ------+---------+------------------------------+
      | id | select_type | table | type | possible_keys | key | key_len |
      ref | rows | Extra |
      +----+-------------+----------------+-------+--------------- +--------+---------+
      ------+---------+------------------------------+
      | 1 | SIMPLE | vp_top_summary | range | vpts_f | vpts_f | 5 |
      NULL | 1607657 | Using where; Using temporary |
      +--
      --+-------------+----------------+-------+--------------
      -+--------+---------+

      Now i have removed the condition "AND host_id = -1 " from the query, so index is not required.

      will the ordering of the index column help?
      as the table has 50 million rows now.. changing index order will take a very long time. unless am sure it makes difference do not want to take a step on that.

      please keep posting your ideas or suggestions if you have any.

      thank you so much!

      Comment


      • #4
        It would scan 1607657 rows (explain sometimes displays very unprecise values).
        Quote:


        #1 select count(*) as rows FROM vp_top_summary
        WHERE firewall_report_id IN (1118,1142) GROUP BY host_ip
        #2 select count(*) as rows FROM vp_top_summary
        WHERE firewall_report_id IN (1118,1142)


        #1 query (I hope I didn't make any mistake in it) will show how many rows would be received. #2 query will show how many rows mysql have to fetch before grouping. Compare it with that 1.6M from explain. Is it smaller ?


        I can see only one way to solve the problem. This one table can stay, but You should add second one with single row per firewall_report_id and host_ip ({firewall_report_id,host_ip} UNIQUE). The cron could run updating script every minute. After that You will get less rows without grouping. The problem is still with that order by. I have to think about it. Right now I have got no idea how to deal with it.

        Comment


        • #5
          I hope this reply isn't too late.

          Have you tried:


          SELECT *FROM ( SELECT host_ip "Users", SUM(events) "Connections", TRUNCATE((SUM(bytes) * 1.0000), 2) "Cost (INR)", SUM(bytes) "MBytes" FROM vp_top_summary WHERE firewall_report_id IN (1118,1142) AND host_id = -1 GROUP BY host_ip)ORDER BY "MBytes" DESC LIMIT 20;


          with another index on (firewall_report_id, host_ip)?

          Comment


          • #6
            Hi,
            Thanks you for your reply..
            it's not too late for ideas or suggestions coz i still have it in my list didn't find a solution for it yet..
            I have tried the derived table concept as well, didn't help!

            Comment

            Working...
            X