GET 24/7 LIVE HELP NOW

Announcement

Announcement Module
Collapse
No announcement yet.

slow SELECT COUNT(*) on VPS server

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • slow SELECT COUNT(*) on VPS server

    Hi !

    This is my first port here, so I hope I'll do it right :roll:

    Actually, I have:
    - a VPS server with 256mb of ram only.
    - a site with about 30 concurent users
    - a datebase with one table that have 300,000 rows, and is 500mb big.

    This table contain user messages, so I need to generate a pagination. Pagination require to know the total of message for a user.

    From my log files, I can see this:

    Reading mysql slow query log from /var/log/mysql/mysql-slow.logCount: 109 Time=10.02s (1092s) Lock=0.98s (107s) Rows=1.0 (109), user[user]@localhost SELECT count(id) FROM cc3_he WHERE display=N AND IDPerso=N


    (N mean Number)
    So this is the problem: all my pages are taking 10sec+ to load !


    The fields:
    - id int(12)
    - IDPerso smallint(5)
    - display smallint(1)
    - date int(10)
    - from varchar(25)
    - from_id int(5)
    - from_id2 int(5)
    - from_description text
    - from_description2 text
    - type varchar(20)
    - msg longtext
    - msg_size int(10)

    My indexes on the table are:
    - PRIMARY BTREE Oui Non id 312679 A
    - IDPerso BTREE Non Non IDPerso 379 A
    - date BTREE Non Non date 312679 A
    - display BTREE Non Non display 2 A



    Someone have an idea ?
    Thanks for reading )

  • #2
    Extend your index in IDPerso to include display also. And if you use MyISAM, change count(id) to count(*).

    But it is better to keep this count in the usertable and update it when a user makes a post.

    Comment


    • #3
      Thanks for the reply, I'm trying this right now.

      But can you explain me why MyISAM is faster with COUNT(*) , and what it the difference in creating 1 index with 2 fields instead of 2 indexes ?

      Thanks !


      Edit:

      Quote:

      But it is better to keep this count in the usertable and update it when a user makes a post.

      Really ? This is what GMail or Hotmail are doing ? They *never* COUNT the messages ?

      Comment


      • #4
        >> But can you explain me why MyISAM is faster with COUNT(*)
        data for the id field does not need to be retrieved

        >> and what it the difference in creating 1 index with 2 fields instead of 2 indexes ?

        http://www.mysqlperformanceblog.com/2009/09/12/3-ways-mysql- uses-indexes/

        Comment


        • #5
          I might be a little dumb, but I still don't understand what is different between 2 indexes and 1 index of 2 fields.

          The whole article talk about using 1 index with 2 fiels... confused:

          Comment


          • #6
            Say you have something like:

            SELECT product_id FROM sales WHERE category_id = 1 AND salesperson_id = 2;

            The sales table has a million rows, 50,000 of which match category_id 1 and 5,000 of which match salesperson_id 2. But there are only 100 rows that match both. You have two indexes, one on category_id and one on salesperson_id.

            When MySQL goes to evaluate this query, it will only ever use one index per relation (table) to retrieve the data. If it chooses to use the category_id index, it will read 50,000 rows off disk and filter them by salesperson_id in memory, and return to you 100 rows. The same goes for the salesperson_id index, except that it will read 5,000 rows and filter by category_id in memory.

            In general, the name of the game is to reduce disk I/O, so you go with the index that results in the fewest rows read from disk.

            Now let's say you drop those two indexes and instead add a single index (category_id, salesperson_id). Then you'll only be reading 100 rows when you evaluate this query.

            While it sounds much better, if you were to change the query to

            SELECT product_id FROM sales WHERE salesperson_id = 2;

            That index we just added can't be used, and we'll get a table scan (we read 1 million rows), because we must provide the prefix (category_id, in this case), in order for the compound index to be useful. The article gmouse suggested explores this in detail.

            Comment


            • #7
              Ahhh ok !

              So if I have a table with a PRIMARY key that I use to control 3 keys as UNIQUE:

              [page_id, zone_id, user_id], content, foo, bar, etc


              And I want to find all the zone_id of a user page:

              SELECT zone_id FROM table WHERE page_id=? AND user_id=?;

              I should add another index:
              [page_id, user_id]

              ... because my primary key can't be used, and the query will do a full table scan ?



              Edit:
              By the way, is it true that MySQL read the WHERE statement from Right to Left ? (The right statement must be the one that discard the most result)

              Comment


              • #8
                Nope. Since you are supplying page_id, MySQL will use the primary key, because you have specified left part of the key.

                An index is kinda like a table that's already sorted. So your primary key would look a lot like the result of this query:

                SELECT page_id, zone_id, user_id FROM mytable ORDER BY page_id, zone_id, user_id;

                If you look at the results of that query, it should be easy for you to visually find all the rows where page_id = 1, right? They're all clumped together nicely.

                But if you're just looking for user_id = 1 without a page_id, well, all of the sudden its not very easy to do visually, because they seem to be all over the place, because they are sorted first by page_id, which you aren't asking about, and then again by zone_id.

                So if you asked for page_id = 1 and zone_id = 2, visually, you'd be able to find the rows easily, because you'd first find where the rows for page_id = 1 start in your results, and then start looking for zone_id = 2. The same goes if you add in user_id.

                Does that make sense?

                Comment


                • #9
                  Yes, that make sense. Just to be sure, I've modified the senario by swapping fields, could you tell me if #1 and #2 are OK or not ?

                  #1:
                  Quote:


                  So if I have a table with a PRIMARY key that I use to control 3 keys as UNIQUE:

                  [zone_id, page_id, user_id], content, foo, bar, etc


                  And I want to find all the zone_id of a user page:

                  SELECT zone_id FROM table WHERE page_id=? AND user_id=?;



                  #2:
                  Quote:


                  So if I have a table with a PRIMARY key that I use to control 3 keys as UNIQUE:

                  [page_id,zone_id, user_id], content, foo, bar, etc


                  And I want to find all the zone_id of a user page:

                  SELECT zone_id FROM table WHERE user_id=? AND page_id=?;



                  Thanks !

                  Comment


                  • #10
                    Scenario #1 doesn't use the primary key* and scenario #2 will use the page_id part of the key.

                    *Not to confuse you more, but technically this query in scenario #1 will be evaluated only using the primary key, since the key contains all the columns required to produce the result set without any table lookups. If you selected out a column not in the index, then a table scan would result instead. You can actually see this in the output of EXPLAIN when "using index" shows up.

                    Comment


                    • #11
                      Ok, now I get it (well I think) )

                      #1
                      It would use the primary key, but as it's all scrambled, it would be slower.

                      #2
                      The order of the WHERE fields are not usefull.

                      Comment


                      • #12
                        Just to be completely clear let's hit a few scenarios. Say we have

                        [zone_id, page_id, user_id], content, foo, bar, etc

                        as you mentioned.


                        Here are queries, going from fastest to slowest:

                        SELECT * FROM mytable WHERE zone_id=? AND page_id=? AND user_id=?;

                        The order the columns appear in the where clause is irrelevant, so this is just as fast:

                        SELECT * FROM mytable WHERE user_id=? AND zone_id=? AND page_id=?;

                        Now slower:

                        SELECT * FROM mytable WHERE zone_id=? AND page_id=?

                        Slower still:

                        SELECT * FROM mytable WHERE zone_id=? AND user_id=?

                        just as slow as the above

                        SELECT * FROM mytable WHERE zone_id=?

                        slower still

                        SELECT * FROM mytable WHERE user_id=?

                        just as slow as the above

                        SELECT * FROM mytable WHERE page_id=?

                        etc.

                        Comment


                        • #13
                          Wow, really, I've learned alot. So in a way, I'm really greatfull to you. But in another way, I kind of hate you for all the work I must now do to fix my databases )

                          Comment

                          Working...
                          X