GET 24/7 LIVE HELP NOW

Announcement

Announcement Module
Collapse
No announcement yet.

Is It max performance on one server or did I missed something ?

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    >> Long term solution buy more RAM!!!!

    no, the real solution is switching to a real search engine.

    Comment


    • #17
      Here I am :

      I revisited the query to avoid the "copy to tmp table".
      I throwed away the sum() and so the group by and no more "copy to tmp table" this time with one searched word and performance gain seems great.


      SELECT SQL_NO_CACHE SQL_CALC_FOUND_ROWS vtpUri.Id as UriId, (ws0.Score) as uriScoreFROM vtpUriINNER JOIN vtpWordScore AS ws0 ON (ws0.Uri_Id = vtpUri.Id)WHERE ws0.Word_Id = 136ORDER BY uriScore desc LIMIT 0,400

      Explain

      id select_type table type possible_keys key key_len ref rows Extra1 SIMPLE ws0 ref PRIMARY PRIMARY 3 const 24746 Using where; Using filesort1 SIMPLE vtpUri eq_ref PRIMARY PRIMARY 3 vetSpider6.ws0.Uri_Id 1 Using index


      Thought I reached the goal BUT ...

      When I search more than one word explain says "Using temporary" again.


      SELECT SQL_NO_CACHE SQL_CALC_FOUND_ROWS vtpUri.Id as UriId, (ws0.Score + ws1.Score) as uriScoreFROM vtpUriINNER JOIN vtpWordScore AS ws0 ON (ws0.Uri_Id = vtpUri.Id)INNER JOIN vtpWordScore AS ws1 ON (ws1.Uri_Id = vtpUri.Id) WHERE ws0.Word_Id = 136 AND ws1.Word_Id = 5110ORDER BY uriScore desc LIMIT 0,400

      Explain

      id select_type table type possible_keys key key_len ref rows Extra1 SIMPLE ws1 ref PRIMARY PRIMARY 3 const 627 Using temporary; Using filesort1 SIMPLE ws0 eq_ref PRIMARY PRIMARY 6 const,vetSpider6.ws1.Uri_Id 1 1 SIMPLE vtpUri eq_ref PRIMARY PRIMARY 3 vetSpider6.ws1.Uri_Id 1 Using index


      I don't understand why MySql uses temporary on such simple jointure.
      Do you know ?

      Comment


      • #18
        In reading back through this thread, it is clear that you are getting a variety of different advice, which is all trial-and-error kind of stuff. This in my experience can waste months. I would encourage you to take a more scientific approach, such as Method R or Goal-Driven Performance Optimization (see http://www.percona.com/about-us/white-papers)

        Comment


        • #19
          Theses advices have given already some performance gains,
          but they mostly gave me clues on where to look and a much better understanding of the possible causes of the problem.
          I am very grateful for that )

          I'm not even sure on this material without full-text like search engine it is possible to do better.

          Anyway, thank you for your link I will read it carefully.

          Comment


          • #20
            Yes optimize table was not needed, I didn't think of that you probably had recreated the entire table.

            If we break down the query a bit.
            How long does it take to run only this query:

            SELECT Uri_Id ,score FROM vtpWordScore ws0 WHERE ws0.Word_Id = 41

            If that query is slow, try it with this index:

            ALTER TABLE vtpWordScore ADD INDEX ws_ix_wordid_uriid_score(word_id, uri_id, score);

            The advantage with this index is that your DBMS don't have to read any data from the table since all data is available in the index and hereby saving search times.
            The drawback with it is that the amount of data for that table on your harddrive will almost double in size and inserts/updates will be slower.
            But since your problem is disk seeks I think it might be worth it.

            Quote:


            I don't understand why MySql uses temporary on such simple jointure.
            Do you know ?


            I would say that the reason why MySQL uses a temporary table is because you are sorting on a calculated value retrieved from two tables.
            MySQL doesn't know this value until the join has been performed so it needs to store the intermediate result of the join in a temporary table before it can sort it.

            Comment


            • #21
              The Select took at the first try 7.21 sec with all time on "Sending data".
              I added the index as you told and "magic" ! Timings on same Select (with some other word_Id) are all < 0.1sec

              Ps : Is there a way to execute same query two times with exactly same execution (ie no optimisations) on the second one ?
              Because it's very hard to test requests with changing word_Id on all of them.
              fake example (I want to avoid this) :

              SELECT SQL_NO_CACHE SQL_CALC_FOUND_ROWS * from toto;400 rows in set (34.69 sec)SELECT SQL_NO_CACHE SQL_CALC_FOUND_ROWS * from toto;400 rows in set (0.36 sec)


              So I removed index and included score in Primary, now my Primary looks like your index, I also Ordered table by Word_Id and Optimized it.



              So now, vtpWordScore is fast, but when I join it to vtpUri "copy to temp table" again and long times are back...

              I'm trying to optimise vtpUri, this one contains mediumtext etc... I suppose if I move theses in other table and just keep what I need on this request (Id, Rank, Domain_Id) It will run fast.
              It sounds like "horizontal partitioning" but I'm on Mysql 5.0.44 I've never done that before and quite affraid to break something If I update Mysql.

              Comment


              • #22
                xaff wrote on Thu, 10 June 2010 13:30

                Ps : Is there a way to execute same query two times with exactly same execution (ie no optimisations) on the second one ?


                No, not really, you have caching on OS level and caching in MySQL and when you have selected data it will be present in the cache until something else replaces it.
                You can run warm up scripts that executes a lot of queries in sequence for a longer time to try to place the machine and MySQL in the same state between tests but to get reproducible results is hard.

                xaff wrote on Thu, 10 June 2010 13:30

                So I removed index and included score in Primary, now my Primary looks like your index,


                I understand why you did it since it reduces data, but you do know that you now allow more than one score per (word_id, url_id) combination so you have change the restraint check of this table?

                xaff wrote on Thu, 10 June 2010 13:30


                I'm trying to optimise vtpUri, this one contains mediumtext etc... I suppose if I move theses in other table and just keep what I need on this request (Id, Rank, Domain_Id) It will run fast.


                No don't move them to another table, just create an index that contains all columns that you are selecting as the last columns in the index, the same as I did in the previous index I created for you:

                ALTER TABLE vtpUri ADD INDEX vtpuri_ix_id_domainid_rank( Id, domain_id, rank);

                Then MySQL will see that all data needed for both finding the correct record and what is selected will be part of the index and hence it will not read from the table so you will save seeks in this case also.

                Comment


                • #23
                  Sterin71 : do know that you now allow more than one ...
                  Ooops , I missed that ) so I will add contraint or replace index + primary at the end.

                  Sterin71 : No don't move them to another table...
                  Oh yes you right, I should see it.
                  I added index but,there still "f#%?! copying to temp table ^^" it seems because of "... * (vtpDomain.Rank + vtpUri.Rank)", without those EXPLAIN no more says Using Temporary.


                  SELECT SQL_NO_CACHE HIGH_PRIORITY SQL_NO_CACHE SQL_CALC_FOUND_ROWS vtpUri.Id as UriId, (ws0.Score) * (vtpDomain.Rank + vtpUri.Rank) as uriScore, vtpUri.Domain_Id FROM vtpUri INNER JOIN vtpWordScore AS ws0 ON (ws0.Uri_Id = vtpUri.Id) INNER JOIN vtpDomain ON (vtpDomain.Id = vtpUri.Domain_Id) where ws0.Word_Id = 4598 AND vtpDomain.vtpCategory_Id != 2 AND vtpDomain.IsOff != 1 ORDER BY uriScore desc LIMIT 0,400

                  So I will try to place the result of this operation vtpWordScore.Score * (vtpDomain.VetupRank + vtpUri.Rank) directly in vtpWordScore.Score to remove them from the request.
                  The application will be less configurable but, I think and hope everything will be fine in term of performance.

                  Comment


                  • #24
                    xaff wrote on Fri, 11 June 2010 11:50


                    I added index but,there still "f#%?! copying to temp table ^^" it seems because of "... * (vtpDomain.Rank + vtpUri.Rank)", without those EXPLAIN no more says Using Temporary.


                    Yes you might still get copying to temp table, but what is actually the performance at this point after the last index and how many rows you got left in the result set?

                    I ask this question since the WHERE conditions and the JOIN should have reduced this result set to a manageable size, or do you still a lot of rows so that the intermediate table is large?
                    You should calculate this size and set the tmp_table_size values about maybe 15-20% larger than the size of this temporary table instead of 1000M as you have now.

                    And on a side note you can rewrite your query like this, where you place the conditions for each table in the JOIN condition instead of the WHERE. It makes it easier to read and know which indexes that should exist on each table:

                    SELECT SQL_NO_CACHE HIGH_PRIORITY SQL_NO_CACHE SQL_CALC_FOUND_ROWS vtpUri.Id as UriId, (ws0.Score) * (vtpDomain.Rank + vtpUri.Rank) as uriScore, vtpUri.Domain_Id FROM vtpUri INNER JOIN vtpWordScore AS ws0 ON (ws0.Uri_Id = vtpUri.Id) AND ws0.Word_Id = 4598 -- <-- hereINNER JOIN vtpDomain ON (vtpDomain.Id = vtpUri.Domain_Id) AND vtpDomain.vtpCategory_Id != 2 -- <-- here AND vtpDomain.IsOff != 1 -- <-- hereORDER BY uriScore desc LIMIT 0,400

                    Comment


                    • #25
                      I don't have enough experience to kwow what is "Large" for Mysql.
                      But here is some elements :
                      The request with a common searched word "rage" :

                      SELECT SQL_NO_CACHE HIGH_PRIORITY SQL_NO_CACHE SQL_CALC_FOUND_ROWS vtpUri.Id as UriId, vtpUri.Domain_Id, (ws0.Score) * (vtpDomain.Rank + vtpUri.Rank) as uriScore FROM vtpUri INNER JOIN vtpWordScore AS ws0 ON (ws0.Uri_Id = vtpUri.Id) AND ws0.Word_Id = 4598 INNER JOIN vtpDomain ON (vtpDomain.Id = vtpUri.Domain_Id) AND vtpDomain.vtpCategory_Id != 2 AND vtpDomain.IsOff != 1 ORDER BY uriScore DESC LIMIT 0,400;

                      --> rage, id: 4598, wordScore occurences : 7430
                      time : 11.29 sec; Copying to tmp table : 10.65 sec

                      Explain :

                      +----+-------------+-----------+--------+-------------------------------------------------+---------+---------+-----------------------------+------+----------------------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-----------+--------+-------------------------------------------------+---------+---------+-----------------------------+------+----------------------------------------------+| 1 | SIMPLE | ws0 | ref | PRIMARY | PRIMARY | 3 | const | 7414 | Using index; Using temporary; Using filesort || 1 | SIMPLE | vtpUri | eq_ref | PRIMARY,FK_UriDomain,vtpuri_ix_id_domainid_rank | PRIMARY | 3 | vetSpider8.ws0.Uri_Id | 1 | || 1 | SIMPLE | vtpDomain | eq_ref | PRIMARY,FK_DomainCategory,idxIdRank | PRIMARY | 2 | vetSpider8.vtpUri.Domain_Id | 1 | Using where |+----+-------------+-----------+--------+-------------------------------------------------+---------+---------+-----------------------------+------+----------------------------------------------+


                      Here some other try, order by count(WordScore) desc ^^ :

                      --> rage, id: 4598, wordScore occurences : 7430
                      time : 11.29 sec; Copying to tmp table : 10.65 sec
                      --> rss, id 172524, wordScore occurences : 47644
                      time : 21.64 sec; Copying to tmp table : 21.61 sec
                      --> torsion, id: 8698, wordScore occurences : 3918
                      time : 10.53 sec; Copying to tmp table : 10.53 sec
                      --> estomac, id: 8354, wordScore occurences : 2253
                      time : 8.13 sec;
                      --> marquise, id 47644, wordScore occurences : 26
                      time : 0.45 sec; Copying to tmp table : 0.40 sec

                      As you can see there is always a copy to tmp table and the performances varies with the occurences of wordScore.

                      Comment


                      • #26
                        Ok, I tried my 1st idea to eliminate the scores addition during query and it didn't work because :
                        There effectively no more "copy to tmp table" but there is a "sending data" which takes almost same time.

                        The 2nd idea was to split vtpUri in two tables :
                        vtpUriLight (Id, Domain_Id, Rank)
                        vtpUriHeavy (rest : varchars, text,...)
                        and add a view vtpUri = select vtpUriLight + vtpUriHeavy

                        This one works very well with your indexes, there is still a "copy to tmp table" but this one became light(8Mo) and fast

                        Now all requests respond "in no time", the result is beyond my expectations and the problem is solved.

                        My greatest thanks guys, especially to you Sterin71.
                        I owe you one.

                        Comment


                        • #27
                          Nice to hear that it's solved, although it could have been enough to get MySQL to use the vtpuri_ix_id_domainid_rank index on the vtpUri table, which it for some reason didn't do.

                          While you should still think about increasing the amount of RAM on your machine (unless you want to go with a real search engine as suggested in this thread).
                          Your main problem with the "copy to tmp table" is actually probably that it has a hard time to find the records, not that it has such a big problem with actually creating the temporary table.

                          And since all this is basically only I/O wait then you would benefit a lot from getting more RAM.

                          A well, good thing that you are happy with the performance! )

                          Comment


                          • #28
                            Once again you're right Sterin71, when i force the use of the index like this :

                            SELECT SQL_NO_CACHE HIGH_PRIORITY SQL_NO_CACHE SQL_CALC_FOUND_ROWS vtpUri.Id as UriId, vtpUri.Domain_Id, SUM(ws.Score) * (vtpDomain.Rank + vtpUri.Rank) as uriScore FROM vtpUri USE INDEX (vtpuri_ix_id_domainid_rank)INNER JOIN vtpWordScore AS ws ON (ws.Uri_Id = vtpUri.Id) AND (ws.Word_Id = 27460 OR ws.Word_Id = 37255)INNER JOIN vtpDomain ON (vtpDomain.Id = vtpUri.Domain_Id) AND vtpDomain.vtpCategory_Id != 2 AND vtpDomain.IsOff != 1 GROUP BY vtpUri.IdORDER BY uriScore DESC LIMIT 0,400;

                            The result is the same, so split vtpUri was useless )

                            As you can see in the previous explain on vtpUri MySql don't choose it by default.

                            I don't understand why because this index regroups on vtpUri all what is ask by the request : (Id, Domain_Id, Rank).

                            Maybe the presence of some other index on this table causes trouble ?

                            Comment

                            Working...
                            X