Announcement

Announcement Module
Collapse
No announcement yet.

Is It max performance on one server or did I missed something ?

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Is It max performance on one server or did I missed something ?

    Hello experts,

    I developped sometimes ago an indexer + search engine with mysql.
    Now the database weight is 15Go and the query on it became sometimes very long.
    For today on 75 request 24 are longer than 2 seconds :


    SELECT * FROM `vtpLog` where `timeResponse` > 2 order by date desc; Time ResultCountvétérinaire 126.35 54482perte d'identité 8.76 948%de femmes vétérinaires 9.04 73pancreatite rat 3.23 23dépilation chinchilla 3.75 1trichophyton 3.2 319tumeur de la prostate 8.63 96pathologies du dindon 2.26 15Hypothyroidie canine 3.62 274gestation de la chienne 18.49 363vermifugation chienne 4.53 26vermifugation chienne gestante 6.33 9hypothyroidie 5.07 1032acupuncture 18.33 11association antibiotiques 26.63 1055pyothorax chat 5.68 45</pre>


    When mysql takes long time to respond the servers state seems ok :
    cpu : 20 - 40 %
    mem : 1.7Go used, 2Go Page cache, 300mo free
    hdd : reading only

    I documented myself, tuned-up mysql, re-check indexes for month but no real progress confused: so I'd like an mysql expert advice to tell me what I've done stupid )

    I don't plan (for now) to recode all to use &quot;full text&quot;.

    So here are the informations :

    Server :
    This server is dedicated for this search engine; there are only apache + mysql with maximum memory allocated on mysql.

    <table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText">Quote:</td></tr><tr><td class="quote">
    CPU : Core(TM)2 E8400 @3.00GHz GenuineIntel
    MEM : 4 Go DDR2
    HDD : 2x 750 Go - SATA2 (RAID 1)

    </td></tr></table>

    Script :
    <table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText">Quote:</td></tr><tr><td class="quote">
    /*================================================= ========= ====*/
    create table vtpCategory
    (
    Id int not null auto_increment,
    Parent_Id int,
    Value varchar(254),
    primary key (Id)
    )
    type = MyISAM;

    /*================================================= ========= ====*/
    create table vtpDomain
    (
    Id int not null auto_increment,
    vtpCategory_Id int not null,
    IsOff bool default 0 ,
    LastFarmSuccess bool default 0 ,
    Rank int default 50,
    StartUri varchar(254) not null ,
    Title varchar(254) ,
    Language varchar(254) ,
    LastVisitDate datetime ,
    primary key (Id),
    key AK_StartUriUnique (StartUri)
    )
    type = MyISAM;

    /*================================================= ========= ====*/
    create table vtpHtmlSource
    (
    Id int not null auto_increment,
    Uri_Id int,
    Value mediumtext,
    primary key (Id)
    )
    type = MyISAM;

    /*================================================= ========= ====*/
    create table vtpUri
    (
    Id int not null auto_increment,
    Domain_Id int not null,
    Value varchar(2040) not null,
    ValueHash int not null,
    IsDead bool default 0 ,
    Content mediumtext ,
    ContentHash int,
    Title varchar(254) ,
    Description varchar(254) ,
    UpdateDate datetime ,
    Rank int default 0 ,
    primary key (Id),
    key AK_ValueHash (ValueHash)
    )
    type = MyISAM;

    /*================================================= ========= ====*/
    create table vtpUriOutsideDomain
    (
    Id int not null auto_increment,
    Domain_Id int,
    Value varchar(254),
    ValueHash int,
    FromUriValueHash int ,
    FromUriValue varchar(254) ,
    primary key (Id)
    )
    type = MyISAM;

    /*================================================= ========= ====*/
    create table vtpWord
    (
    Id int not null auto_increment,
    Value varbinary (254) not null ,
    primary key (Id),
    key AK_Value (Value)
    )
    type = MyISAM;

    /*================================================= ========= ====*/
    create table vtpWordScore
    (
    Uri_Id int not null,
    Word_Id int not null,
    Score int ,
    primary key (Uri_Id, Word_Id)
    )
    type = MyISAM;

    alter table vtpCategory add constraint FK_CategoryCategory foreign key (Parent_Id)
    references vtpCategory (Id) on delete restrict on update restrict;

    alter table vtpDomain add constraint FK_DomainCategory foreign key (vtpCategory_Id)
    references vtpCategory (Id) on delete restrict on update restrict;

    alter table vtpHtmlSource add constraint FK_HtmlSourceUri foreign key (Uri_Id)
    references vtpUri (Id) on delete restrict on update restrict;

    alter table vtpUri add constraint FK_UriDomain foreign key (Domain_Id)
    references vtpDomain (Id) on delete restrict on update restrict;

    alter table vtpUriOutsideDomain add constraint FK_OutsideDomainDomain foreign key (Domain_Id)
    references vtpDomain (Id) on delete restrict on update restrict;

    alter table vtpWordScore add constraint FK_Uri foreign key (Uri_Id)
    references vtpUri (Id) on delete restrict on update restrict;

    alter table vtpWordScore add constraint FK_Word foreign key (Word_Id)
    references vtpWord (Id) on delete restrict on update restrict;

    </td></tr></table>

    Query in two times :
    1- get WordIds
    <table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText">Quote:</td></tr><tr><td class="quote">

    SELECT Id FROM vtpWord WHERE (CAST(vtpWord.Value AS CHAR CHARACTER SET utf8) = 'vétérinaire');

    </td></tr></table>
    Returns 41 here;
    This one is quite fast (0.1610 sec without query cache)

    2- sort by scores
    <table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText">Quote:</td></tr><tr><td class="quote">
    SET SESSION SQL_BIG_SELECTS = 1;
    SELECT SQL_CALC_FOUND_ROWS
    vtpUri.Id as UriId,
    SUM( ws0.Score) * (vtpDomain.Rank + vtpUri.Rank) as uriScore,
    vtpUri.Domain_Id
    FROM
    vtpUri
    INNER JOIN vtpWordScore AS ws0 ON (ws0.Uri_Id = vtpUri.Id)
    INNER JOIN vtpDomain ON (vtpDomain.Id = vtpUri.Domain_Id) where ws0.Word_Id = 41 AND vtpDomain.vtpCategory_Id != 2 AND vtpDomain.IsOff != 1
    GROUP BY ws0.Uri_Id
    ORDER BY uriScore desc LIMIT 0,400

    </td></tr></table>

    Mysql States :
    <table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText">Quote:</td></tr><tr><td class="quote">
    MySQL work since 35 Days 5 hours 16 minutes 15 seconds.
    Slow_queries 4 304
    Handler_read_rnd 1 670 k
    Handler_read_rnd_next 19 G
    Created_tmp_disk_tables 1 765
    Opened_tables 157
    Table_locks_waited 829 k

    </td></tr></table>

    my.cnf :
    <table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText">Quote:</td></tr><tr><td class="quote">
    skip-locking
    max_connections = 40
    key_buffer_size = 2500M # cf mysqltuner.pl
    max_allowed_packet = 500M
    table_cache = 200 # table_open_cache empeche de demarrer
    open_files_limit = 600 # il y avais 4200 # should typically be set to at least 2x-3x of table_cache
    sort_buffer_size = 4M
    read_buffer_size = 4M
    read_rnd_buffer_size = 4M
    myisam_sort_buffer_size = 32M
    thread_cache_size = 8
    query_cache_size = 16M # == query_cache_limit
    query_cache_limit = 16M
    join_buffer_size = 3M
    low_priority_updates = 1

    tmp_table_size = 3400M # doit == max_heap_table_size
    max_heap_table_size = 3400M

    #net_buffer_length = 8K
    #skip-external-locking
    #log-slow-queries = /var/log/mysql/slow_queries.log

    # Try number of CPU's*2 for thread_concurrency
    thread_concurrency = 8

    </td></tr></table>

    I hope you can show me the light
    Last edited by xaff; 04-14-2013, 12:32 PM.

  • #2
    SELECT Id FROM vtpWord WHERE (CAST(vtpWord.Value AS CHAR CHARACTER SET utf8) = 'vétérinaire');


    Rewrite this query to the form WHERE vtpWord.Value = blahblah('veterinaire') and it will be fast.

    Explain output for query 2?

    If you install a real search engine, you will be much better off.

    Comment


    • #3
      hello gmouse, i ordered answers by difficulty :

      1°/ 1st query : ok cast moved, thanks ^^

      3°/
      Quote:

      If you install a real search engine, you will be much better off.

      Can you give me advice on which one to choose, and possibly a list of names?



      2°/
      Quote:

      Explain output for query 2?

      The query 2:
      returns :

      UriId uriScore Domain_Id175247 3600 46178075 3600 4690506 3600 46

      It returns an ordered list of the 400 1st UriId according uriScore.
      vtpWordScore is a sum representing the number of occurrences of a word for an uri, it is calculated and set by the BDD indexer.
      uriScore is the sum of vtpWordScore each search word multiplied by Rank.
      The Rank score serve to up/degrade certain uriScore according to some criteria (if it is an index or a technical paper ... etc.)

      )

      Comment


      • #4
        Explain has a different meaning in MySQL-world ) http://dev.mysql.com/doc/refman/5.1/en/explain.html


        Try Sphinx or DBSight as search engine.

        Comment


        • #5
          Oups, shame on me, I forgot this one

          Last edited by xaff; 04-14-2013, 12:33 PM.

          Comment


          • #6
            Do you ever query vtpWordScore on uri_id? If so, you could probably gain a lot by switching the order of the PK fields (and removing the 2nd index). Otherwise, check if adding uri_id to the second index as a postfix makes a big difference.
            And consider creating covering indices for the joined tables.

            Comment


            • #7
              I don't very well understood what you mean by &quot;Do you ever query it vtpWordScore uri_id?&quot;, I simply use query 2 to get all results id that's all.
              Anyway I have duplicated database and changed indexes according to your Recommendation:

              Explain have less rows on the first line but unfortunately there is no significant time gain, query 2 still take 30 to 60 seconds to execute.

              Last edited by xaff; 04-14-2013, 12:33 PM.

              Comment


              • #8
                and covering indices for the joined tables?

                Comment


                • #9
                  I don't think it will help because of this &quot;degraded query 2&quot; takes the same time to execute and it works only on vtpWordScore :


                  So the problem comes from vtpWordScore... or maybe 86 187 809 rows is too much on this system ... but in this hypothesis why the system is not workloaded ?

                  here the system loads (I launched the 53sec query at 19h43) :
                  Last edited by xaff; 04-14-2013, 12:34 PM.

                  Comment


                  • #10
                    I'm suspicious about your disk IO.

                    Can you check the I/O wait when you are running this query by running "vmstat 2 10" and/or "top -b -n 10" and attach the text output from these to this thread?

                    Because your chart of Bytes/s doesn't really tell us so much if:
                    1. There are a lot of random very small reads = bytes/sec is very small but IO wait on disk is very high.
                    2. The chart shows an average value and only samples ever so often then the average value of a few samples can give a very distorted impression.


                    Now for some of your server variables.
                    As I understood it you have 4GB of RAM, right?
                    And on the same server you are running both MySQL and Apache.

                    Then some of these values sound very high:

                    key_buffer_size = 2500M # cf mysqltuner.pl### This sounds very high, usually set to about 25% of _available_ RAM since MyISAM tables relies on the file cache, unless you know that basically _all_ your queries can fetch all data from the indexes and doesn't have to read from the table.###max_allowed_packet = 500M ### Also sounds insanely high, any reason? ###sort_buffer_size = 4M### Sounds pretty low, I recommend about 16MB, but this shouldn't be such a big deal breaker ###tmp_table_size = 3400M # doit == max_heap_table_sizemax_heap_table_size = 3400M### These two are also insanely high, with only 4GB in the machine and even though you are alone on it the key_buffer_size setting combined with this would probably put you over the top.###

                    Comment


                    • #11
                      hi sterin71 ),

                      Here are results of asked commands launched when server idle and during query, so you can compare results if you want :


                      sterin71: For some of your server variables :
                      -Yes 4GB Ram on server.
                      -Yes Apache + Mysql + named (forgot this one, it's a pre-installed machine with some services but I dont uses them)

                      sterin71: Then some of these values sound very high:
                      Yes I know but for now the goal is just performance not stability, so I tried very big values to see if ... ^^ to keep little stability, I lowed max_connections to 40.

                      key_buffer_size 2.5G because I tried to fit all indexes in RAM.
                      max_allowed_packet = 500M by convenience, just to raise max upload size scripts in phpMyAdmin, and &quot;max_...&quot; don't impact on performances, I'm Wrong ?
                      sort_buffer_size = 4M I just remember It's a memory eater so I put it to 4M to not to be too far over memory limit for 40 connexions.
                      tmp_table_size = 3400M + max_heap_table_size = 3400M Same as max_allowed_packet there are &quot;max_..&quot; values so it can crash but will not drop performances.

                      Anyway I'm ok to try all values you'll give me to test )
                      Last edited by xaff; 04-14-2013, 12:34 PM.

                      Comment


                      • #12
                        I wouldn't increase sort_buffer_size, I'd lower it (actually comment it out, and let it go back to the default). You can find a number of good blog posts about this on mysqlperformanceblog.com. 4MB is already very harmful for small fast queries. Remember this buffer is allocated to its full size, even for small queries.

                        Comment


                        • #13
                          I changed some "values" on Id and FK, examples :
                          Id INT(11) => Id MEDIUMINT(6)
                          Reduced sizes and some perf gain.

                          @Stein71 : I changed these :
                          key_buffer_size = 1000M
                          tmp_table_size = 1000M
                          max_heap_table_size = 1000M
                          sort_buffer_size = 16M then commented it

                          but no noticeable changes.
                          And something quite strange Virt memory seems in use even if there ram free.

                          top -b | grep mysql13696 mysql 40 0 1058m 198m 4556 S 0 5.0 0:02.39 mysqld13696 mysql 40 0 1066m 198m 4556 S 8 5.0 0:02.62 mysqld13696 mysql 40 0 1067m 199m 4556 S 3 5.0 0:02.72 mysqld


                          Is there a way to keep Virt to 0 ?

                          Another strange thing is there it seems that only one cpu is used.

                          Why mysql wouldn't use both processors ?
                          I have this in my.cnf

                          # Try number of CPU's*2 for thread_concurrencythread_concurrency = 8


                          I'm trying to uping mysql's processes priority.

                          Ps : excuse my poor english, I can read it quite easely but it's hard for me to make phrases... I'm probably not well configured too :roll:

                          Comment


                          • #14
                            OK, as I suspected you are getting a huge IO-wait during the execution of the query. The last column is the amount of time the cpu is spending waiting for the disk and at 48-49% it is basically not doing anything else except waiting:

                            procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 1 54696 731996 40928 2396816 0 0 10510 64 991 2195 0 2 49 49 0 1 54696 723076 40932 2405640 0 0 4436 0 660 1205 0 1 50 49 0 1 54696 715132 40932 2413724 0 0 3998 0 621 1061 0 2 50 48

                            The reason why you don't have 99% is that you have a dual core server (I'm assuming) and in MySQL one query can only utilize one core.

                            The problem here is that you have a very large index and table size since it looks like vtpWordScore is 3.5GB and you are joining with a table that is 2GB of data, etc.

                            So the steps for the server is to:
                            1. Find the correct page in the index with Word_Id = 41
                            2. For each of the records found in 1 fetch the corresponding data from the next table that is also very big.
                            3. Then join this with the records from the two tables, etc.

                            So what can we do about it?
                            1.
                            Have you run "OPTIMIZE TABLE vtpWordScore;"?
                            I don't know how long time it will take on your machine (but it might be a long time), but it would probably be worth it due to the size of your table compared to the RAM.
                            By running the optimize table you are resorting the entire index and finding the right records should be much faster.

                            2. Also try rewriting your query this way (not tested):

                            SELECT SQL_CALC_FOUND_ROWSvtpUri.Id as UriId,SUM( ws0.Score) * (vtpDomain.Rank + vtpUri.Rank) as uriScore,vtpUri.Domain_IdFROM vtpUriINNER JOIN ( SELECT Uri_Id ,score FROM vtpWordScore WHERE ws0.Word_Id = 41 ) AS ws0 ON ws0.Uri_Id = vtpUri.Id INNER JOIN vtpDomain ON vtpDomain.Id = vtpUri.Domain_Id AND vtpDomain.vtpCategory_Id != 2 AND vtpDomain.IsOff != 1WHEREGROUP BY ws0.Uri_IdORDER BY uriScore descLIMIT 0,400

                            My theory is that since you don't seem to select too many rows from the vtpWordScore table an execution that involves first getting all records from one table creating a derived table and then performing a search on the next table would be better.
                            Please test it and post the result and the EXPLAIN output here.

                            2.
                            Long term solution buy more RAM!!!!
                            If you are on a 32bit system then it's unfortunate since you can't utilize any more than 4GB, but with such big database and I'm guessing performance constraints you should at least consider for the future moving it to 64Bit and more RAM.

                            Comment


                            • #15
                              Stein71 : Have you run &quot;OPTIMIZE TABLE vtpWordScore;&quot;?
                              No I thought it was useless because I created a new database &quot;vs2&quot; reimported struct+datas from script; anyway I launched the command but times are same as before.

                              2. Also try rewriting your query this way
                              Changed few things to make it run :
                              - removed 2nd &quot;where&quot;
                              - replaced ws0.Word_Id =41 by vtpWordScore.Word_Id =41
                              Sadly it still takes time to execute (20 sec) when query is not in query_cache.

                              I found some things witch may help :
                              1- The use of SQL_NO_CACHE seems so save some seconds (hard to be sure because of query_cache keeps me from doing same test two times with same execution time)

                              2- SHOW PROFILE source FOR QUERY 1; gives us :

                              | Copying to tmp table | 21.697665 | exec | sql_select.cc | 1999 |</pre>

                              All the time is there so :
                              - why does mysql do that ?
                              - Is there a way to keep it from doing this copy or if not :
                              - Is there a way to force him to copy to a &quot;memory&quot; tmp table ?
                              Last edited by xaff; 04-14-2013, 12:34 PM.

                              Comment

                              Working...
                              X