Announcement Module
No announcement yet.

13GB DB - Performance Question

Page Title Module
Move Remove Collapse
Conversation Detail Module
  • Filter
  • Time
  • Show
Clear All
new posts

  • 13GB DB - Performance Question

    I have a 13 GB MYSQL table with the following definition:

    CREATE TABLE `WikiParagraphs` ( `ID` int(10) unsigned NOT NULL auto_increment, `Paragraph` text NOT NULL, PRIMARY KEY (`ID`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=73035281 ;

    I query it like this:

    select Paragraph from WikiParagraphs where ID in (1,2,3,4)

    the 1,2,3,4 bit comes from the Sphinx FullText engine that gives me the IDs I need that match my query within about 500 milliseconds.

    But retrieving the data itself takes approximately 3-6 seconds.

    Obviously, I'd like to speed this query up.

    Any ideas?

  • #2
    How much RAM memory do you have on the server and how much have you reserved for the InnoDB cache (innodb_buffer_size)?

    Because this is very important to get speed with innodb.

    Apart from that I have a suggestion:

    select Paragraph from WikiParagraphs where id = 1UNIONselect Paragraph from WikiParagraphs where id = 4UNIONselect Paragraph from WikiParagraphs where id = 6

    It will give const searches for each select instead of a range scan which might go faster if your smallest and biggest value in the IN() part is far apart and you don't want to retrieve that many rows.

    mysql> explain select Paragraph from WikiParagraphs where id = 3 -> UNION -> select Paragraph from WikiParagraphs where id = 4 -> UNION -> select Paragraph from WikiParagraphs where id = 6 -> ;+----+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+| 1 | PRIMARY | WikiParagraphs | const | PRIMARY | PRIMARY | 4 | const | 1 | || 2 | UNION | WikiParagraphs | const | PRIMARY | PRIMARY | 4 | const | 1 | || 3 | UNION | WikiParagraphs | const | PRIMARY | PRIMARY | 4 | const | 1 | ||NULL | UNION RESULT | | ALL | NULL | NULL | NULL | NULL | NULL | |+----+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+4 rows in set (0.00 sec)


    • #3
      Well, I'm getting 1000 results back from Sphinx and I'm using those to get about 55k of data that I then massage.

      I probably don't need all 1000 results but I need at least 55k of data returned.

      I tried doing single queries and I also tried OR = stuff but neither of those sped it up (slowed it down a bit).

      How do I tell how much memory INNODB is using for this?

      I should note that I also have MYISAM tables on the same server.


      • #4
        Ooh. wait. Just noticed that UNION there.

        That's rather interesting. I would have never thought of doing that.

        I'll give that a shot and see how it performs.


        • #5
          1000 rows was a bit more than I thought but I hope it works for you. )

          To get the info about server settings you can use this query:

          show variables like 'innodb%';


          • #6
            Well, if it's faster, I could break it down into chunks of 50 or 100 until I have enough data I guess...

            +---------------------------------+-------------------------------+| Variable_name | Value |+---------------------------------+-------------------------------+| have_innodb | YES || innodb_additional_mem_pool_size | 12582912 || innodb_autoextend_increment | 8 || innodb_buffer_pool_awe_mem_mb | 0 || innodb_buffer_pool_size | 134217728 || innodb_data_file_path | innodb/ibdata1:10M:autoextend || innodb_data_home_dir | || innodb_fast_shutdown | ON || innodb_file_io_threads | 4 || innodb_file_per_table | OFF || innodb_flush_log_at_trx_commit | 1 || innodb_flush_method | || innodb_force_recovery | 0 || innodb_lock_wait_timeout | 30 || innodb_locks_unsafe_for_binlog | OFF || innodb_log_arch_dir | || innodb_log_archive | OFF || innodb_log_buffer_size | 8388608 || innodb_log_file_size | 33554432 || innodb_log_files_in_group | 2 || innodb_log_group_home_dir | ./ || innodb_max_dirty_pages_pct | 90 || innodb_max_purge_lag | 0 || innodb_mirrored_log_groups | 1 || innodb_open_files | 300 || innodb_table_locks | ON || innodb_thread_concurrency | 8 |+---------------------------------+-------------------------------+

            The server has 4GB of memory.


            • #7
              It's not faster if you break it into 50 or 100.
              The only advantage is that you could break when you have got all data instead of reading in all 1000 rows.

              But you are only using 134MB for InnoDB_buffer_size.

              How much memory do you have available on the server?

              And are you using the server for something else like webserver or something?

              Because I would suggest that you increase the innodb_buffer_pool_size to 80% of available memory.


              • #8

                This server has 4GB of memory. I'm using as the MYISAM, INNODB and web server all in one - LOL. Too much for one server to do?



                • #9
                  Do you have a reason for using both MyISAM and InnoDB?

                  That is, are you using both FULLTEXT and transactions?
                  Or have you run into the update/select performance problem with MyISAM?

                  Otherwise I would suggest that you choose to use only one storage engine and tune the server for that one.


                  • #10
                    I'm actually not using transactions at all right now nor do I expect to be with this data.

                    The only reason I added innodb was because figured it would help to not lock the entire table when creating these 13-25gb tables. Now that they're created the point is kinda moot but converting it back to myisam (reimporting the data) is likely to take a week or more unless you know a quick way to do it?

                    Fulltext is done via the sphinx engine so I'm not using that part of myisam either.


                    • #11
                      Are you performing a lot of updates against the table?
                      Or is it basically only selects?
                      I'm guessing selects since it looks like a wiki but you never know. )

                      This is the quickest way (but you don't have any control over it):

                      ALTER TABLE yourTable ENGINE=MyISAM;

                      The only question is how long your server will take to convert it.

                      So that's why I suggest that you create a smaller table with maybe 500MB data and test on that one instead so you can get a feel for how long it takes.

                      And depending on the time estimate from your test it could be better to perform the move yourself:
                      Create a myisam copy of your table. _Without_ indexes. Those will be added later.
                      INSERT INTO yourMyISAMCopy SELECT * FROM yourInnoDBoriginal
                      Create the indexes. And make sure that you give mysql a _lot_ of memory for this step. Because create index can work in two ways and one is _very_ slow when it as to work against disk all he time.

                      Step 2 could be split into portions by
                      ... WHERE id BETWEEN 1001 AND 2000.
                      ... WHERE id BETWEEN 2001 AND 3000.

                      Or some appropriate value for how many you want to move at a time if you can't take all in one go.


                      • #12
                        I really wouldnt recommend the alter table, talking from experiance (50 gig table) it will be slow, and crummy!

                        I do queries similar to:
                        select Paragraph from WikiParagraphs where ID in (1,2,3,4, *1000)

                        and im looking at subsecond queries!

                        I think you should improve the buffers your applying to these tables sufficiently!

                        One thing that will help you, is do all the matching with sphinx, do any phpside ordering of the data and then only pull the data you actually need! This should help greatly, in terms of speed at least.


                        • #13
                          Well, since it's a static table and my other tables are in myisam I've decided to move over to myisam.

                          The problem that I'm having tho is when I try this query:

                          insert into WikiParagraphs2 select * from WikiParagraphs

                          It drops about 40 million rows...

                          I tried a mysql dump but there are odd chars that make it barf.

                          Any ideas?


                          • #14
                            You could try forcing the charsets?

                            Are both tables in the same character set anyway?

                            Oh wait, are you SURE it dropped the rows? innoDB does not keep a record of the actual amount of rows in the table, so it may be that?

                            The only way (i can think of) is to do something similar to:

                            SELECT count(*) FROM innoDBtable

                            But that may take a long time, you could try two seperate count()'s from a boolean field (do a search for true and one for false) may be quicker.


                            • #15
                              When I saw the number of rows not matching up, I did a count(*).

                              It took about 30 minutes to do that.

                              I've tried keeping the charsets the same as well as changing charsets.

                              Neither option worked.

                              I suppose I could write a php script to do this by hand but good lord that'll be slowwwwwww )