Performance with INSERT on huge tables

  • Performance with INSERT on huge tables

    I have a very simple table consisting of two 64-bit integers, "startingpoint" and "endingpoint".
    I'm trying to insert 68 million rows into the table.
    While inserting the rows, memory consumption rises very fast, and once memory is full, the time per insert rises to a very high level.
    The table structure is below. As you see, startingpoint and endingpoint together form the primary key.
    Most of the time I will be doing insertions into the table, and it HAS to check for unique keys (starting/endingpoint).
    When the table is complete (has reached the specified number of rows), I have to do a single SELECT. The performance of the SELECT doesn't really matter as it only has to be done once, but I need to be able to sort it by the endingpoint column.

    CREATE TABLE `z_chains_999` (
    `startingpoint` bigint(8) unsigned NOT NULL,
    `endingpoint` bigint(8) unsigned NOT NULL,
    PRIMARY KEY (`startingpoint`,`endingpoint`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=1 DELAY_KEY_WRITE=1;

    Below is the output from my small app, which inserts 1 million chains with a single INSERT ... VALUES ()... call.
    On the left is the total number of chains inserted (1 million per call), then the time it takes to insert the 1 million rows, and next the number of rows actually inserted (some rows can be duplicates).
    As you can see below, the time it takes to insert 1 million rows increases drastically after 24 million rows are in the DB. At this point, the MySQL server consumes 1.1 GB of memory, but the data itself is only 2x8 bytes * 25 million rows = 381 MB.
    I will receive around 250,000 rows every minute, so this is not acceptable. The key_buffer is set to "key_buffer = 700M" in my.ini. If I raise this to 1000M, I can insert a few more million rows, but not enough. I need to be able to insert at least 400 or 500 million rows into a single table.
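
    As a sanity check on the numbers above, the arithmetic can be run directly in MySQL (a sketch; the 500-million-row line is a projection, not a measurement):

    ```sql
    -- Raw data at 25 million rows: two 8-byte integers per row.
    SELECT 2 * 8 * 25000000 / 1024 / 1024 AS data_mb;        -- ~381.47
    -- Index entries for the composite primary key also hold both
    -- columns, so at 500 million rows the key data alone is at least:
    SELECT 2 * 8 * 500000000 / 1024 / 1024 / 1024 AS key_gb; -- ~7.45
    ```

    Even before MyISAM's per-entry index overhead, that is well beyond the server's 2 GB of RAM, so no key_buffer setting can keep the whole index in memory.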

    Any idea how I can further optimize the system to handle more rows? The server has 2 GB of memory.
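
    For reference, the bulk insert described above presumably looks something like this (values hypothetical; INSERT IGNORE assumed, since duplicate chains are skipped rather than raising an error):

    ```sql
    -- One statement per 1 million chains; duplicates on the
    -- (startingpoint, endingpoint) primary key are silently dropped.
    INSERT IGNORE INTO z_chains_999 (startingpoint, endingpoint)
    VALUES (1, 2), (3, 4), (5, 6) /* ... ~1 million value pairs ... */;
    ```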

    1000000 chains inserted [10 seconds] (999992 rows inserted)
    2000000 chains inserted [10 seconds] (999992 rows inserted)
    3000000 chains inserted [11 seconds] (999997 rows inserted)
    4000000 chains inserted [11 seconds] (999987 rows inserted)
    5000000 chains inserted [11 seconds] (999991 rows inserted)
    6000000 chains inserted [12 seconds] (999984 rows inserted)
    7000000 chains inserted [12 seconds] (999989 rows inserted)
    8000000 chains inserted [12 seconds] (999993 rows inserted)
    9000000 chains inserted [11 seconds] (999983 rows inserted)
    10000000 chains inserted [12 seconds] (999987 rows inserted)
    11000000 chains inserted [12 seconds] (999987 rows inserted)
    12000000 chains inserted [12 seconds] (999990 rows inserted)
    13000000 chains inserted [12 seconds] (999988 rows inserted)
    14000000 chains inserted [13 seconds] (999991 rows inserted)
    15000000 chains inserted [12 seconds] (999987 rows inserted)
    16000000 chains inserted [12 seconds] (999984 rows inserted)
    17000000 chains inserted [12 seconds] (999989 rows inserted)
    18000000 chains inserted [12 seconds] (999992 rows inserted)
    19000000 chains inserted [13 seconds] (999990 rows inserted)
    20000000 chains inserted [12 seconds] (999988 rows inserted)
    21000000 chains inserted [12 seconds] (999989 rows inserted)
    22000000 chains inserted [13 seconds] (999987 rows inserted)
    23000000 chains inserted [12 seconds] (999987 rows inserted)
    24000000 chains inserted [12 seconds] (999986 rows inserted)
    25000000 chains inserted [632 seconds] (999992 rows inserted)
    26000000 chains inserted [1174 seconds] (999994 rows inserted)

    Content of my.ini

    [mysqld]
    port = 3306
    socket = /tmp/mysql.sock
    skip-locking
    key_buffer = 700M
    max_allowed_packet = 32M
    table_cache = 512
    sort_buffer_size = 10M
    read_buffer_size = 2M
    read_rnd_buffer_size = 8M
    myisam_sort_buffer_size = 64M
    thread_cache_size = 8
    query_cache_size = 32M
    # Try number of CPU's*2 for thread_concurrency
    thread_concurrency = 4
    bulk_insert_buffer_size = 100M
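
    For what it's worth, key_buffer can also be resized at runtime while experimenting, instead of editing my.ini and restarting (the server variable is key_buffer_size; key_buffer is its older my.ini alias):

    ```sql
    -- Equivalent of key_buffer = 1000M, without a restart.
    SET GLOBAL key_buffer_size = 1000 * 1024 * 1024;
    SHOW VARIABLES LIKE 'key_buffer_size';
    ```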

  • #2
    I encountered a similar problem, and I believe it is due to the index.

    very weird symptoms, though...

    I have 100 million rows, and load time consistently increased to the point that after 24 hours it still was not complete. The weird thing is the CPU usage also basically went to zero. Not sure if it was from the process being in constant I/O wait or what (not sure how to measure that on WinXP), but it was a real dog.

    If anyone has ideas, I would love to hear them...

    Dropped the index and loaded the entire table in 3.5 hours without getting cute with the load application. Created the index post-load in 1 hour.

    Seems pretty good. 'Til I needed to create the next index... Slowwwwwwww... Don't remember the exact time, but it was over 12 hours...

    So I guess if you only ever need a single index, you are OK. If you need more, you will need to get more creative...
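
    The drop-then-rebuild approach described above looks roughly like this for a secondary index (index name hypothetical; note the composite primary key itself cannot be dropped this way without losing the duplicate check):

    ```sql
    -- Drop the secondary index before the bulk load...
    ALTER TABLE z_chains_999 DROP INDEX idx_endingpoint;
    -- ...load all the rows, then rebuild the index in one pass.
    ALTER TABLE z_chains_999 ADD INDEX idx_endingpoint (endingpoint);
    ```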



    • #3
      It seems to me like when memory gets "dirty", it does not get flushed out quickly enough to make room for new rows...

      Is there some setting, like on Oracle, to set the percentage of dirty pages to clean down to?
