Announcement

Announcement Module
Collapse
No announcement yet.

myisamchk performance issues.

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • myisamchk performance issues.

    System: Ubuntu 2.6.22-14.46-server (i386 variant) Athlon 64 X2 Dual Core 2.1GHz, 4GB Ram, 250GB WD2500AAJS system drive, 2 x 500GB WD5000AAKS in software RAID-0 for databases, MySQL Server version: 5.0.45-Debian_1ubuntu3.1-log Debian etch distribution.

    I have a database with 10 identically defined MyISAM tables containing from 6.6M to 731M rows (down from 1 table with 1.9B rows in first incarnation!).

    There will be no insertions into the tables once created - it is a research project where the investigator wants to look for relationships; changes will be accommodated by rebuilding all tables from a new analysis.

    I used a scheme I saw documented by Peter somewhere (can't find it now) of creating the un-indexed tables with "load data infile ...", then copying the .frm and .MYI files from an empty table with the indexes defined, followed by "myisamchk --quick --recover ..."


    The following table is used as a source of .MYI and .frm files:

    CREATE TABLE Events_Empty_Indexed (
    x smallint(5) NOT NULL,
    y smallint(5) NOT NULL,
    start_day int(11) NOT NULL default '0',
    year smallint,
    month tinyint,
    day tinyint,
    num_days tinyint unsigned default '1',
    total_precip smallint(5) unsigned default '0',
    max_precip smallint(5) unsigned default '0',
    dist char(11) default '',
    deltas char(11) default '',
    KEY idx1 (x, y, year, month),
    KEY idx3 (start_day),
    KEY idx6 (total_precip),
    KEY idx7 (dist)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

    and this is the definition of all the initial, un-indexed tables:

    CREATE TABLE Events_9 (
    x smallint(5) NOT NULL,
    y smallint(5) NOT NULL,
    start_day int(11) NOT NULL default '0',
    year smallint,
    month tinyint,
    day tinyint,
    num_days tinyint unsigned default '9',
    total_precip smallint(5) unsigned default '0',
    max_precip smallint(5) unsigned default '0',
    dist char(11) default '',
    deltas char(11) default ''
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;


    While I'm not too unhappy with the performance of the "load data..." things can be improved later: I'm currently seeing from 215K to 288K records/sec insertions.

    Question 1 is "what parameters affect "load data infile..." operations?"
    There are no indexes defined at the time so are any key_* parameters important?
    How about read_buffer* and write_buffer*?
    The records are mainly integers so compression probably isn't necessary?

    My main concern is the performance of myisamchk. I stop mysql entirely while I build the tables and no other processes are consuming much resources. I've tried a bunch of different parameter settings for sort_buffer_size (from 100M to 3G), key_buffer_size (from 100M to 3G), read_buffer_size (from 25M to 50M), write_buffer_size (from 25M to 50M), --recover and --parallel-recover, always --quick, and I am disappointed with the performance (36K rows/sec on the big tables up to 92K/sec on the smallest). It takes 5-1/2 hours to build indexes for 731M rows! and all night for the whole database.

    I experimented with a bunch of different settings for buffer sizes etc. on 1 table repeatedly, but found little difference in the time taken, perhaps 10% at the most (parallel won by a very slim margin).

    I recall reading in the forum somewhere that a sort on disk might be faster in some cases than trying to cache a large amount of index. These creation runs need to process full tables, so maybe I'm allocating too much memory for buffers?

    So the accumulated question 2 is: what parameters affect myisamchk the most, and any suggestions for speeding up the index creatation would be gratefully accepted.
Working...
X