InnoDB vs MyISAM vs Falcon benchmarks – part 1

Several days ago MySQL AB made new storage engine Falcon available for wide auditory. We cannot miss this event and executed several benchmarks to see how Falcon performs in comparison to InnoDB and MyISAM.
The second goal of benchmark was a popular myth that MyISAM is faster than InnoDB in reads, as InnoDB is transactional, supports Foreign Key and has an operational overhead. As you will see it is not always true.


For benchmarks I used our PHPTestSuite which allows to test wide range tables and queries.
The script and instruction are available here:
https://www.percona.com/blog/files/benchmarks/phptestsuite.stable.tar.gz

We used table “normal” table structure which corresponds to typical structure you would see in OLTP or Web applications – medium size rows, auto increment primary key and couple of extra indexes.

In this benchmark we used only read (SELECT) queries with different typical data access patterns:
primary key single row lookup, primary key range lookup, same access types for primary key and full table scans.

To highlight different properties of storage engines we tested ranges with and without LIMIT clause, and tested queries which
need to read the data or can only be satisfied by reading the index.

This benchmark is so called “micro” benchmark which concentrates on particular simple storage engine functions and we use it to see performance and scalability in this simple cases. We also use CPU bound workload in this case (no disk IO) to see how efficient storage engines are in terms of CPU usage. In real life workload results are likely to be very different.

The schema and queries are described here

Used hardware

CentOS release 4.4 (Final)
2 Ñ… Dual Core Intel XEON 5130

model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
stepping : 6
cpu MHz : 1995.004
cache size : 4096 KB

16GB of RAM

MySQL version
We used MySQL 5.1.14-beta sources for MyISAM / InnoDB
and MySQL 5.1.14-falcon bitkeeper tree
bk://mysql.bkbits.net/mysql-5.1-falcon for Falcon
(Please note this is a first release of Falcon and it is still in alpha stage and performance parameters may vary a lot in next releases)
Compilation parameters:

mysqld startup params:

Method of benchmark:
1. Prepare table with 1,000,000 records (about 350Mb of data on disk)
2. Run each query for 1, 4, 16, 64, 128, 256 concurrent threads.
3. For each thread perform a warm-up run (duration 180 sec), and then
run three effective runs (duration of each is 60 sec).
As the final result we get a maximal result of three runs.

The raw numbers are available here:
https://www.percona.com/blog/files/benchmarks/innodb-myisam-falcon.html
(Note: This benchmark is synthetic micro benchmarks focusing on particular simple data access patterns. Results for your workload are likely to be different.)

There are interesting results I want to show graphics with comments

READ_PK_POINT
READ_PK_POINT
Query: SELECT name FROM $tableName WHERE id = %d
The very common query with access by primary key.
InnoDB is faster than MyISAM by 6-9%.
Falcon shows very bad scalabilty.

READ_KEY_POINT
READ_KEY_POINT
Query: SELECT name FROM $tableName WHERE country_id = %d
In this case Falcon is the best, because Falcon uses a tricky technic to retrieve rows (more
details with Jim Starkey’s comments in Part 2).
There MyISAM shows bad scalability with increasing count of thread. I think the reason is pread system
call MyISAM uses to access data and retrieving from OS cache is not scaled.

READ_KEY_POINT_LIMIT
READ_KEY_POINT_LIMIT
Query: SELECT name FROM $tableName WHERE country_id = %d LIMIT 5
The same query as previous but with LIMIT clause.
Due to Falcon’s way of key access Falcon cannot handle LIMIT properly and that is why
we see bad performance. We hope the performance of LIMIT queries will be fixed before release.
MyISAM shows stable result.
InnoDB is better than MyISAM by 58% in case with 4 threads, but does not scale good enough.
Perhaps there is still a problem with InnoDB mutexes.

READ_KEY_POINT_NO_DATA
READ_KEY_POINT_NO_DATA
Query: SELECT state_id FROM $tableName WHERE country_id = %d
This query is similar to previous READ_KEY_POINT with only different the values of accessed column is stored in key. MyISAM and InnoDB handle this case and retrive the value only from key.
InnoDB is better by 25-30%.
Falcon needs an access to data beside key access, and most likely this will not be fixed, as this is
specific Falcon’s way to handle multi-versioning. I think this is a big weakness of Falcon, as ‘using index’ is very common optimization we use in our practice.

READ_KEY_POINT_NO_DATA_LIMIT
READ_KEY_POINT_NO_DATA_LIMIT
Query: SELECT state_id FROM $tableName WHERE country_id = %d LIMIT 5
The previous query but with LIMIT.
Again the LIMIT is bad for Falcon.
InnoDB is better than MyISAM by 87% in case with 4 threads but drops down very fast.

READ_PK_POINT_INDEX
READ_PK_POINT_INDEX
Query: SELECT id FROM $tableName WHERE id = %d
Simple but very quick query to retrieve value from PK.
The results for InnoDB and MyISAM are comparable and I think this shows both engines are maximally optimized and the result is maximal that can be reached for this query.
Falcon scales pretty bad and there is a big room for optimization.

READ_PK_RANGE
READ_PK_RANGE
Query: SELECT min(dob) FROM $tableName WHERE id between %d and %d
Access by range of PK values.
MyISAM scales very bad, and reason is the same as for READ_KEY_POINT queries.
InnoDB is better than MyISAM by 2-26 times
and than Falcon by 1.64 – 3.85 times.

READ_PK_RANGE_INDEX
READ_PK_RANGE_INDEX
Query: SELECT count(id) FROM $tableName WHERE id between %d and %d
MyISAM scales good here, because of access only to key column and ‘pread’ syscall is not used.

READ_KEY_RANGE
READ_KEY_RANGE
Query: SELECT name FROM $tableName WHERE country_id = %d and state_id between %d and %d
As in case with READ_KEY_RANGE Falcon is the best here.
Falcon’s resuts better than InnoDB by 10-30%
MyISAM drops down with 128-256 threads

READ_KEY_RANGE_LIMIT
READ_KEY_RANGE_LIMIT
Query: SELECT name FROM $tableName WHERE country_id = %d and state_id between %d and %d LIMIT 50
Again Falcon does not hanle LIMIT and the results are much worse.

READ_KEY_RANGE_NO_DATA
READ_KEY_RANGE_NO_DATA
Query: SELECT city FROM $tableName WHERE country_id = %d and state_id between %d and %d

READ_KEY_RANGE_NO_DATA_LIMIT
READ_KEY_RANGE_NO_DATA_LIMIT
Query: SELECT city FROM $tableName WHERE country_id = %d and state_id between %d and %d LIMIT 50

READ_FTS
READ_FTS
Query: SELECT min(dob) FROM $tableName
The hardest query performs a scan of all million rows.
InnoDB is better than MyISAM by ~30% with 4-16 threads, but MyISAM scales a bit better in this case.
InnoDB is better than Falcon by 2-3 times.

Share this post

Comments (119)

  • Moyed Ansari

    Nice Stats, very useful
    Anyways thanks :)

    January 8, 2007 at 12:00 am
  • najdnc

    I clearly think that the data given is relevant to everybody under the sun . Many thanks .
    Locksmith Nashville

    January 8, 2007 at 12:00 am
  • Johan Bergström

    i can’t even check out the sources; it’s been bugging me the entire weekend 🙁

    OK-root OK
    ERROR-BAD CMD: export, Try help

    January 8, 2007 at 2:26 am
  • peter

    As I remember Larry has just changed the protocol, so get new free client here:
    http://www.bitkeeper.com/Hosted.Downloading.html

    January 8, 2007 at 2:50 am
  • David Shrewsbury

    Very good post and very informative. The InnoDB vs. MyISAM performance numbers were just as interesting as the performance of Falcon.

    -Dave

    January 8, 2007 at 6:04 am
  • Vadim

    Johan,

    You need new free client and
    MySQL moved the falcon tree to new root
    bk://mysql.bkbits.net/mysql-5.2-falcon

    But I cannot clone it too:

    ./bkf clone bk://mysql.bkbits.net/mysql-5.2-falcon mysql-5.2-falcon
    ERROR-unable to lock repo for export, try later.

    Perhaps there is a maintenance, and the tree will be available later.

    January 8, 2007 at 7:27 am
  • Xaprb

    I wanted to compare disk usage too. For the same table structure and data, how large is each engine’s data and index file? Do you have this information?

    January 8, 2007 at 8:03 am
  • peter

    Falcon is shy telling its disk usage in SHOW TABLE STATUS:

    show table status\G
    *************************** 1. row ***************************
    Name: normal
    Engine: Falcon
    Version: 10
    Row_format: Dynamic
    Rows: 1000
    Avg_row_length: 0
    Data_length: 10000
    Max_data_length: 0
    Index_length: 0
    Data_free: 0
    Auto_increment: 1000001
    Create_time: NULL
    Update_time: NULL
    Check_time: NULL
    Collation: latin1_swedish_ci
    Checksum: NULL
    Create_options:
    Comment:
    1 row in set (0.00 sec)

    The file size was 354MB which is data + indexes together.

    MyISAM:

    Data_length: 167519584
    Index_length: 132239360

    Innodb:

    Data_length: 195772416
    Index_length: 156794880

    January 8, 2007 at 9:14 am
  • bp

    Interesting results. Obviously Falcon isnt as mature as the other 2 engines. Though I am surprised about how InnoDB is better than MyISAM in a lot of areas. I’m not really informed about this stuff but, do you think MyISAM would perform better on smaller (single proc) machines? I would be interested in seeing how each performs on different machines like single proc, dual proc, and quad proc.

    January 8, 2007 at 9:50 am
  • LenZ

    Thanks for the hint with regards to the updated BK client. I have updated my blog entry on how to compile MySQL from source. However, the locking problem still persists, we are working with the BitKeeper people on resolving this issue.

    January 8, 2007 at 9:51 am
  • pabloj

    Thanks for the great benchmarks!
    Hope Jim will take this into account and work hard on Falcon, as this results are a bit of a “Falcon-killer”
    Also hope that MySQL will set for one or two preferred engines focusing efforts on those, as I see all this storage engine plethora as essentially duplicated efforts and unneeded complexity added, I mean that MyISAM and one of the transactional engines (InnoDB, Falcon, whatever they this is better) should be enough for most cases, leaving SoliDB and other for special cases.
    It would be interesting to run your benchmarks against Firebird/Vulcan to see if and how Jim’s new brainchild improves over the older designs.
    Regards

    January 8, 2007 at 9:52 am
  • LenZ

    Just to let you know: we are aware of the problems with creating a clone with the free BK client and have contacted BitKeeper support to resolve this ASAP. Sorry for the inconvenience.

    January 8, 2007 at 10:13 am
  • Charles

    These numbers are devastating. I fear that they’ll be used as FUD against Falcon when it’s mature and actually performs well.

    That being said, it is interesting to see that InnoDB ends up having equal or better read performance compared to MyISAM. That really changes my perceptions of both of them…

    January 8, 2007 at 11:28 am
  • Vadim

    2 bp:

    I don’t know how MyISAM will perform on single- or dual- CPU boxes in 5.1.14 release,
    at least I expect the scalability will be good enough.

    January 8, 2007 at 1:36 pm
  • peter

    Yes, results are not great for Falcon so far. The worst problem though with LIMIT is already identified and Jim will look into fixing it.

    The good thing is we have now these results early and Jim will have time to look into these.

    Someone can misuse results, which is very frequent with benchmarks results but it is not much you can do about it.

    January 8, 2007 at 2:20 pm
  • Breezes

    Interesting result of Falcon and between MyISAM and InnoDB. However I still think Falcon is helpful, since this test is carried out when all data resides in memory, so the advantage of Falcon’s record cache will not show.

    BTW: Why Falcon needs an access to data beside key access in the READ_KEY_POINT_NO_DATA. I know Falcon use private in-memory caches to hold modified index entries for each update transactions, but I still don’t know why. Could someone be so kind to tell me the reason?

    January 9, 2007 at 2:18 am
  • MySQL

    Results are interesting, especialy MyISAM vs InnoDB.

    January 9, 2007 at 2:36 am
  • peter

    Breezes,

    Actually this should show one of benefits of Falcon record cache – retrieving record from record cache generally should be faster as you do not need to lookup page by row_id and then look for the row on the page itself which could be implemented less efficiently.

    Regarding why Falcon can’t only read data from Index as I understand there are two reasons.

    1) Falcon stores collation values in the index not the data itself which means it is not always to perform reverse conversion and get real data from the index.

    2) Falcon index structure is rather minimal (to keep indexes short) this means you can’t say from index records itself it it should be visible inside current transaction etc.

    I personally thing this is serious matter as I frequently use covering indexes for optimization.

    January 9, 2007 at 3:08 am
  • BOLK

    Very useful. Thank you!

    January 9, 2007 at 7:57 am
  • Aleksey Kishkin

    Hi! Nowadays one million is pretty small number of records for database. Did you try similar benchmarks against, (say) 100 000 000 records? Or – (as we usualy did) against database 3 times bigger than RAM?

    January 15, 2007 at 3:31 am
  • peter