Join performance of MyISAM and Innodb

May 29, 2006

Author

Peter Zaitsev

Benchmarks


We had a discussion today that involved benchmarks of join speed for the MyISAM and Innodb storage engines under a CPU-bound workload, that is, when the data is small enough to fit in memory and therefore in the buffer pool.

I tested a very simple table with about 20,000 rows in it on 32-bit Linux. The columns “id”, “i” and “c” were populated with the same integers, so the same job can be done using different kinds of columns: the primary key, an indexed integer column and an indexed char column. The query is also trivial. The point was to make sure it is not an index-covered query, so it has to read the rows, and that it does not return many rows. I varied the join clause to use the id, i and c columns in turn.

CREATE TABLE `t1` (
  `id` int(10) unsigned NOT NULL default '0',
  `i` int(10) unsigned NOT NULL default '0',
  `c` char(15) default NULL,
  `pad` char(8) default NULL,
  PRIMARY KEY  (`id`),
  KEY `i` (`i`),
  KEY `c` (`c`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

select count(t1.pad),count(t2.pad) from t1,t1 t2 where t1.id=t2.id;
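
The three runs differ only in the column used in the join condition. As a rough sketch of that setup (the original benchmark script is not shown, so this is not it), the Python below uses the standard-library sqlite3 module as a stand-in to populate the table as described and run all three variants. SQLite says nothing about MyISAM or Innodb timings; it only checks that the three queries do the same logical work. The pad value, the index names and the helper function names are assumptions made for illustration.

```python
import sqlite3

ROWS = 20000  # "about 20,000 rows" in the post


def build_table(conn, rows=ROWS):
    # Same columns as t1; the ENGINE/CHARSET clauses are MySQL-specific and dropped.
    conn.execute(
        "CREATE TABLE t1 ("
        " id INTEGER NOT NULL PRIMARY KEY,"
        " i INTEGER NOT NULL DEFAULT 0,"
        " c CHAR(15) DEFAULT NULL,"
        " pad CHAR(8) DEFAULT NULL)"
    )
    conn.execute("CREATE INDEX idx_i ON t1 (i)")
    conn.execute("CREATE INDEX idx_c ON t1 (c)")
    # id, i and c carry the same integer; the pad content is an assumption.
    conn.executemany(
        "INSERT INTO t1 (id, i, c, pad) VALUES (?, ?, ?, ?)",
        ((n, n, str(n), "x" * 8) for n in range(1, rows + 1)),
    )


def join_query(column):
    # Only the join column changes between the three runs.
    if column not in ("id", "i", "c"):
        raise ValueError(column)
    return (
        "select count(t1.pad),count(t2.pad) from t1,t1 t2 "
        f"where t1.{column}=t2.{column}"
    )


def run_all(rows=ROWS):
    conn = sqlite3.connect(":memory:")
    build_table(conn, rows)
    results = {
        col: conn.execute(join_query(col)).fetchone() for col in ("id", "i", "c")
    }
    conn.close()
    return results


if __name__ == "__main__":
    for col, counts in run_all().items():
        print(col, counts)
```

Because every row matches exactly one row of the self-join, each variant should report the row count twice, whichever column is used.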


The results I got are as follows:

Storage Engine	Join on id	Join on i	Join on c
MyISAM	0.24s	0.27s	1.19s
Innodb	0.07s	0.30s	0.38s
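
To make the comparison easier to read, the timings from the table above can be turned into MyISAM-to-Innodb ratios. This is plain arithmetic on the published numbers, nothing more; the variable and function names are just for illustration.

```python
# Timings in seconds, copied from the results table.
timings = {
    "MyISAM": {"id": 0.24, "i": 0.27, "c": 1.19},
    "Innodb": {"id": 0.07, "i": 0.30, "c": 0.38},
}


def myisam_to_innodb_ratio(column):
    # A ratio above 1 means Innodb was faster for this join column.
    return timings["MyISAM"][column] / timings["Innodb"][column]


ratios = {col: round(myisam_to_innodb_ratio(col), 2) for col in ("id", "i", "c")}
innodb_wins = [col for col, ratio in ratios.items() if ratio > 1]

if __name__ == "__main__":
    print(ratios)
    print(innodb_wins)
```

The ratios come out at roughly 3.4 for the id join, 0.9 for the i join and 3.1 for the c join.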

As you can see, in these circumstances Innodb is actually faster than MyISAM in 2 cases out of 3. I guess the reasons are the following:

- Innodb primary key joins are very fast, as the data is clustered together with the index and this path is generally highly optimized

- Innodb builds hash indexes, which speed up index lookups by bypassing the BTREE index and using a hash, which is faster

- MyISAM compresses character keys, which makes it slower for random lookups

- MyISAM generally has lower processing overhead due to its simplicity

- MyISAM is still a bit faster for a primary key join than for a secondary key join. I guess this is because it knows for sure that no more than one row matches the index, so there is no need for MySQL to request the next row matching the index
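
To illustrate the point about compressed character keys, here is a deliberately simplified sketch of prefix compression in Python. It is not MyISAM's actual on-disk format; the function names and the block layout are made up for illustration. It only shows why a lookup in a prefix-compressed block has to decode entries in order from the start, instead of jumping to the middle the way a binary search would.

```python
import os


def compress_block(sorted_keys):
    # Each entry stores how many leading characters it shares with the
    # previous key, plus the remaining suffix.
    entries, prev = [], ""
    for key in sorted_keys:
        shared = len(os.path.commonprefix([prev, key]))
        entries.append((shared, key[shared:]))
        prev = key
    return entries


def lookup(entries, wanted):
    # An entry is only meaningful relative to the one before it, so the
    # block is decoded sequentially. Returns (found, entries_decoded).
    prev = ""
    for decoded, (shared, suffix) in enumerate(entries, start=1):
        key = prev[:shared] + suffix
        if key == wanted:
            return True, decoded
        if key > wanted:  # keys are sorted, so the search can stop here
            return False, decoded
        prev = key
    return False, len(entries)


if __name__ == "__main__":
    keys = sorted(str(n) for n in range(1000, 1016))
    block = compress_block(keys)
    print(block[:3])
    print(lookup(block, "1015"))
```

In this toy model, finding the last key of a block means decoding every entry before it, which is the kind of extra work that hurts random lookups when everything is already in memory.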

Note: This applies to a CPU-bound workload with all content fitting in memory. In other cases the situation is very different, and MyISAM compression of char keys can frequently have a positive impact on performance.

4 Comments

stefan minka

19 years ago

How different does the situation become when you join MyISAM and Innodb tables together?

Example:
– table a is MyISAM, table b is Innodb
– select … from a inner join b on a.x=b.x

Additional question: let's assume both tables are big. Will table b (Innodb) be locked by row, or as a whole?

Author

Peter Zaitsev

19 years ago

Feel free to test it, Stefan 🙂

Generally for a join, the table which gets random lookups is more important; the table which gets a full table scan or range scan contributes less to total performance.

So in our case performance will be similar to a join of two Innodb tables.

…Also no locks will be happening. Innodb does not lock rows for normal selects; consistent reads are used instead.

Jatin Mehta

16 years ago

QUERY:
SELECT SQL_CALC_FOUND_ROWS p.*,
  FLOOR(p.prodratingtotal/p.prodnumratings) AS prodavgrating,
  0 AS prodgroupdiscount,
  pi.*,
  (IF(p.prodname='gold', 10000, 0)
   + IF(p.prodcode='gold', 10000, 0)
   + ((MATCH (ps.prodname) AGAINST ('gold')) * 10)
   + MATCH (ps.prodname,ps.prodcode,ps.proddesc,ps.prodsearchkeywords) AGAINST ('gold')) AS score
FROM products p
LEFT JOIN product_images pi ON (p.productid = pi.imageprodid AND pi.imageisthumb = 1)
INNER JOIN product_search ps ON p.productid = ps.productid
WHERE p.prodvisible = 1
  AND (ps.prodcode = 'gold' OR TRUE)
  AND (MATCH (ps.prodname,ps.prodcode,ps.proddesc,ps.prodsearchkeywords) AGAINST ('gold'))
ORDER BY score DESC
LIMIT 20

EXECUTION TIME: 2.5000+ seconds

TABLES DATA:
products: 31,000 records
product_images: 92,000 records
product_search: 57,000 records

EXPLAIN COMMAND WITH ABOVE QUERY:
1 SIMPLE ps fulltext prodname prodname 0 1 Using where; Using temporary; Using filesort
1 SIMPLE p eq_ref PRIMARY,i_products_rating_vis,i_products_added_vis,i_products_sortorder_vis PRIMARY 4 shoppingcart_5521.ps.productid 1 Using where
1 SIMPLE pi ref i_product_images_imageprodid i_product_images_imageprodid 5 shoppingcart_5521.p.productid,const 1

Rick James

14 years ago

Jatin,
What is your point? Please provide SHOW CREATE TABLE for each table if you wish further discussion. A couple of comments, anyway:

Note that it starts with FULLTEXT, which is what it likes to do.

It seems very wrong to have
AND (ps.prodcode = 'gold'
AND (MATCH (…ps.prodcode,…) AGAINST ('gold'))
If prodcode = ‘gold’, then yes, it will “MATCH” ‘gold’.