The PBXT Storage Engine (http://www.primebase.com/xt/) is getting stable and we decided to benchmark it in different workloads.
This time I tested only READ queries, similar to ones in benchmark InnoDB vs MyISAM vs Falcon
The difference is I used new sysbench with Lua scripting language, so all queries were scripted for sysbench.
New sysbench you can get from svn repository:
svn co https://sysbench.svn.sourceforge.net/svnroot/sysbench
The used LUA script for sysbench
The mysql.start script with startup parameters for InnoDB and PBXT
and script to run whole loop of benchmarks
The special thanks to Benjamin Grant, John Glass and Paul Elliott for access to modern multiCPU servers, without them I would not be able to run benchmarks.
Description of benchmark:
We used table “normal” table structure which corresponds to typical structure you would see in OLTP or Web applications – medium size rows, auto increment primary key and couple of extra indexes.
CREATE TABLE IF NOT EXISTS `$tableName` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(64) NOT NULL default '',
`email` varchar(64) NOT NULL default '',
`password` varchar(64) NOT NULL default '',
`dob` date default NULL,
`address` varchar(128) NOT NULL default '',
`city` varchar(64) NOT NULL default '',
`state_id` tinyint(3) unsigned NOT NULL default '0',
`zip` varchar(8) NOT NULL default '',
`country_id` smallint(5) unsigned NOT NULL default '0',
PRIMARY KEY (`id`),
UNIQUE KEY `email` (`email`),
KEY `country_id` (`country_id`,`state_id`,`city`)
In this benchmark we used only read (SELECT) queries with different typical data access patterns:
primary key single row lookup, primary key range lookup, same access types for primary key and full table scans.
To highlight different properties of storage engines we tested ranges with and without LIMIT clause, and tested queries which
need to read the data or can only be satisfied by reading the index.
This benchmark is so called “micro” benchmark which concentrates on particular simple storage engine functions and we use it to see performance and scalability in this simple cases. We also use CPU bound workload in this case (no disk IO) to see how efficient storage engines are in terms of CPU usage. In real life workload results are likely to be very different.
The schema and queries are described here
Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
2 Ñ… Quad Core Intel XEON
cpu family : 15
model : 6
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 8
cpu MHz : 3192.222
cache size : 8192 KB
32GB of RAM
Fedora Core release 5 (Bordeaux)
2 Ñ… Dual Core AMD Opteron
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 2210
stepping : 2
cpu MHz : 1800.163
cache size : 1024 KB
4GB of RAM
We used MySQL 5.1.16-beta sources for MyISAM / InnoDB
For MyISAM / InnoDB
./configure --prefix=/usr/local/mysqltest/mysql-<RELEASE> --with-innodb
Method of benchmark:
1. Prepare table with 1,000,000 records (about 350Mb of data on disk)
2. Run each query for 1, 4, 16, 64, 128, 256, 512 concurrent threads.
3. For each thread perform a warm-up run (duration 180 sec), and then
run three effective runs (duration of each is 60 sec).
As the final result we get a maximal result of three runs.
The raw numbers are available here:
(Note: This benchmark is synthetic micro benchmarks focusing on particular simple data access patterns. Results for your workload are likely to be different.)
There are interesting results I want to show graphics with comments
Query: SELECT name FROM $tableName WHERE id = %d
Query: SELECT name FROM $tableName WHERE country_id = %d
Query: SELECT name FROM $tableName WHERE country_id = %d LIMIT 5
The same query as previous but with LIMIT clause.
Query: SELECT state_id FROM $tableName WHERE country_id = %d
This query is similar to previous READ_KEY_POINT with only different the values of accessed column is stored in key.
Interesting InnoDB scalability is pretty bad for this query. The result with 4 thread is worse then with 1 thread on both 4CPU and 8CPU boxes
Query: SELECT state_id FROM $tableName WHERE country_id = %d LIMIT 5
The previous query but with LIMIT.
Query: SELECT id FROM $tableName WHERE id = %d
Simple but very quick query to retrieve value from PK.
Query: SELECT min(dob) FROM $tableName WHERE id between %d and %d
Access by range of PK values.
PBXT is much worse here because of InnoDB clustering by primary key. InnoDB does not need an extra lookup to access to data.
Query: SELECT count(id) FROM $tableName WHERE id between %d and %d
The same comment as for previous query.
Query: SELECT name FROM $tableName WHERE country_id = %d and state_id between %d and %d
Interesting PBXT is better here by 50-80%
Query: SELECT name FROM $tableName WHERE country_id = %d and state_id between %d and %d LIMIT 50
Query: SELECT city FROM $tableName WHERE country_id = %d and state_id between %d and %d
Query: SELECT city FROM $tableName WHERE country_id = %d and state_id between %d and %d LIMIT 50
Query: SELECT min(dob) FROM $tableName
The hardest query performs a scan of all million rows.
In general PBXT looks pretty good. To be honest I did not expect such results. The one problem comparing to InnoDB is the access to data by Primary KEY, as I understand PBXT will not support clustering and will not able to reach InnoDB results, but I hope PBXT team will improve performance to make the difference a bit better.
Unfortunately InnoDB shows bad scalability for 8CPU box, in many cases the transition from 16 threads to 64 is too hard for InnoDB even with all scalability fixes exist in 5.1.16 tree. I think InnoDB team should solve this problem, I do not know Oracle plans about.
It is also interesting to check PBXT performance on INSERT/UPDATE queries taking into account PBXT way to handle such queries, I hope I will be able to test it soon.