Despite being 25 years old, the TPC-C benchmark can, in my opinion, still provide an interesting and intensive workload for a database. It runs multi-statement transactions and is write-heavy. We also decided to use Sysbench 1.0, whose much more flexible Lua scripting allows us to implement a TPC-C-like workload.
For a long time, we used the tpcc-mysql (https://github.com/Percona-Lab/tpcc-mysql) tool for performance evaluations of MySQL and Percona Server for MySQL, but we recognize that the tool is far from intuitive or simple to use. We hope this adaptation for Sysbench will make it easier to run.
Although we are trying to mimic the TPC-C standard guidance, there are some major variations we decided to introduce.
First, we do not use fully random text fields. These are hard to compress, and we want to be able to evaluate different compression methods in InnoDB and MyRocks.
Second, we allow you to use multiple table sets, compared to the standard one set of nine tables. The reason is that we want to test workloads on multiple tables and to somewhat emulate SaaS environments, where multiple clients share the same database.
So, there is a DISCLAIMER: this benchmark script was not validated or certified by the TPC organization. The results obtained cannot be called TPC-C results, and they are not comparable with any official TPC-C results: http://www.tpc.org/information/results_spreadsheet.asp
How to run the benchmark:
We tried to make it as easy as possible to run the benchmark. You still need to take the following steps:
- Make sure you have Sysbench 1.0+ properly installed
- Get our scripts, located at https://github.com/Percona-Lab/sysbench-tpcc
- Prepare the dataset
The command line might look like this:
./tpcc.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbt --threads=20 --tables=10 --scale=100 prepare
--scale is the number of warehouses, and
--tables is the number of table sets.
As a rough estimation, 100 warehouses with 1 table set produces about 10GB of data in non-compressed InnoDB tables (so 100 warehouses with 10 table sets gives about 100GB).
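Based on that approximation, you can sketch the expected uncompressed dataset size before loading. This is only a back-of-the-envelope helper; the ~10GB per 100 warehouses per table set figure is the rough estimate above, not an exact value:

```shell
# Rough dataset-size estimate, assuming ~10GB per 100 warehouses
# per table set in uncompressed InnoDB (the approximation above).
scale=100    # --scale: number of warehouses
tables=10    # --tables: number of table sets
echo "approx. $(( scale * tables * 10 / 100 )) GB of uncompressed InnoDB data"
```

For --scale=100 and --tables=10 this prints approx. 100 GB, matching the estimate above.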
The nice thing about Sysbench is that it can load data in parallel (using N --threads). It also accepts some extra options. For example, for MyRocks:
./tpcc.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbr --threads=20 --tables=10 --scale=100 --use_fk=0 --mysql_storage_engine=rocksdb --mysql_table_options='COLLATE latin1_bin' --trx_level=RC prepare
As MyRocks does not support foreign keys, we pass --use_fk=0. MyRocks in Percona Server for MySQL also does not support REPEATABLE READ, so we use READ COMMITTED (--trx_level=RC). MyRocks additionally requires a binary collation for string fields in indexes (--mysql_table_options='COLLATE latin1_bin').
To run the benchmark, execute:
./tpcc.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbt --time=300 --threads=64 --report-interval=1 --tables=10 --scale=100 run
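A common pattern in benchmarking is to sweep the run over several concurrency levels. This hypothetical wrapper is a dry run: it only echoes the command it would execute for each thread count (the thread values and the echo are illustrative; replace echo with the real invocation to actually run the sweep):

```shell
# Hypothetical sweep over concurrency levels; 'echo' makes this a dry run.
# Drop the 'echo' to execute the benchmark at each thread count.
for t in 1 2 4 8 16 32 64; do
  echo ./tpcc.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbt \
    --time=300 --threads=$t --report-interval=1 --tables=10 --scale=100 run
done
```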
We hope a TPCC-like workload for Sysbench will be helpful for database performance evaluations. Now that Sysbench includes support for PostgreSQL, these TPCC-like benchmarks should allow for more consistent performance comparisons.