TPCC-Like Workload for Sysbench 1.0

March 5, 2018

Author

Vadim Tkachenko

In this post I’ll look at some of our recent work for benchmark enthusiasts: a TPCC-like workload for Sysbench (version 1.0 or later).

In my opinion, despite being 25 years old, the TPC-C benchmark can still provide an interesting, intensive workload for a database. It runs multi-statement transactions and is write-heavy. We decided to use Sysbench 1.0 because its much more flexible Lua scripting allows us to implement a TPCC-like workload.

For a long time, we used the tpcc-mysql (https://github.com/Percona-Lab/tpcc-mysql) tool for performance evaluations of MySQL and Percona Server for MySQL, but we recognize that the tool is far from being intuitive and simple to use. So we hope the adaptation for Sysbench will make it easier to run.

Although we are trying to mimic the TPC-C standard guidance, there are some major variations we decided to introduce.

First, we do not use fully random text fields. These are hard to compress, and we want to be able to evaluate different compression methods in InnoDB and MyRocks.

Second, we allow you to use multiple table sets, compared to the standard one set of nine tables. The reason is that we want to test workloads on multiple tables and to somewhat emulate SaaS environments, where multiple clients share the same database.

So, a DISCLAIMER: this benchmark script has not been validated or certified by the TPC organization. The results obtained cannot be called TPC-C results, and they are not comparable with any official TPC-C results: http://www.tpc.org/information/results_spreadsheet.asp

How to run the benchmark:

We tried to make it as easy as possible to run the benchmark. You still need to take the following steps:

1. Make sure you have Sysbench 1.0+ properly installed

2. Get our scripts, located at https://github.com/Percona-Lab/sysbench-tpcc

3. Prepare the dataset

4. Run the benchmark

The command line might look like this:

./tpcc.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbt --threads=20 --tables=10 --scale=100 prepare

Here --scale is the number of warehouses, and --tables is the number of table sets.

As a rough estimate, 100 warehouses with 1 table set produce about 10GB of data in non-compressed InnoDB tables (so 100 warehouses with 10 table sets give about 100GB).
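The rule of thumb above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes that size grows linearly with both options, at about 10GB per 100 warehouses per table set. The function name estimate_gb is made up for illustration, and real sizes will vary with the storage engine and compression.

```shell
#!/bin/sh
# Rough dataset size estimate for non-compressed InnoDB tables.
# Assumption: ~10GB per 100 warehouses per table set, scaling linearly.
# estimate_gb is a hypothetical helper, not part of sysbench-tpcc.
estimate_gb() {
    scale=$1    # value passed to --scale (number of warehouses)
    tables=$2   # value passed to --tables (number of table sets)
    # Integer arithmetic, so the result is rounded down to whole GB
    echo $(( scale * tables * 10 / 100 ))
}

echo "Estimated size: $(estimate_gb 100 10)GB"
```

Treat the result only as a starting point for sizing disk space before running prepare.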

The nice thing about Sysbench is that it can load data in parallel (using N --threads). It also allows some extra options. For example, for MyRocks:

./tpcc.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbr --threads=20 --tables=10 --scale=100 --use_fk=0 --mysql_storage_engine=rocksdb --mysql_table_options='COLLATE latin1_bin' --trx_level=RC prepare

MyRocks does not support foreign keys, so we set --use_fk=0. MyRocks in Percona Server for MySQL also does not support REPEATABLE-READ, so we use READ-COMMITTED (--trx_level=RC). Finally, MyRocks requires a binary collation for string fields in indexes (--mysql_table_options='COLLATE latin1_bin').
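If you switch between engines often, the engine-specific options described above can be kept in one place. This is only a sketch: engine_options is a hypothetical helper name, and the flags it prints are exactly the ones from the MyRocks command above. It prints the options for copy and paste rather than executing anything.

```shell
#!/bin/sh
# Print the extra "prepare" options for a given storage engine.
# engine_options is a hypothetical helper, not part of sysbench-tpcc.
engine_options() {
    case "$1" in
        rocksdb)
            # No foreign keys, binary collation, READ-COMMITTED isolation
            echo "--use_fk=0 --mysql_storage_engine=rocksdb --mysql_table_options='COLLATE latin1_bin' --trx_level=RC"
            ;;
        innodb)
            # No extra options, as in the first prepare command
            echo ""
            ;;
        *)
            echo "unknown engine: $1" >&2
            return 1
            ;;
    esac
}

engine_options rocksdb
```

The printed string contains quotes, so paste it into the command line rather than expanding it through an unquoted variable.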

To run the benchmark, execute:

./tpcc.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbt --time=300 --threads=64 --report-interval=1 --tables=10 --scale=100 run
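Both the prepare and run commands above pass the same connection options and the same --tables and --scale values. One way to keep them in sync is a small wrapper like the sketch below. The names DRY_RUN, COMMON and tpcc_cmd are assumptions made for this example. By default the script only prints the commands, so it can be tried without a database.

```shell
#!/bin/sh
# Keep the options shared by "prepare" and "run" in one place.
# DRY_RUN, COMMON and tpcc_cmd are hypothetical names used only in this sketch.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

COMMON="--mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-db=sbt --tables=10 --scale=100"

tpcc_cmd() {
    # $1 = phase-specific options, $2 = sysbench command (prepare or run)
    if [ "$DRY_RUN" = "1" ]; then
        echo "./tpcc.lua $COMMON $1 $2"
    else
        # Word splitting of $COMMON and $1 is intentional here
        ./tpcc.lua $COMMON $1 $2
    fi
}

tpcc_cmd "--threads=20" prepare
tpcc_cmd "--time=300 --threads=64 --report-interval=1" run
```

Run it with DRY_RUN=0 from the sysbench-tpcc directory to execute the two phases for real.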

We hope a TPCC-like workload for Sysbench will be helpful for database performance evaluations. Now that Sysbench includes support for PostgreSQL, these TPCC-like benchmarks should allow for more consistent performance comparisons.

Happy benchmarking!

1 Comment

fan shen

7 years ago

Dear Mr. Tkachenko,
Following your steps, I have set up a sysbench 1.1 test environment. trx_level=RC is also needed in the run command to avoid the gap lock error.

For the TPCC test, after adding some performance parameters to my.cnf, all transaction percentages in the test result are OK; only a few of the response times (at least 90% passed) have NG status. I will try to resolve the remaining NG results.
