Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

Creating Custom Sysbench Scripts

April 25, 2019

Author

Matthew Boehm

Benchmarks

Insight for DBAs

Insight for Developers

Share this Post:

Sysbench has long been established as the de facto standard when it comes to benchmarking MySQL performance. Percona relies on it daily, and even Oracle uses it when blogging about new features in MySQL 8. Sysbench comes with several pre-defined benchmarking tests. These tests are written in an easy-to-understand scripting language called Lua. Some of these tests are called: oltp_read_write, oltp_point_select, tpcc, oltp_insert. There are over ten such scripts to emulate various behaviors found in standard OLTP applications.

But what if your application does not fit the pattern of traditional OLTP? How can you continue to utilize the power of load-testing, benchmarking, and results analysis with sysbench? Just write your own Lua script!

For those that want to jump ahead and see the full source, here you go.

Sysbench API

To start off, each Lua script you create must implement three core sysbench-Lua API functions. These are thread_init, thread_done, and event. You can read the comments in the code below for the meaning of each function and what is happening inside.

-- Called by sysbench one time to initialize this script
function thread_init()

  -- Create globals to be used elsewhere in the script

  -- drv - initialize the sysbench mysql driver
  drv = sysbench.sql.driver()

  -- con - represents the connection to MySQL
  con = drv:connect()
end

-- Called by sysbench when script is done executing
function thread_done()
  -- Disconnect/close connection to MySQL
  con:disconnect()
end

-- Called by sysbench for each execution
function event()

  -- If user requested to disable transactions,
  -- do not execute BEGIN statement
  if not sysbench.opt.skip_trx then
    con:query("BEGIN")
  end

  -- Run our custom statements
  execute_selects()
  execute_inserts()

  -- Like above, if transactions are disabled,
  -- do not execute COMMIT
  if not sysbench.opt.skip_trx then
    con:query("COMMIT")
  end
end

-- Called by sysbench one time to initialize this script

function thread_init()

-- Create globals to be used elsewhere in the script

-- drv - initialize the sysbench mysql driver

drv = sysbench.sql.driver()

-- con - represents the connection to MySQL

con = drv:connect()

end

-- Called by sysbench when script is done executing

function thread_done()

-- Disconnect/close connection to MySQL

con:disconnect()

end

-- Called by sysbench for each execution

function event()

-- If user requested to disable transactions,

-- do not execute BEGIN statement

if not sysbench.opt.skip_trx then

con:query("BEGIN")

end

-- Run our custom statements

execute_selects()

execute_inserts()

-- Like above, if transactions are disabled,

-- do not execute COMMIT

if not sysbench.opt.skip_trx then

con:query("COMMIT")

end

That’s all pretty simple and should function as a good template in your scripts. Now let’s take a look at the rest of the script.

Sanity checks and options

Now let’s get into the core code. At the top you’ll find the following sections:

if sysbench.cmdline.command == nil then
   error("Command is required. Supported commands: run")
end

sysbench.cmdline.options = {
  point_selects = {"Number of point SELECT queries to run", 5},
  skip_trx = {"Do not use BEGIN/COMMIT; Use global auto_commit value", false}
}

if sysbench.cmdline.command == nil then

error("Command is required. Supported commands: run")

end

sysbench.cmdline.options = {

point_selects = {"Number of point SELECT queries to run", 5},

skip_trx = {"Do not use BEGIN/COMMIT; Use global auto_commit value", false}

}

The first section is a sanity check to make sure the user actually wants to run this test. Other test scripts, mentioned above, support commands like prepare, run, and cleanup. Our script only supports run as the data we are using is pre-populated by our core application.

The second section allows us, the script writer, to let the user pass some options specific to our test script. In the code above, we can see an option for the number of SELECT statements that will be ran on each thread/iteration (default is 5) and the other option allows the user to disable BEGIN/COMMIT if they so desire (default is false). If you want more customization in your script, simply add more options. You’ll see how to reference these parameters later on.

The queries

Now it is time to define the custom queries we want to execute in our script.

-- Array of categories to be use in the INSERTs
local page_types = { "actor", "character", "movie" }

-- Array of COUNT(*) queries
local select_counts = {
  "SELECT COUNT(*) FROM imdb.title"
}

-- Array of SELECT statements that have 1 integer parameter
local select_points = {
  "SELECT * FROM imdb.title WHERE id = %d",
  "SELECT * FROM imdb.comments ORDER BY id DESC limit 10",
  "SELECT AVG(rating) avg FROM imdb.movie_ratings WHERE movie_id = %d",
  "SELECT * FROM imdb.users ORDER BY RAND() LIMIT 1"
}

-- Array of SELECT statements that have 1 string parameter
local select_string = {
  "SELECT * FROM imdb.title WHERE title LIKE '%s%%'"
}

-- INSERT statements
local inserts = {
  "INSERT INTO imdb.users (email_address, first_name, last_name) VALUES ('%s', '%s', '%s')",
  "INSERT INTO imdb.page_views (type, viewed_id, user_id) VALUES ('%s', %d, %d)"
}

-- Array of categories to be use in the INSERTs

local page_types = { "actor", "character", "movie" }

-- Array of COUNT(*) queries

local select_counts = {

"SELECT COUNT(*) FROM imdb.title"

}

-- Array of SELECT statements that have 1 integer parameter

local select_points = {

"SELECT * FROM imdb.title WHERE id = %d",

"SELECT * FROM imdb.comments ORDER BY id DESC limit 10",

"SELECT AVG(rating) avg FROM imdb.movie_ratings WHERE movie_id = %d",

"SELECT * FROM imdb.users ORDER BY RAND() LIMIT 1"

}

-- Array of SELECT statements that have 1 string parameter

local select_string = {

"SELECT * FROM imdb.title WHERE title LIKE '%s%%'"

}

-- INSERT statements

local inserts = {

"INSERT INTO imdb.users (email_address, first_name, last_name) VALUES ('%s', '%s', '%s')",

"INSERT INTO imdb.page_views (type, viewed_id, user_id) VALUES ('%s', %d, %d)"

}

The above code defines several arrays/lists of different queries. Why is this necessary? Later on in the code, we will have to parse each SQL statement and populate/replace the various parameters with randomly generated values. It would not do us any good to repeat the same SELECT * FROM fooTable WHERE id = 44 every time, now would it? Certainly not. We want to generate random numbers and have our queries select from the entire dataset.

Some queries have no parameters, some have integer-based, and some string-based. We will handle these differently below, which is why they are in different arrays above. This method also allows for future expansion. When you want to run additional queries within the script, just add another line to each array; no need to change any other code.

Parse and execute

The function below, execute_selects, will be called from the parent function, event, which we discussed earlier in the post. You can see for-loops for each of the three SELECT categories we created above. The comments inline should help explain what is happening. Note the use of the user-provided option –point-selects in the second loop below, which we created previously in the ‘Sanity and Options’ section.

function execute_selects()

  -- Execute each simple, no parameters, SELECT
  for i, o in ipairs(select_counts) do
    con:query(o)
  end
	
  -- Loop for however many queries the
  -- user wants to execute in this category
  for i = 1, sysbench.opt.point_selects do
	
    -- select random query from list
    local randQuery = select_points[math.random(#select_points)]
	
    -- generate random id and execute
    local id = sysbench.rand.pareto(1, 3000000)
    con:query(string.format(randQuery, id))
  end
	
  -- generate random string and execute
  for i, o in ipairs(select_string) do
    local str = sysbench.rand.string(string.rep("@", sysbench.rand.special(2, 15)))
    con:query(string.format(o, str))
  end

end

function execute_selects()

-- Execute each simple, no parameters, SELECT

for i, o in ipairs(select_counts) do

con:query(o)

end

-- Loop for however many queries the

-- user wants to execute in this category

for i = 1, sysbench.opt.point_selects do

-- select random query from list

local randQuery = select_points[math.random(#select_points)]

-- generate random id and execute

local id = sysbench.rand.pareto(1, 3000000)

con:query(string.format(randQuery, id))

end

-- generate random string and execute

for i, o in ipairs(select_string) do

local str = sysbench.rand.string(string.rep("@", sysbench.rand.special(2, 15)))

con:query(string.format(o, str))

end

Two more things to mention for this code. First, you will notice the use of sysbench.rand.pareto to generate a random number between 1 and 3,000,000. For our dataset, we know that each table referenced in all queries relating to WHERE id = ? has that many number of rows, at minimum. This is specific to our data. Your values will certainly be different. Second, notice the use of sysbench.rand.string, and string.rep. The string.rep segment will generate a string comprised of ‘@’ symbols, between 2 and 15 characters long. That string of ‘@’ symbols will then be passed to sysbench.rand.string, which will swap out each ‘@’ for a random alphanumeric value. For example, ‘@@@@@@’ could be changed to ‘Hk9EdC’ which will then replace the ‘%s’ inside the query string (string.format) and be executed.

Handle inserts

Our INSERT statements require values. Again, sysbench calls the function execute_inserts from event on each iteration. Inside execute_inserts, we generate some fake string data using built-in functions described above.

Those strings are then formatted into the SQL and executed.

function create_random_email()
  local username = sysbench.rand.string(string.rep("@",sysbench.rand.uniform(5,10)))
  local domain = sysbench.rand.string(string.rep("@",sysbench.rand.uniform(5,10)))
  return username .. "@" .. domain .. ".com"
end

function execute_inserts()

  -- generate fake email/info
  local email = create_random_email()
  local firstname = sysbench.rand.string("first-" .. string.rep("@", sysbench.rand.special(2, 15)))
  local lastname = sysbench.rand.string("last-" .. string.rep("@", sysbench.rand.special(2, 15)))

  -- INSERT for new imdb.user
  con:query(string.format(inserts[1], email, firstname, lastname))

  -- INSERT for imdb.page_view
  local page = page_types[math.random(#page_types)]
  con:query(string.format(inserts[2], page, sysbench.rand.special(2, 500000), sysbench.rand.special(2, 500000)))

end

function create_random_email()

local username = sysbench.rand.string(string.rep("@",sysbench.rand.uniform(5,10)))

local domain = sysbench.rand.string(string.rep("@",sysbench.rand.uniform(5,10)))

return username .. "@" .. domain .. ".com"

end

function execute_inserts()

-- generate fake email/info

local email = create_random_email()

local firstname = sysbench.rand.string("first-" .. string.rep("@", sysbench.rand.special(2, 15)))

local lastname = sysbench.rand.string("last-" .. string.rep("@", sysbench.rand.special(2, 15)))

-- INSERT for new imdb.user

con:query(string.format(inserts[1], email, firstname, lastname))

-- INSERT for imdb.page_view

local page = page_types[math.random(#page_types)]

con:query(string.format(inserts[2], page, sysbench.rand.special(2, 500000), sysbench.rand.special(2, 500000)))

end

Example run

$ sysbench imdb_workload.lua 
    --mysql-user=imdb --mysql-password=imdb 
    --mysql-db=imdb --report-interval=1 
    --events=0 --time=0 run

WARNING: Both event and time limits are disabled, running an endless test
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 1
Report intermediate results every 1 second(s)
Initializing random number generator from current time


Initializing worker threads...

Threads started!

[ 1s ] thds: 1 tps: 15.96 qps: 177.54 (r/w/o: 112.71/31.92/32.91) lat (ms,95%): 158.63 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 1 tps: 15.01 qps: 169.09 (r/w/o: 109.06/30.02/30.02) lat (ms,95%): 137.35 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 1 tps: 26.00 qps: 285.00 (r/w/o: 181.00/52.00/52.00) lat (ms,95%): 108.68 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 1 tps: 15.00 qps: 170.00 (r/w/o: 108.00/32.00/30.00) lat (ms,95%): 164.45 err/s: 0.00 reconn/s: 0.00

$ sysbench imdb_workload.lua

--mysql-user=imdb --mysql-password=imdb

--mysql-db=imdb --report-interval=1

--events=0 --time=0 run

WARNING: Both event and time limits are disabled, running an endless test

sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:

Number of threads: 1

Report intermediate results every 1 second(s)

Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 1s ] thds: 1 tps: 15.96 qps: 177.54 (r/w/o: 112.71/31.92/32.91) lat (ms,95%): 158.63 err/s: 0.00 reconn/s: 0.00

[ 2s ] thds: 1 tps: 15.01 qps: 169.09 (r/w/o: 109.06/30.02/30.02) lat (ms,95%): 137.35 err/s: 0.00 reconn/s: 0.00

[ 3s ] thds: 1 tps: 26.00 qps: 285.00 (r/w/o: 181.00/52.00/52.00) lat (ms,95%): 108.68 err/s: 0.00 reconn/s: 0.00

[ 4s ] thds: 1 tps: 15.00 qps: 170.00 (r/w/o: 108.00/32.00/30.00) lat (ms,95%): 164.45 err/s: 0.00 reconn/s: 0.00

And there we have it! Custom queries specific to our application and dataset. Most of the sysbench parameters are self-explanatory, but let me talk about –report-interval=1 which shows statistics every 1 second. Normally sysbench does not output stats until the end of the run, however, the example execution will run forever (–events=0 –time=0) so we need stats to show all the time. You can adjust the parameters to your liking. For instance, if you only want to run a test for 5 minutes, set –events=0 –run-time=300.

Conclusion

Sysbench is a very well designed application that allows you to load-test your MySQL instances using pre-defined and custom queries. Using the Lua scripting language, you can create just about any scenario to fit your needs. The above is just one example that we use within Percona’s Training and Education department. It is by no means an exhaustive example of all of the capabilities of sysbench-Lua.

—
Photo by Lachlan Donald on Unsplash