Insert Random Data into Tables Using mysql_random_data_load

Insert Random Data into Tables Using mysql_random_data_load

PREVIOUS POST
NEXT POST

Insert Random Data into TablesIn this short blogpost, we’ll show you how to use the mysql_random_data_load tool to insert random data into tables. This is a great aide in testing when you have empty tables and need them to be populated with data. We’ve all done it manually (the INSERT INTO … VALUES … way), but that is surely a waste of time. It can add up to a lot if you need to test tables with many columns, or even worse, with foreign keys.

Before saying anything else, we would like to mention that this tool is still under development, and that it will insert random data into tables. Be aware if running it in a production environment!

mysql_random_data_load is a tool created by PerconLabs.

Note: PerconaLabs and Percona-QA are open source GitHub repositories for unofficial scripts and tools created by Percona staff. While not covered by Percona support or services agreements, these handy utilities can help you save time and effort.

Now, back to the disclaimer-free zone, the tool in question is named mysql_random_data_load, and can be currently found in the Percona Lab GitHub repository:

https://github.com/Percona-Lab/mysql_random_data_load

The README.md file has a lot of information on it already, so feel free to pause this reading and come back after you’ve skimmed through it. If you are eager to start using it, though, we give you the crash course below.

We offer two binaries in the releases tab, one for Darwin and another one for Linux. We will show examples of the linux_amd64 build here, but they are the same for the darwin_amd64 one.

First, get the latest build (0.1.6 as of this writing):

We will use the sakila database schema for the tests, so the second and final step is to download and install it.

Ok, we are now ready to test inserting some rows with random data! Let’s try populating the actor table first. Since its primary key is a SMALLINT, we have a maximum of 65535 rows.

Let’s see what the data looks like:

Adding data to tables with Foreign Keys

Now that we have the actor table filled with data, let’s try the foreign key functionality, which is one of the tool’s biggest strengths (since coming up with data that makes sense in this context is even more laborious).

If we look at the sakila structure, a good candidate for trying this is film_actor. Apart from the actor table, it also needs the film table to have data (due to FK constraints), which in turn needs the language table to have data. Easy enough!

In this last example, we already used the foreign key functionality without even modifying the command we used for “regular” tables. You can disregard the notes about the triggers in this case.

We are now ready to fill the film_actor table. Let’s see two new variables: –bulk-size and –max-fk-samples. The former can help us with creating INSERTs with as many rows as we want, which helps expedite the process if many are needed. The latter chooses the number of samples from the film and actor tables, to use for the FK constraints. Note that these two variables already have defaults, so we can choose not to include them and the tool will still work.

A quick test shows that things went well:

Specifying queries per second

Starting from version 0.1.6, there is a new experimental feature to insert rows at a specific rate: –qps <queries per second>. This option could be used when you need not only data but also at a specific rate (for example, to test a monitoring system like PMM). Since the total number of rows to be inserted is still in effect, this option makes more sense when –bulk-size=1 is used.

Summary

As we have seen, it is really easy to insert random data into tables so you can later use them in your tests. There is still more work to do on the tool, but we have found that it is already of great help. We would love to hear back from you on it either here or, even better, via the issues GitHub page directly. If there are any use-cases or functionality you can think of, let us know!

PREVIOUS POST
NEXT POST

Share this post

Comments (5)

  • hasan.ovuc Reply

    I got the error message in below when I tried.

    2018/01/12 15:42:19 Cannot set time zone to UTC: Error 1045: Access denied for user ”@’127.0.0.1′ (using password: NO)

    January 12, 2018 at 7:45 am
    • Agustin G Reply

      Hi Hasan,

      What exact command are you using? From the message, I believe you are not setting the password correctly (or at all?). You can check credentials by using the mysql client, and then using the same in the mysql_random_data_load tool:

      shell> mysql –host= –user= –password=

      A better platform for these questions is our forums: https://www.percona.com/forums/ or directly over the tool’s issues page if you think this could be a bug: https://github.com/Percona-Lab/mysql_random_data_load/issues. Let me know if you create a new thread, and we can continue over there.

      January 12, 2018 at 10:49 am
      • Hasan Ovuc Reply

        Hi Agustin,

        I ask the question in the forum, later. I think it is not a bug. Only just something going wrong, maybe db configuration.

        Regards,
        Hasan

        January 12, 2018 at 3:37 pm
  • sbester Reply

    A true feature would be to eliminate the need for this program totally. In other words, let the program output the most compact SQL to populate any given table with X rows, without needing itself.. Perfect for standalone testcases….. 😉

    January 12, 2018 at 2:40 pm

Leave a Reply