October 24, 2014

How fast can MySQL Process Data

Reading Baron's post about the Kickfire appliance, and of course talking to them directly, I learned that a lot of what their product does is about beating the data processing limitations of current systems.

This raises a valid question: how fast can MySQL process (filter) data using its current architecture?
I decided to test the simplest case: if we take an in-memory table with a very narrow row and run a simple query that does simple filtering, how many rows per second will it be able to process?
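The exact table and query from the original post are not preserved here, so below is a minimal sketch of the kind of setup described, with illustrative names (t1, val): a MEMORY table with a single narrow INT column, about a million rows, and a trivial filtering query doing a full scan.

-- Assumption: the default MEMORY table size limit may be too small, so raise it.
SET max_heap_table_size = 256*1024*1024;

CREATE TABLE t1 (
  val INT NOT NULL
) ENGINE=MEMORY;

-- Seed 16 rows, then double the row count by re-inserting the table into itself.
INSERT INTO t1 VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16);
-- Running the next statement 16 times gives 16 * 2^16 = 1,048,576 rows.
INSERT INTO t1 SELECT val FROM t1;

-- The filtering query: a full table scan with a simple WHERE condition.
SELECT COUNT(*) FROM t1 WHERE val = 5;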

So we get about 0.15 sec to scan about 1,000,000 rows, which gives us a peak filtering speed of about 7M rows/sec on this Intel(R) Xeon(R) 5130 @ 2.00GHz CPU. This number is per core; in theory this box, which has 4 cores, should be able to do up to 4 times more, though in practice the scaling factor is of course smaller.

Interestingly enough, if we use a bigger table (so a smaller portion of the table fits in the CPU cache), the filtering speed stays about the same:
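The original table and timings are not shown here; a hypothetical way to build the bigger variant is simply to keep doubling the same data into a second MEMORY table (t10m is an assumed name) until it holds far more rows than fit in the CPU cache, and then rerun the same query.

SET max_heap_table_size = 1024*1024*1024;

CREATE TABLE t10m ENGINE=MEMORY AS SELECT val FROM t1;
-- Repeat until the table holds roughly 10,000,000 rows (three or four doublings from ~1M).
INSERT INTO t10m SELECT val FROM t10m;

SELECT COUNT(*) FROM t10m WHERE val = 5;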

To check the completely in-cache scenario, I created a table with just 10,000 rows and wrote a little stored procedure to make timing easier:
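The original procedure is not preserved; here is a sketch of how one might look (scan_loop, t_small, and the column name val are assumed names). It builds a 10,000-row MEMORY table and scans it repeatedly, so 1,000 iterations filter 10 million rows in total.

CREATE TABLE t_small (val INT NOT NULL) ENGINE=MEMORY;
INSERT INTO t_small SELECT val FROM t1 LIMIT 10000;

DELIMITER //
CREATE PROCEDURE scan_loop(IN iterations INT)
BEGIN
  DECLARE n INT DEFAULT 0;
  DECLARE c INT;
  WHILE n < iterations DO
    -- Same full-scan filtering query as before, just repeated many times.
    SELECT COUNT(*) INTO c FROM t_small WHERE val = 5;
    SET n = n + 1;
  END WHILE;
END //
DELIMITER ;

CALL scan_loop(1000);   -- 1,000 iterations * 10,000 rows = 10,000,000 rows filtered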

So we can get 10 million rows filtered in 1.37 sec, giving us again a bit over 7M rows/sec.

This CPU runs at 2GHz, so we spend some 280 CPU cycles per filtered row (2×10^9 cycles/sec × 1.37 sec ÷ 10^7 rows), which is not that bad considering the storage engine abstraction requires "row by row" processing, meaning function calls for each row.

Let's see what happens if we apply some per-row math to those 10,000,000 rows (to keep it simple).
(In reality I did multiple runs to get accurate results, but I show only one here.)
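The exact expression from the original run is not preserved; the idea is simply an aggregate that forces some arithmetic to be evaluated for every row of the 10M-row table, for example something like:

SELECT AVG(val), SUM(val * val) FROM t10m WHERE val > 0;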

So as you can see, once we add some math, the row scan speed is significantly affected.

I also decided to see how longer rows affect performance, so I created the following table:
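The original CREATE TABLE is not shown here; a plausible stand-in (t_wide and the padding column are assumptions) is the same integer column plus a wide CHAR column, so each row occupies on the order of 140 bytes, filled with the same 10 million values and scanned with the same query.

SET max_heap_table_size = 4*1024*1024*1024;   -- wider rows need considerably more room

CREATE TABLE t_wide (
  val INT NOT NULL,
  pad CHAR(120) NOT NULL DEFAULT ''
) ENGINE=MEMORY DEFAULT CHARSET=latin1;        -- latin1 keeps CHAR(120) at 120 bytes

INSERT INTO t_wide (val) SELECT val FROM t10m;

SELECT COUNT(*) FROM t_wide WHERE val = 5;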

So with a bit longer rows we are instantly down to 4.3M rows per second. And if we now look at the amount of memory consumed by the table, we can see the filtering speed is about 600MB/sec (roughly 140 bytes per row at 4.3M rows/sec), which is surely a small fraction of what the memory bus of this system can deliver.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. anggit says:

    Sorry, correcting the query above:

    SELECT COUNT(*) FROM table WHERE id = x AND cat_id = y;

  2. anggit says:

    Hello.. I have query:

    SELECT COUNT(*) WHERE id = x AND cat_id = y

    It's very slow at 1 million rows, even though id and cat_id have indexes. Please, how can I make it fast? I need the answer. Sorry about my English, I'm from Asia.

  3. ries says:

    I see you use ENGINE=MEMORY. This also means that the COUNT(*) you are doing is not transaction safe, as I understand it…
    Which in turn means that if you have multiple deletes/inserts/updates going on on your table, you are never sure whether the COUNT(…) value you get is correct at that point in time??
    Also, I feel doing a COUNT(*) is a weird form of 'testing'. I think what you do a bit below with the aggregates is more valid, because then MySQL really needs to scan the table rather than looking at some index (if that is optimized).

    I am also wondering why you didn't use the InnoDB storage engine instead of the MEMORY storage type. I think every serious database user wants to have something that is transaction safe, right??

    just my two cents
