MySQL-Memcached or NOSQL Tokyo Tyrant – part 3
Matt Yonkovit
This is part 3 of our series. In part 1 we talked about boosting performance with memcached on top of MySQL, in part 2 we talked about running 100% outside the database with memcached, and now in part 3 we are going to look at a possible solution to free you from the database. The solution I am going to discuss here is Tokyo Cabinet and Tyrant.
I am not going to give you a primer or tutorial on Tyrant and Cabinet; there are plenty of these out there already. Instead I want to see what sort of performance we can get compared to MySQL and memcached, and later on other NoSQL solutions. Tokyo actually supports several types of databases: hash databases, which are very similar to memcached; a table database, which is similar to your classic database tables where you can add a where clause and search individual columns; and a ton more “database options” beyond just those two. Again, my goal is not to make this a Tokyo Tyrant tutorial but rather to show one potential role it can play.
So if we can get performance similar to memcached with Tokyo Tyrant when using disk-based hash tables, it would be a compelling replacement for our application here. It should provide the same interface and the same access we saw in memcached, but with disk persistence. So let’s look at the numbers:
Tyrant’s disk-based hash was almost 2x faster than combining memcached and MySQL, and about 20% slower than the all-memory memcached approach. So for this particular application I would have been much better off not storing my data in MySQL and instead looking outside the database for an answer. Now sure, there are other reasons you may want to keep data in the database… but I am trying to get you to think about your application and whether those reasons are really valid. Helping clients pick the right solution is one of the things we do here at Percona. If an application requires a database, great, but if there is a better solution we want to suggest it. It’s our goal to make your application perform optimally.
Finally, one concern you have to have is the scalability of your storage solution. As load, number of threads, and data size increase, how does performance change? One knock on Tokyo versus memcached is that Tokyo is not distributed by default. Now that’s not to say we could not shard it based on a hash, or even build an API with the capability built in (or use the memcached clients, which work!)… but native support is lacking. It does support replication, which could make for some rather interesting architectures in the future.
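To make the "shard it based on a hash" idea concrete, here is a minimal Python sketch of client-side sharding. It is not what I ran in these tests. The host names are hypothetical, port 1978 is assumed to be the Tyrant default, and `pick_server` is just an illustrative helper that decides which server a key belongs to; it does no network I/O.

```python
import hashlib

# Hypothetical Tyrant endpoints (assumption); 1978 is assumed to be the default port.
SERVERS = [
    ("tyrant-a.example", 1978),
    ("tyrant-b.example", 1978),
    ("tyrant-c.example", 1978),
]

def pick_server(key, servers=SERVERS):
    """Map a key to one server with simple modulo hashing."""
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

if __name__ == "__main__":
    for key in ("user:1", "user:2", "user:3"):
        print(key, "->", pick_server(key))
```

The catch with plain modulo hashing is that adding or removing a server remaps most keys, which is why many memcached clients offer consistent hashing instead.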
So let’s look at some scalability benchmarks. My server resources are rather limited, but I thought I should try throwing more threads and work at the server until it hit its limit and fell over dead. It’s interesting to see the number of transactions that occur with a given number of threads. Let’s look at some of these:
As expected, the smaller buffer pool struggled. (Why a smaller buffer pool? This simulates a much larger data set. A BP of 256M with 1GB of data can give similar performance to 20GB of data and a 5GB BP.) So with a 256M BP and 4GB of memcached we were well off the numbers we hit with a 4GB BP plus 4GB of memcached, which is expected. Adding more threads, even up to 128, increased overall throughput, but my load average on the server hit 40 and my CPU was pegged across the board. Also interesting is that I started to hit bottlenecks in MySQL/InnoDB when I had enough memory but increased the threads from 64 to 128. As time permits I should revisit this and look at increased datasets, and look for areas where Tyrant may stumble a bit.
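One way to read the buffer pool sizing argument is that what matters is the fraction of the data that fits in memory. That is my simplification, not a measured rule, and `cache_ratio` below is a hypothetical helper that only does the arithmetic for the two configurations mentioned above.

```python
def cache_ratio(buffer_pool_mb, data_mb):
    """Fraction of the data set that fits in the buffer pool."""
    return buffer_pool_mb / data_mb

# 256M buffer pool over 1GB of data (the test setup)
small = cache_ratio(256, 1 * 1024)

# 5GB buffer pool over 20GB of data (the larger system being simulated)
large = cache_ratio(5 * 1024, 20 * 1024)

if __name__ == "__main__":
    print("test setup:     ", small)
    print("simulated setup:", large)
```

Both setups keep a quarter of the data in the buffer pool, which is why the small one is a reasonable stand-in for the large one. Real workloads with hot spots will of course behave differently from a uniform one.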
Bottom line: given a specific application and data pattern, sometimes a relational database is not the appropriate place for storing data. A tool like Tokyo Tyrant may not be for everyone or every application, but neither is a relational database. Before building your next application, try to understand whether an RDBMS is really needed or not.
How did I do these tests?
The above numbers were run with 32 threads. Tyrant was started with 8 threads and 128M of memory, memcached (version 1.4) was started with 16 threads, and MySQL was 5.1 with XtraDB. Each environment had 2 tables, each with 2 million rows. The data was identical. Memcached and Tyrant stored a comma-delimited string to represent the row. MySQL was running with 256M allocated to the InnoDB buffer pool unless otherwise noted.
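For anyone wondering what storing a row as a comma-delimited string looks like, here is a rough Python sketch. It is not the benchmark code. The column values are made up, and using the `csv` module so that values containing commas survive the round trip is my assumption about a sensible way to do it.

```python
import csv
import io

def encode_row(values):
    """Join column values into one comma-delimited string."""
    buf = io.StringIO()
    # lineterminator="" keeps the stored value free of a trailing newline.
    csv.writer(buf, lineterminator="").writerow(values)
    return buf.getvalue()

def decode_row(text):
    """Split a stored string back into a list of column values (all strings)."""
    return next(csv.reader([text]))

if __name__ == "__main__":
    row = ["42", "Smith, John", "2009-10-01"]  # hypothetical columns
    stored = encode_row(row)
    print(stored)
    print(decode_row(stored))
```

Note that everything comes back as a string, so the application has to convert numeric and date columns itself, which is part of the price of leaving the database.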
What’s next? Well, next I am going to try to continue this series by exploring and benchmarking other NoSQL options and comparing them to database-based solutions. I think showing the performance of a couple of different Tokyo database formats would also be interesting. What other solutions are people interested in? I know I have gotten a lot of requests for Cassandra numbers, but what else? Drop a comment and let me know!