Preparing Choosing Storage Systems for MySQL talk for Percona Live in Washington,DC I ran into great paper called Sane SAN 2010 by James Morle from Scale Abilities – and Oracle consulting company. It is worth to read for variety of reason yet for this post I wanted to mention what James calls “Busy” Oracle database application when it comes to IO consumption:
It is applications either using Over 10.000 IOs Per second (latency sensitive) or using Over 500MB/s bandwidth (bandwidth sensitive). I wonder how many of MySQL Users are running applications of these scale, for single MySQL Server ? Please comment if you do.
I believe we see more of applications busy from IOPs numbers, especially now when we easily can get Flash storage which allows to get well over 10K IOPS without going to external storage. 500MB/sec would be exceptional for traditional MySQL as MySQL optimizer with its nested loops joins does not favor sequential scans even for analytic queries executions. Also lack of parallel query execution makes limits how much IO full table scan queries can consume.
If we look at single Server these numbers are quite high however if we look at total numbers for “sharded” MySQL the numbers are not that high – many what would be considered medium scale MySQL installations would get there. With MySQL we seems to prefer to run more low end server and storage configurations.
I believe there are several reasons for that. First because MySQL can be used for free we do not need to optimize the hardware for license costs. With Oracle license costs based on number of CPUs we want to make sure these CPUs are busy and as such need to run storage powerful enough to achieve this. In MySQL world we very often run our CPUs very underutilized while storage is completely used.
Second reason I believe is ability of MySQL to use powerful system (including powerful storage) – Is quite limited with single thread replication as well as other restrictions. As Vadim points out in this and this whitepapers you can get a lot better performance with multiple MySQL instances on powerful storage. When you move to multiple instances and you have to “shard” your data in some way anyway there is no difference whenever you run these instances on single Server or many and as such you can use less powerful servers if only it is more cost effective.
Finally at MySQL we often start “sharding” or splitting data other way because operationally working with large table and data sizes in MySQL can be painful, lacking online operations to alter table, defragmentation etc. Yes there are replication based and trigger based techniques to archive this but it is not the same thing. How many of us are running 5TB per MySQL instance ? 20TB ?
Over 50TB ?
I believe this is to continue to be significant difference between MySQL and Oracle deployments, yet as MySQL is getting better and better handling large scale systems (both in terms of performance and operationally) and as MySQL is making its way into Enterprise Developments I expect us to see more and more Large Single Instance developments as it often allows applications to be a lot more simple.