One question that comes up very often is when one should use a SAN with MySQL. SANs are especially popular among people who are used to Oracle or other enterprise database systems, which are quite commonly deployed on them.
My question in such cases is always: what exactly are you trying to get by using a SAN? Depending on the goals, the advice can be quite different.
Manageability: A SAN allows you to manage storage easily compared to directly attached storage. If one box starts to require more IO, you can allocate more resources to it. This is a pattern that is somewhat typical for an enterprise managing a lot of relatively small MySQL installations, or for a managed hosting provider. For large MySQL web installations using sharding or replication, IO needs are typically rather predictable. There are also often nice monitoring tools available to see disk utilization, latencies, queue depths, etc. There is, however, also a downside compared to directly attached storage in terms of performance management: the SAN is a shared resource (it has more or less shared parts), so it is possible for one application to impact another, which means you cannot always analyze the local system's performance independently of everything else.
Performance: This is perhaps the most interesting one. I constantly see SANs sold as magic equipment that will magically solve all performance problems, where "magically" means nobody knows exactly how. I would suggest always asking yourself where you expect these performance gains to come from. If we are speaking about a purely disk-based SAN (no flash), the drives are the same drives you will see in directly attached storage, and each can do only so many IO requests per second. True, the SAN can have many more hard drives than directly attached storage, but usually at a much higher price per drive. The next possible advantage is software: can the SAN have some very smart software, instead of the simple RAID you use with directly attached storage, that can magically improve performance? There are cases when it can, but you surely should ask this question. For example, if you share the same physical drives among applications that have different peak usage patterns, or something similar, there can be an advantage. The question, again, is whether it is large enough to justify the price premium. The third one is caching. A SAN can have a lot of cache, though servers can typically have more: if you can afford a SAN, you should be able to afford 128GB or so of memory on the server too, which makes read caching in server memory more efficient, while write buffering can also be done quite efficiently by MySQL and a local RAID controller (with BBU). It is worth saying there is one benefit to read caching on the SAN: after a MySQL/server restart, the warmup can be quite a bit shorter than with local storage.
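The spindle argument above is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch (the IOPS-per-drive figure and drive counts are assumptions for illustration, not measurements; plug in your own hardware numbers):

```python
# Back-of-envelope: random IOPS capacity comes from spindle count,
# regardless of whether the drives sit in a server or in a SAN.
IOPS_PER_DRIVE = 150   # assumed figure for one 15K RPM drive

das_drives = 8         # typical directly attached setup
san_drives = 48        # hypothetical SAN shelf with the same drive type

das_iops = das_drives * IOPS_PER_DRIVE   # 1200
san_iops = san_drives * IOPS_PER_DRIVE   # 7200

# The SAN "wins" on raw capacity (6x here), but only because it has
# 6x the drives -- compare the price per drive before calling it a gain.
```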
It is also worth noting that a SAN does not only have advantages compared to directly attached storage; it has downsides too. A SAN typically has better throughput (because of the larger number of drives) but longer latency, because of the extra processing (and the extra network trip) involved. This particularly hurts log writes, which are very latency critical.
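You do not have to take latency claims on faith: a small probe that mimics the log-write pattern (small sequential writes, each followed by fsync()) lets you compare a local volume against a SAN mount directly. A minimal sketch, with hypothetical mount points in the example calls:

```python
import os
import time


def fsync_latency(path, writes=100, size=512):
    """Time small synchronous writes under `path`, roughly mimicking
    a transaction-log write pattern; returns avg seconds per write."""
    fname = os.path.join(path, "fsync_probe.dat")
    buf = b"\0" * size
    with open(fname, "wb") as f:
        start = time.time()
        for _ in range(writes):
            f.write(buf)
            f.flush()
            os.fsync(f.fileno())  # force the write to stable storage
        elapsed = time.time() - start
    os.remove(fname)
    return elapsed / writes


# Example (paths are placeholders for your own mount points):
# print(fsync_latency("/var/lib/mysql"))   # local RAID volume
# print(fsync_latency("/mnt/san_volume"))  # SAN-backed volume
```

On a local RAID controller with a battery-backed write cache the per-write figure is typically a fraction of a millisecond; the extra round trip to a SAN shows up directly in this number.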
Now what about MySQL/InnoDB specifics? First, in MySQL, if you are looking for durable transactions, log write latency is triply important compared to other database systems. MySQL has to do more than one fsync() for the log because it internally implements XA to synchronize with the binary log, and if you are looking for maximal data consistency (as SAN users often are) you had better have the binary log flushed on commit too. MySQL also has broken group commit (for which we have a partial fix), meaning concurrent transaction commits have to be serialized.
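For reference, the fully durable configuration discussed above corresponds to the following my.cnf fragment; these are standard MySQL options, but whether you can afford every fsync they imply depends on your log-write latency:

```ini
[mysqld]
# Flush and fsync the InnoDB log at every transaction commit
innodb_flush_log_at_trx_commit = 1

# Enable the binary log and sync it on commit as well
# (this is the extra fsync the XA synchronization pays for)
log_bin     = mysql-bin
sync_binlog = 1
```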
Second, MySQL is often not able to submit a lot of outstanding IO requests, which is needed to utilize a SAN efficiently. It is especially a problem if you are running a replication slave, as the slave thread issues a single IO request most of the time, making it sensitive to latency rather than throughput. You should also plan on any executing query having only one IO outstanding at a time if InnoDB or MyISAM tables are used. There is read-ahead functionality, but it is usually not able to drive the number of outstanding requests significantly higher. For write-intensive workloads you will have a problem with the number of outstanding writes too, though for that at least we have a fix.
Scalability: "We need more IO or space than we can get from the 6-8 built-in drives… so let's get a SAN" is the story here. As I mentioned in the Performance section, you may be surprised that performance does not "scale" as much as you expected. You should also consider external directly attached storage, which is a cheaper alternative in most cases and is especially good if your IO needs are predictable: say you need 3TB of space per box, and an external disk enclosure handles that great. My advice for scaling IO capacity with a SAN (and this applies to directly attached storage too) is to really understand what you are trying to scale and analyze things appropriately. I have seen many times that a SAN was a poor choice because it did not allow scaling at all (e.g., an attempt to scale replication), or because adding memory or SSDs would have been a better choice.
High Availability: Some people are used to using SAN-based active-passive clusters for availability purposes, and they look to do the same with MySQL. This works, though I do not think it is the best choice. A SAN is just another object that can fail completely. Sure, it is more reliable, just as a server with multiple power supplies, ECC memory, chipkill and RAID is more reliable than one without these technologies, but in my HA architectures it is still a single "point" that can experience "failure". Note that even if the hardware is fully redundant, the firmware may have a glitch causing failure and data corruption (and this is not just speculation; some clients really had it).
Even if you do not run any "cluster", having a SAN means you can always "connect" the storage to another node, whether the server fails or you just want to upgrade to a bigger box; this can be more complicated with directly attached storage.
If you want SAN-like high availability, I would strongly consider DRBD, which can do storage-level replication between directly attached volumes. Now, with support for Dolphin Interconnect, it can also offer very decent latency. It also has the benefit of being a system you can actually split in two (e.g., for recovery purposes). Sure, it needs double the storage, but you get two copies of the data too, and with a SAN you probably pay more than a 2x premium anyway.
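For the curious, a DRBD setup of this kind boils down to a short resource definition. A minimal, hypothetical drbd.conf sketch (DRBD 8.x style; hostnames, devices and addresses are placeholders for your environment):

```
resource mysql {
  protocol C;                  # synchronous replication: commit-safe

  on db1 {
    device    /dev/drbd0;      # the replicated block device MySQL uses
    disk      /dev/sdb1;       # directly attached volume backing it
    address   10.0.0.1:7789;
    meta-disk internal;
  }
  on db2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7789;
    meta-disk internal;
  }
}
```

The MySQL datadir lives on /dev/drbd0, which is writable on the active node and kept in sync on the standby.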
Yet another approach is to use MySQL replication with something such as MMM or Flipper. I think this actually works best for most cases (unless the asynchronous nature of MySQL replication is a showstopper), and it also solves the other big pain: semi-online MySQL/OS upgrades and schema changes.
Backups: A SAN has a number of advantages for operations (and I guess operations people are often the ones who push for it). It often offers snapshots with low overhead, allowing you to take backups conveniently. Also, if you keep the last uncompressed backup (or last snapshot) on the SAN, you may be able to reduce recovery time significantly by switching over to that backup: no data copy is required, which can be a significant concern for 1TB+ databases.
My Take: I view SAN as a solution for niche circumstances when it comes to MySQL. It may be that you need it, but it is not a silver bullet or the best solution for all problems at once. When evaluating a SAN, you should also evaluate external directly attached storage, SSDs, and increased system memory, as well as sharding and replication, as scaling solutions.