Dissection of EC2 / EBS volume

While preparing the XtraDB template for EC2, I wanted to understand what IO characteristics we can expect from an EBS volume (I am speaking about a single volume, not RAID as in my previous post). Yasufumi ran some benchmarks and pointed out an interesting behavior: there seem to be several levels of caching on an EBS volume.

Let me show you. I ran the sysbench random read IO benchmark on files ranging in size from 256M to 5GB, in steps of 256M. And, as Morgan pointed out to me, I wrote the files first to avoid the first-write penalty.

For reference, the script is:

The raw results (for an m.large instance, though m.xlarge was similar) are available at
https://spreadsheets.google.com/ccc?key=0AjsVX7AnrCYwdFlBVW9KWVJGUGFqeVdpUHY0Y0VXYXc&hl=en, see the sheet “256_5GB filesize”.
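
To make the sweep concrete, here is a minimal Python sketch of how such a run could be laid out. It is not the script used for this post: it assumes sysbench 0.4-style fileio options, the run time of 180 seconds is a made-up value, and the helper names (sweep_sizes_mb, sysbench_commands) are hypothetical. It only builds the command lines; it does not execute them.

```python
def sweep_sizes_mb(start_mb=256, stop_mb=5 * 1024, step_mb=256):
    """File sizes in MB: 256M up to and including 5GB, in 256M steps."""
    return list(range(start_mb, stop_mb + 1, step_mb))


def sysbench_commands(size_mb, run_seconds=180):
    """Build prepare/run/cleanup command lines for one file size.

    Assumption: sysbench 0.4-style options. run_seconds is an
    illustrative value, not one taken from the post.
    """
    base = [
        "sysbench",
        "--test=fileio",
        f"--file-total-size={size_mb}M",
        "--file-test-mode=rndrd",  # random read
    ]
    return {
        "prepare": base + ["prepare"],
        "run": base + [f"--max-time={run_seconds}", "--max-requests=0", "run"],
        "cleanup": base + ["cleanup"],
    }


if __name__ == "__main__":
    # Print the commands for every size in the sweep.
    for size in sweep_sizes_mb():
        commands = sysbench_commands(size)
        for step in ("prepare", "run", "cleanup"):
            print(" ".join(commands[step]))
```

The prepare step creates the test files before the read test, which is one way to get the first write out of the way; whether that matches the procedure used here is an assumption.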

Results in graph are:

You can see several distinct levels in the results:
256M-1.25G, 1.5G-2.25G, and 2.5G+.

With 1.5G-2.25G we see performance comparable to RAID10 on 4 disks, and with
2.5G+ the results are similar to single HDD performance.
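
As a rough aid for reasoning about where a given data size lands, the levels above can be written down as a small lookup. This is only a sketch: the boundaries are the measured points from this benchmark, the labels are descriptive shorthand rather than measured numbers, and the names (TIERS, tier_for) are hypothetical. Sizes that fall between two measured points are assigned to the slower level, which is an arbitrary but conservative choice.

```python
# Upper bound of each level in MB, as read off the results above.
TIERS = [
    (1280, "fastest level (256M-1.25G)"),
    (2304, "comparable to RAID10 on 4 disks (1.5G-2.25G)"),
]
SLOWEST = "similar to a single HDD (2.5G+)"


def tier_for(size_mb):
    """Return the observed performance level for a file size in MB."""
    for upper_mb, label in TIERS:
        if size_mb <= upper_mb:
            return label
    return SLOWEST


if __name__ == "__main__":
    # Walk the same 256M..5GB sweep used in the benchmark.
    for size_mb in range(256, 5 * 1024 + 1, 256):
        print(f"{size_mb}M: {tier_for(size_mb)}")
```

Since the cache sizes behind these levels are not documented, the boundaries should be treated as observations from this one set of runs, not as fixed properties of EBS.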

So we may guess that the storage layout is:

So when running InnoDB on a database bigger than 2.5G, you can expect performance similar to a single HDD, and you may want to consider a RAID setup; see my previous post,
EC2/EBS single and RAID volumes IO benchmark.


Comments (5)

  • wizardofcrowds

    This is very similar to what I found, though I got there in a much more primitive way. Here is a thread on the EC2 developer forum.


    AndrewC@AWS said “There is no throttling per se on EBS; however, some of the system components are shared resources. You may experience contention, which can reduce your performance from the theoretical maximum. In this particular case, your first set of writes are serviced by writing into a cache. Eventually, the cache is full and then you are bound by the throughput of the underlying disk arrays.”

    August 7, 2009 at 9:33 pm
  • peter


    Indeed. I have also seen caching on EC2 EBS. However, I do not think it is anything like the 4-disk RAID you mention; it is all shared infrastructure to start with. I’d expect there is some cache with a certain level of performance; note that the response times you’re seeing are well below the 5ms you would see from a physical spinning drive.

    This is indeed a challenge for a “cloud” environment, which is both shared and loosely specified: because you do not know how much cache you’re dealing with, you do not know whether the workload you’re running is “cached” or not, which means you can’t predict how performance will drop as the data grows.

    August 10, 2009 at 5:43 pm
  • Ijonas Kisselbach

    Your link to your previous post on “EC2/EBS single and RAID volumes IO benchmark” doesn’t seem to work.

    August 16, 2009 at 12:41 am
  • Vadim


    fixed, thanks!

    August 21, 2009 at 9:31 pm
  • Gabriela

    Right away I am going to do my breakfast, when having my breakfast coming
    again to read additional news.

    July 11, 2014 at 5:12 am
