Now that Database-as-a-service (DBaaS) is in high demand, there is one question regarding AWS services that cannot always be answered easily : When should I use Aurora and when RDS MySQL?
DBaaS cloud services allow users to use databases without configuring physical hardware and infrastructure, and without installing software. I’m not sure if there is a straightforward answer, but when trying to find out which solution best fits an organization there are multiple factors that should be taken into consideration. These may be performance, high availability, operational cost, management, capacity planning, scalability, security, monitoring, etc.
There are also cases where although the workload and operational needs seem to best fit to one solution, there are other limiting factors which may be blockers (or at least need special handling).
In this blog post, I will try to provide some general rules of thumb but let’s first try to give a short description of these products.
What we should really compare is the MySQL and Aurora database engines provided by Amazon RDS.
An introduction to Amazon RDS
Amazon Relational Database Service (Amazon RDS) is a hosted database service which provides multiple database products to choose from, including Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server. We will focus on MySQL and Aurora.
With regards to systems administration, both solutions are time-saving. You get an environment ready to deploy your application and if there are no dedicated DBAs, RDS gives you great flexibility for operations like upgrades or backups. For both products, Amazon applies required updates and the latest patches without any downtime. You can define maintenance windows and automated patching (if enabled) will occur within them. Data is continuously backed up to S3 in real time, with no performance impact. This eliminates the need for backup windows and other, complex or not, scripted procedures. Although this sounds great, the risk of vendor lock-in and the challenges of enforced updates and client-side optimizations are still there.
So, Aurora or RDS MySQL?
Amazon Aurora is a relational, proprietary, closed-source database engine, with all that that implies.
RDS MySQL is 5.5, 5.6 and 5.7 compatible and offers the option to select among minor releases. While RDS MySQL supports multiple storage engines with varying capabilities, not all of them are optimized for crash recovery and data durability. Until recently, it was a limitation that Aurora was only compatible with MySQL 5.6 but it’s now compatible with both 5.6 and 5.7 too.
So, in most cases, no significant application changes are required for either product. Keep in mind that certain MySQL features like the MyISAM storage engine are not available with Amazon Aurora. Migration to RDS can be performed using Percona XtraBackup.
For RDS products shell access to the underlying operating system is disabled and access to MySQL user accounts with the “SUPER” privilege isn’t allowed. To configure MySQL variables or manage users, Amazon RDS provides specific parameter groups, APIs and other special system procedures which be used. If you need to enable remote access this article will help you do so https://www.percona.com/blog/2018/05/08/how-to-enable-amazon-rds-remote-access/
Although Amazon RDS uses SSDs to achieve better IO throughput for all its database services, Amazon claims that the Aurora is able to achieve a 5x performance boost than standard MySQL and provides reliability out of the box. In general, Aurora seems to be faster, but not always.
For example, due to the need to disable the InnoDB change buffer for Aurora (this is one of the keys for the distributed storage engine), and that updates to secondary indexes must be write through, there is a big performance penalty in workloads where heavy writes that update secondary indexes are performed. This is because of the way MySQL relies on the change buffer to defer and merge secondary index updates. If your application performs a high rate of updates against tables with secondary indexes, Aurora performance may be poor. In any case, you should always keep in mind that performance depends on schema design. Before taking the decision to migrate, performance should be evaluated against an application specific workload. Doing extensive benchmarks will be the subject of a future blog post.
Talking about underlying storage, another important thing to take into consideration is that with Aurora there is no need for capacity planning. Aurora storage will automatically grow, from the minimum of 10 GB up to 64 TiB, in 10 GB increments, with no impact on database performance. The table size limit is only constrained by the size of the Aurora cluster volume, which has a maximum of 64 tebibytes (TiB). As a result, the maximum table size for a table in an Aurora database is 64 TiB. For RDS MySQL, the maximum provisioned storage limit constrains the size of a table to a maximum size of 16 TB when using InnoDB file-per-table tablespaces.
Replication is a really powerful feature of MySQL (like) products. With Aurora, you can provision up to fifteen replicas compared to just five in RDS MySQL. All Aurora replicas share the same underlying volume with the primary instance and this means that replication can be performed in milliseconds as updates made by the primary instance are instantly available to all Aurora replicas. Failover is automatic with no data loss on Amazon Aurora whereas the replicas failover priority can be set.
An explanatory description of Amazon Aurora’s architecture can be found in Vadim’s post written a couple of years ago https://www.percona.com/blog/2015/11/16/amazon-aurora-looking-deeper/
The architecture used and the way that replication works on both products shows a really significant difference between them. Aurora is a High Availablity (HA) solution where you only need to attach a reader and this automatically becomes Multi-AZ available. Aurora replicates data to six storage nodes in Multi-AZs to withstand the loss of an entire AZ (Availability Zone) or two storage nodes without any availability impact to the client’s applications.
On the other hand, RDS MySQL allows only up to five replicas and the replication process is slower than Aurora. Failover is a manual process and may result in last-minute data loss. RDS for MySQL is not an HA solution, so you have to mark the master as Multi-AZ and attach the endpoints.
Both products can be monitored with a variety of monitoring tools. You can enable automated monitoring and you can define the log types to publish to Amazon CloudWatch. Percona Monitoring and Management (PMM) can also be used to gather metrics.
Be aware that for Aurora there is a limitation for the T2 instances such that Performance Schema can cause the host to run out of memory if enabled.
Aurora instances will cost you ~20% more than RDS MySQL. If you create Aurora read replicas then the cost of your Aurora cluster will double. Aurora is only available on certain RDS instance sizes. Instances pricing details can be found here and here.
Storage pricing may be a bit tricky. Keep in mind that pricing for Aurora differs to that for RDS MySQL. For RDS MySQL you have to select the type and size for the EBS volume, and you have to be sure that provisioned EBS IOPs can be supported by your instance type as EBS IOPs are restricted by the instance type capabilities. Unless you watch for this, you may end up having EBS IOPs that cannot be really used by your instance.
For Aurora, IOPs are only limited by the instance type. This means that if you want to increase IOPs performance on Aurora you should proceed with an instance type upgrade. In any case, Amazon will charge you based on the dataset size and the requests per second.
That said, although for Aurora you pay only for the data you really use in 10GB increments if you want high performance you have to select the correct instance. For Aurora, regardless of the instance type, you get billed $0.10 per GB-month and $0.20 per 1 million requests so if you need high performance the cost maybe even more than RDS MySQL. For RDS MySQL storage costs are based on the EBS type and size.
Percona provides support for RDS services and you might be interested in these cases studies:
- Lookout Uses Percona’s Cloud Expertise to Reduce Footprint and Maintain Uptime
- Madwire Achieves Performance Assurance for Amazon RDS Aurora Through Percona’s Database Audit and Consultancy Services
When a more fully customized solution is required, most of our customers usually prefer the use of AWS EC2 instances supported by our managed services offering.
- If you are looking for a native HA solution then you should use Aurora
- For a read-intensive workload within an HA environment, Aurora is a perfect match. Combined with ProxySQL for RDS you can get a high flexibility
- Aurora performance is great but is not as much as expected for write-intensive workloads when secondary indexes exist. In any case, you should benchmark both RDS MySQL and Aurora before taking the decision to migrate. Performance depends much on workload and schema design
- By choosing Amazon Aurora you are fully dependent on Amazon for bug fixes or upgrades
- If you need to use MySQL plugins you should use RDS MySQL
- Aurora only supports InnoDB. If you need other engines i.e. MyISAM, RDS MySQL is the only option
- With RDS MySQL you can use specific MySQL releases
- Aurora is not included in the AWS free-tier and costs a bit more than RDS MySQL. If you only need a managed solution to deploy services in a less expensive way and out of the box availability is not your main concern, RDS MySQL is what you need
- If for any reason Performance Schema must be ON, you should not enable this on Amazon Aurora MySQL T2 instances. With the Performance Schema enabled, the T2 instance may run out of memory
- For both products, you should carefully examine the known issues and limitations listed here https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.KnownIssuesAndLimitations.html and here https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.AuroraMySQL.html