At Zalando we run PostgreSQL at scale: a few hundred database clusters ranging in size from a few megabytes to 10 terabytes of data. What is a bigger challenge than running a high-OLTP multi-terabyte PostgreSQL cluster? Migrating such a cluster from a bare-metal data center to AWS.
There were multiple problems to solve and questions to answer:
* Which instance type to choose: i3 with ephemeral volumes or m4/r4 + EBS volumes?
* Should we give Amazon Aurora a try?
* There is no direct connection from AWS to the data center. How do we build a replica on AWS and keep it in sync if VPN is not an option?
* The database is used by a few hundred employees for ad-hoc queries; ideally, they should retain access through the old connection URL.
* How do we back up such a huge database on AWS?
* We should be able to switch back to the data center if something goes wrong.
In this talk I will give a detailed account of how we solved all of these problems.