At Dropbox we treat data as sacred and do everything we can to protect it. We don't let a single transaction slip, not even in a disaster. Our backups must allow recovery to any point in time without losing a single transaction. This requirement brings new challenges for the backup infrastructure.
In this talk I will show you:
- How Dropbox backs up petabytes of data
- How thousands of independent backup jobs are scheduled
- How binary logs are streamed from all hosts in a cluster, processed, and merged into a single stream of events used for point-in-time recovery, so that the binary log stream is not affected by issues on any single node
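
To make the last point concrete, here is a minimal sketch of merging per-host event streams into one deduplicated stream. It is an illustration only, not Dropbox's implementation: the `Event` type, its `seq` field (a stand-in for a cluster-wide transaction identifier), and the `merge_streams` function are all hypothetical, and the sketch assumes each host yields its events in increasing `seq` order.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    seq: int       # hypothetical cluster-wide transaction sequence number
    payload: str   # stand-in for the binary log event body


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge per-host streams (each sorted by seq) into one stream.

    The same transaction may arrive from several hosts; only the first
    copy is kept, so a gap on one host is filled by any other host.
    """
    last_seq = None
    for event in heapq.merge(*streams, key=lambda e: e.seq):
        if event.seq == last_seq:
            continue  # duplicate of an event already emitted
        last_seq = event.seq
        yield event


# host_a missed transactions 4 and 5 (for example, during an outage);
# host_b started streaming late and missed transaction 1.
host_a = [Event(1, "t1"), Event(2, "t2"), Event(3, "t3"), Event(6, "t6")]
host_b = [Event(2, "t2"), Event(3, "t3"), Event(4, "t4"), Event(5, "t5"), Event(6, "t6")]

merged = list(merge_streams(host_a, host_b))
print([e.seq for e in merged])  # [1, 2, 3, 4, 5, 6]
```

Neither host has the complete history in this example, yet the merged stream does, which is the property the point above describes: the combined stream survives a failure of any single node as long as some other node streamed the missing events.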
Karoly has years of experience implementing database automation systems at large scale. He is currently a member of the Dropbox databases team, managing thousands of machines. Before Dropbox, he was responsible for the database management framework at Booking.com.