RocksDB for the Cloud
RocksDB is used extensively by applications in the cloud. The stock RocksDB library does not provide durability of data when a machine fails, so applications typically have to implement their own mechanisms for replicating data. On the other hand, the AWS cloud environment provides services that allow elegant durability and replication of data. This talk describes how RocksDB-Cloud can leverage these cloud services to achieve data durability in the case of machine failures. We zoom into the design aspect of RocksDB, called the 'RocksDB Environment', that makes it particularly easy to port to various environments, including AWS and Google Cloud. We describe how we store RocksDB data files in AWS S3 and its write-ahead logs in AWS Kinesis. We discuss how AWS S3 and AWS Kinesis can be configured to leverage cloud availability zones. We list some of the limitations of such a setup. We provide benchmark numbers to show that, once properly configured, RocksDB-Cloud can achieve the same performance as stock RocksDB, with the additional feature of not losing data even if your RocksDB machine fails!
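As a rough illustration of the environment idea described above, the following Python sketch shows how a storage engine that only talks to an abstract environment can gain durability by swapping in a different implementation. Every name here (`Env`, `LocalEnv`, `CloudEnv`, `object_store`, `stream`) is hypothetical and is not the RocksDB or RocksDB-Cloud API; a plain dict stands in for an object store such as S3, and a plain list stands in for a durable log stream such as Kinesis.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Env(ABC):
    """Hypothetical storage environment: where an engine keeps files and log records."""

    @abstractmethod
    def write_file(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def read_file(self, name: str) -> bytes: ...

    @abstractmethod
    def append_log(self, record: bytes) -> None: ...

    @abstractmethod
    def replay_log(self) -> List[bytes]: ...


class LocalEnv(Env):
    """Keeps everything on one machine; all of it is lost if that machine fails."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}
        self.log: List[bytes] = []

    def write_file(self, name: str, data: bytes) -> None:
        self.files[name] = data

    def read_file(self, name: str) -> bytes:
        return self.files[name]

    def append_log(self, record: bytes) -> None:
        self.log.append(record)

    def replay_log(self) -> List[bytes]:
        return list(self.log)


class CloudEnv(Env):
    """Wraps a local Env and mirrors data to stand-ins for durable cloud services."""

    def __init__(self, local: Env, object_store: Dict[str, bytes], stream: List[bytes]) -> None:
        self.local = local
        self.object_store = object_store  # stand-in for an object store (assumption)
        self.stream = stream              # stand-in for a durable log stream (assumption)

    def write_file(self, name: str, data: bytes) -> None:
        self.local.write_file(name, data)
        self.object_store[name] = data    # durable copy survives machine loss

    def read_file(self, name: str) -> bytes:
        try:
            return self.local.read_file(name)
        except KeyError:
            # Local copy is missing (for example on a fresh machine): fetch and cache it.
            data = self.object_store[name]
            self.local.write_file(name, data)
            return data

    def append_log(self, record: bytes) -> None:
        self.stream.append(record)        # durable copy first, then the local copy
        self.local.append_log(record)

    def replay_log(self) -> List[bytes]:
        return list(self.stream)


# Simulate a machine failure: a new CloudEnv with an empty local disk
# still sees the data through the shared stand-in services.
store: Dict[str, bytes] = {}
stream: List[bytes] = []
first = CloudEnv(LocalEnv(), store, stream)
first.write_file("000001.sst", b"sorted-data")
first.append_log(b"put k1 v1")

replacement = CloudEnv(LocalEnv(), store, stream)
recovered_file = replacement.read_file("000001.sst")
recovered_log = replacement.replay_log()
```

The engine code that calls `write_file` and `append_log` does not change between the two environments, which is the property that makes this kind of port straightforward. The sketch ignores latency, batching, and partial failures, all of which matter in a real deployment.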
Dhruba is the CTO and co-founder of Rockset, a stealth-mode startup. Prior to this, he was an engineer on the database team at Facebook, where he was the founding engineer of the RocksDB datastore. Earlier, at Yahoo, he was the founding engineer of the Hadoop Distributed File System. He is a contributor to the open source Apache HBase project. Earlier, he held various roles at Veritas Software, founded an e-commerce startup, Oreceipt.com (http://oreceipt.com/), and contributed to the Andrew File System (AFS) at IBM-Transarc Labs. Longer version at: https://www.linkedin.com/in/dhruba