InnoDB as backend of Manhattan - Twitter's distributed database
Manhattan is a high-performance, multi-tenant, eventually consistent KV store developed at Twitter. Manhattan ships with native support for an LSM backend, but its architecture supports multiple backend solutions. For use cases where a B-tree backend provides more scalable and predictable performance, we have developed a solution that uses InnoDB as the storage engine of choice. Along the way, we learned many lessons and had to make various design choices. This talk is all about those lessons and choices: for example, where to plug into MySQL to talk to InnoDB, how to map the KV model to an InnoDB schema, how to overcome the key size limitation, and how tombstone- and TTL-based workloads play out in the InnoDB world. We'll also talk about the changes made to the MySQL/InnoDB code to support streaming operations and single-table backups. Finally, we'll touch upon the challenges we faced as we put the whole beast to the test with real-life production workloads.
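To make the key-size point concrete, here is a minimal sketch of one well-known way to map a KV model onto an InnoDB table while staying under InnoDB's index key-size limit: index a fixed-size hash of the key and store the full key beside the value, verifying it on read. This is an illustration only, not the design presented in the talk; the `kv` table, the SHA-256 choice, and the `HashedKeyTable` class are all hypothetical names, and a Python dict stands in for the actual InnoDB table so the sketch stays runnable.

```python
import hashlib

# Hypothetical InnoDB schema this sketch models (not the talk's actual design):
#
#   CREATE TABLE kv (
#     key_hash BINARY(32) NOT NULL,   -- fixed-size, always indexable
#     full_key BLOB       NOT NULL,   -- arbitrarily long user key
#     value    LONGBLOB,
#     PRIMARY KEY (key_hash)
#   ) ENGINE=InnoDB;
#
# InnoDB caps index key length (767 bytes per column with older row formats,
# 3072 bytes with large index prefixes), so a long user key cannot be the
# primary key directly; a 32-byte digest of it always can.

class HashedKeyTable:
    """Dict-backed stand-in for the hypothetical `kv` table above."""

    def __init__(self):
        self._rows = {}  # key_hash -> (full_key, value)

    @staticmethod
    def _digest(key: bytes) -> bytes:
        # Fixed 32-byte value used as the indexed column.
        return hashlib.sha256(key).digest()

    def put(self, key: bytes, value: bytes) -> None:
        self._rows[self._digest(key)] = (key, value)

    def get(self, key: bytes):
        row = self._rows.get(self._digest(key))
        if row is None:
            return None
        full_key, value = row
        # Guard against hash collisions: confirm the stored key matches
        # before trusting the row found via the digest.
        return value if full_key == key else None
```

A lookup hashes the probe key, fetches by digest, and compares the stored full key, so keys far longer than any index limit still get point-lookup behavior.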
Staff Software Engineer, Twitter
Inaam is a MySQL Internals Engineer at Twitter. Before joining Twitter, Inaam was a member of the InnoDB team at Oracle. His area of focus has been the performance and scalability of the InnoDB storage engine, working mostly on the buffer cache layer, low-level concurrency, the IO subsystem, and logging/recovery. Over the years, Inaam has had a chance to contribute to many scalability-related features in the InnoDB plugin, MySQL 5.5, and MySQL 5.6. Earlier in his career, he was involved in the development of IBM's DB2 LUW and PostgreSQL database engines. Inaam currently lives in Toronto, Canada.
Senior Software Engineer, Twitter
Unmesh Jagtap is a Senior Software Engineer on the core storage team at Twitter. He is the key developer working on the B-tree backend for Manhattan, Twitter's highly scalable key-value store. Before Twitter, Unmesh was a principal member of the parallel execution group at Oracle, where he worked on the Oracle optimizer and parallel execution layers. While at Oracle he worked on interesting problems such as dynamic query execution plans to handle optimizer estimation errors, optimizing performance for single- and multi-column null-aware anti joins, adding NUMA capabilities to the parallel execution engine, improving the scalability of the parallel execution infrastructure, adding support for heterogeneous hardware in the Oracle database, and developing a low-overhead parallel execution framework for multi-tenant databases. He enjoys working on massively parallel, scalable, and fault-tolerant distributed systems.