Percona Live: Data Performance Conference 2016

April 18-21, 2016

Santa Clara, California

Espresso: LinkedIn's distributed document store on top of MySQL

 20 April 04:30 PM - 05:20 PM @ Ballroom E
50 minutes conference
Case Stories
Data as a Service


Espresso is LinkedIn’s new online, distributed, fault-tolerant NoSQL database that currently powers several LinkedIn applications. In this presentation we will discuss the motivation behind Espresso, describe its architecture, and explain how we leverage MySQL for storage and replication. Among other topics, we will explore how MySQL's binary logging and GTID were extended to provide a data feed for Kafka-based replication (within a cluster) and for a change-capture stream. Additionally, these extensions will be discussed in the context of custom application failover and how it allows for better hardware utilization and cost to serve in multi-tenant clusters.

Some custom features that will be covered:

- Binary log compression
- New GTID source_id mode for sequence number generation (SCN)
- New GTID functions for custom replication management


Davi Arnaut

Software Engineer, LinkedIn


Davi Arnaut works on leveraging MySQL to support LinkedIn's distributed data storage platform. In the past, Davi was part of the MySQL development teams at Twitter and Oracle.

Eun-Gyu Kim

Staff Software Engineer, LinkedIn


Eun-Gyu Kim is an Engineering Manager at LinkedIn. His team currently works on Espresso, a horizontally scalable NoSQL data store at LinkedIn. Please visit this page to get more information on the project:

Yun Sun

Staff Software Engineer, LinkedIn


Yun Sun works on the Espresso project, a distributed, fault-tolerant storage system with secondary index capability. Yun's main responsibilities include the secondary index, the storage node stack, and REST client development. Yun also takes part in engagement activities with Espresso users on their data models and cluster maintenance.
