LinkedIn is a global site served from multiple data centers (a.k.a. colos). Member data written at each data center is replicated globally to the other data centers. To avoid adding write latency, we chose asynchronous replication, which has led to many conflict-related problems. This talk covers why global replication is needed, how we leverage multi-colo replication for site-up using [traffic shift](https://engineering.linkedin.com/blog…), how we use Kafka for multi-colo replication, how we architected our applications and schema to minimize conflicts, and finally how we handle conflicts when they arise.
So far, LinkedIn has used [Espresso](https://engineering.linkedin.com/espr…) and Oracle as its primary data stores. The tools already developed for handling multi-colo replication are covered in this talk. MySQL usage is growing very rapidly at LinkedIn, and we are looking for a reliable open source solution for async multi-colo replication. I hope this talk stresses the need for one and encourages the open source community to come up with good solutions.
Speaker: Karthik Appigatla – LinkedIn