From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication
MySQL & NoSQL
3 April 1:00PM - 1:50PM @ Ballroom D
50-minute conference session
Getting data into Hadoop is not difficult, but it becomes complex when you want to load live or semi-live data from your MySQL databases into your Hadoop cluster. There are plenty of solutions available, from manually dumping and loading to using a tool like Sqoop, which has its good and bad sides. Neither approach is easy, and both suffer from lag between the moment you perform the dump and the moment the data is loaded into Hadoop. Replicating with Tungsten Replicator lets you stream replication data from your MySQL servers straight into Hadoop. The Hadoop applier is built on the replication service in Tungsten Replicator and supports all of its topology and reliability features. This session will include a look at the existing methods of loading data into Hadoop, an examination of how the Hadoop replicator works, and a live demo of replicating data from MySQL into Hadoop.
+ Traditional Loading Methods
+ Sqoop: Your Data Loading Frenemy
+ Replicating from MySQL
+ How the Hadoop Replicator Works
+ Live Demo of Replication
Director of Documentation, Continuent
Senior Software Engineer, Continuent Inc.
Linas has extensive experience in developing heterogeneous replication solutions between MySQL, Oracle and PostgreSQL. He implemented support for replication from MySQL to Oracle, PostgreSQL and Greenplum, as well as a proof of concept for replication from PostgreSQL to other DBMSs. In addition to development, he helps heterogeneous replication customers with their deployments. Before joining Continuent, Linas was Head of IT at FBC "Finasta".