From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication

MySQL & NoSQL
3 April 1:00PM - 1:50PM @ Ballroom D

Experience level: Intermediate
Duration: 50 minutes conference
Getting data into Hadoop is not difficult, but it becomes complex when you want to load 'live' or semi-live data into your Hadoop cluster from your MySQL databases. There are plenty of solutions available, from manually dumping and loading to using a tool like Sqoop, which has its good and bad sides. Neither is easy, and both are prone to lag between the moment you perform the dump and the load into Hadoop.

Replicating into Hadoop with Tungsten Replicator enables you to stream replication data from your MySQL servers straight into Hadoop. The Hadoop applier uses the leading replication service built into Tungsten Replicator and supports all of its topology and reliability features.

This session will include a look at the existing methods of loading Hadoop data, an examination of how the Hadoop replicator works, and a live demo of replicating data from MySQL into Hadoop.

+ Traditional Loading Methods
+ Sqoop: Your Data Loading Frenemy
+ Replicating from MySQL
+ How the Hadoop Replicator Works
+ Live Demo of Replication


Speakers

Senior Product Line Manager, VMware
Biography: 
A professional writer and technologist for over 20 years, MC Brown is the author of and contributor to over 26 books covering an array of topics, including programming, system management, networking, data centres and web technologies. His expertise spans myriad development languages and platforms, including Perl, Python, Java, JavaScript, C, C++, Shellscript, Windows, Solaris, Unix, HP-UX, Open Source, Linux, BeOS, Mac OS X and many more. He is a former LAMP Technologies Editor for LinuxWorld magazine and a regular contributor to ServerWatch.com, LinuxPlanet, ComputerWorld and IBM developerWorks, and has served as a Subject Matter Expert for Microsoft on Windows Server and server certification projects. He draws on a rich and varied background as founder member of a leading UK ISP, systems manager and IT consultant for an advertising agency and Internet solutions group, technical specialist for an intercontinental ISP network, and database designer and programmer. Most recently he has concentrated on building high quality user-focused information and products through his books, articles, and work with MySQL and the MySQL groups within Sun and Oracle, producing both the content and the content-delivery systems, including documentation, white papers, and marketing materials. Throughout his career he has acted as architectural advisor to a wide variety of products, focusing on user-centric functionality and use cases, and on enhancing current feature sets with an eye to future functionality and requirements, in a flexible way that ensures both ease of use and ease of development. These activities have given him a keen eye for, and experience in, Big Data, Hadoop, MySQL, NoSQL, Oracle, virtualisation, datacentres, content delivery, data migration and replication technology for heterogeneous databases.
Senior Software Engineer, Continuent Inc.
Biography: 
Linas has extensive experience in developing heterogeneous replication solutions between MySQL, Oracle and PostgreSQL. He implemented support for MySQL to Oracle/PostgreSQL/Greenplum replication and also a replication POC from PostgreSQL to other DBMS. In addition to development, he helps heterogeneous replication customers get deployed. Before joining Continuent, Linas was Head of IT at FBC "Finasta".
