Building a Hadoop/MySQL Hybrid Datawarehouse

When building a datawarehouse with high enough data volumes, it can make sense to use Hadoop to grind through the bulk of the data while still giving BI users a responsive environment.

In this talk, Tim Ellis describes a real-world datawarehouse implementation that uses a hybrid approach. Data from many different sources (datamarts) is placed into HDFS and processed using Hive, a SQL-like language. The results are then placed in MySQL, where the BI team can run further high-speed, low-turnaround analytics.
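
The flow described above (land raw data, summarise it in batch, serve the summary from a relational store) can be sketched in a few lines. This is only an illustration under loud assumptions, not the implementation from the talk: the event records, the `daily_revenue` table, and all function names are hypothetical; a plain Python function stands in for the Hive batch step; and the standard-library `sqlite3` module stands in for MySQL so the sketch runs anywhere.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw events, standing in for files landed in HDFS
# from several datamarts. Dates and values are made up.
RAW_EVENTS = [
    {"source": "web", "day": "2020-01-01", "revenue": 10.0},
    {"source": "web", "day": "2020-01-01", "revenue": 20.0},
    {"source": "mobile", "day": "2020-01-01", "revenue": 5.0},
    {"source": "web", "day": "2020-01-02", "revenue": 7.5},
]


def aggregate_daily_revenue(events):
    """Stand-in for the batch step that Hive would perform, roughly:
    SELECT day, source, SUM(revenue), COUNT(*) FROM events GROUP BY day, source
    """
    totals = defaultdict(lambda: [0.0, 0])
    for event in events:
        key = (event["day"], event["source"])
        totals[key][0] += event["revenue"]
        totals[key][1] += 1
    return sorted(
        (day, source, revenue, count)
        for (day, source), (revenue, count) in totals.items()
    )


def load_summary(conn, rows):
    """Load the small summary table into the relational store
    (SQLite here as a stand-in for MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_revenue ("
        "day TEXT, source TEXT, revenue REAL, events INTEGER, "
        "PRIMARY KEY (day, source))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO daily_revenue VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


def revenue_for_day(conn, day):
    """The kind of fast interactive query a BI user would run on the summary."""
    cursor = conn.execute(
        "SELECT SUM(revenue) FROM daily_revenue WHERE day = ?", (day,)
    )
    return cursor.fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_summary(connection, aggregate_daily_revenue(RAW_EVENTS))
    print(revenue_for_day(connection, "2020-01-01"))
```

The point of the split is that the expensive scan over raw data happens once, in batch, while interactive queries only touch the much smaller summary table.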

Track: Trends in Architecture and Design
Experience level: Intermediate

Schedule info

Room: Murray Hill

2 October 11:30 - 12:20 @ Murray Hill

Speakers

Tim Ellis
CTO, PalominoDB

Tim Ellis presided over some of the larger MySQL installations between 2000 and 2011 at places such as Digg, Mozilla, Riot Games (League of Legends), and StumbleUpon. His specialty was running large-scale database installations using Opsdev methodologies.

Over the last five years, he has specialised in building hybrid database clusters, using MySQL for relational data and various distributed databases (often called "NoSQL") to store key/value or sparse hash map data.

He has always been a strong proponent of using the right tool for the job, which can sometimes be a surprising and radical notion.
