When building a data warehouse, once your data volumes are high enough, it can make sense to use Hadoop to grind through huge amounts of data and create a responsive environment for BI users.
In this talk, Tim Ellis describes a real-world data warehouse implementation that uses a hybrid approach: data is loaded into HDFS from many different sources (data marts), processed using the SQL-like language Hive, and the results are placed in MySQL for further high-speed, low-turnaround-time analytics by the BI team.
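As a rough illustration of the kind of Hive step such a pipeline might run (the table and column names below are hypothetical, not taken from the talk):

```sql
-- Hypothetical Hive aggregation over raw event data landed in HDFS.
-- Table and column names are illustrative only.
INSERT OVERWRITE TABLE daily_page_views_summary
SELECT
  page_id,
  to_date(event_time)      AS view_date,
  COUNT(*)                 AS views,
  COUNT(DISTINCT user_id)  AS unique_viewers
FROM raw_page_views
GROUP BY page_id, to_date(event_time);
```

The much smaller summary table could then be exported into MySQL (for example with a tool such as Sqoop), so the BI team queries a fast relational store instead of scanning HDFS directly.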
Track: Trends in Architecture and Design
2 October, 11:30 - 12:20
Tim Ellis presided over some of the largest MySQL installations of the 2000-2011 era, at companies such as Digg, Mozilla, Riot Games (League of Legends), and StumbleUpon. His specialty was running large-scale database installations using Opsdev methodologies.
For the last five years, he has specialised in building hybrid database clusters, using MySQL for relational data and various distributed databases (often called "NoSQL") to store key/value or sparse hash-map data.
He has always been a strong proponent of using the right tool for the job, which can sometimes be a surprising and radical notion.