From megabytes to terabytes of data a day: a work in progress at Improve Digital

4 December 16:20 - 17:10 @ Orchard 1

This presentation will describe the challenges encountered at Improve Digital as we migrate from a system based on -- and relying on -- data aggregates to one processing much larger volumes of ‘raw’ data.
The architectural evolution we will discuss explains the changing role of the components of our system: MySQL, Postgres and Hadoop clusters, and a new analytic data warehouse.
At the design level, we will explore how we partitioned data and workload across the various components, and which strategies we use for data movement and high availability.
We do not have the luxury of turning off the current system until the new architecture is ready. We will therefore highlight some of the tactical changes that were made to enable the current systems to keep up with demand, and discuss some of the more challenging trade-offs we encountered.


Garry Turkington
VP Data Engineering, Improve Digital
Speaker Biography: 
Garry Turkington is responsible for the design and implementation of the systems at Improve Digital that store, process and extract value from the company's data. Prior to Improve Digital he worked at where he led a software development team building catalog systems. Before that, he spent 12 years in government agencies working on large-scale distributed computing. He has BSc and PhD degrees in computer science from the Queen's University of Belfast and an MEng in system engineering from Stevens Institute of Technology.