From megabytes to terabytes of data a day: a work in progress at Improve Digital
This presentation will describe the challenges encountered at Improve Digital as we migrate from a system built on, and dependent on, data aggregates to one that processes much larger volumes of ‘raw’ data.
The architectural evolution we will discuss explains the changing roles of the components of our system: MySQL, Postgres and Hadoop clusters, and a new analytic data warehouse.
At the design level, we will explore how we partitioned data and workload across the various components, and which strategies we use for data movement and high availability.
We do not have the luxury of turning off the current system until the new architecture is ready. We will therefore highlight some of the tactical changes that were made to enable the current systems to keep up with demand, and discuss some of the more challenging trade-offs we encountered.