OpenTSDB - Time Series Schema on Schemaless NoSQL
26 April - 1:00 PM - 1:50 PM @ Room 204
50-minute conference session
This talk covers the special case of time series data and the evolution of its schemas, from RRD files to RDBMS schemas to NoSQL stores. In particular, we'll focus on why, as the volume of time series data grows and slicing the data by various dimensions becomes important, many users have eschewed RDBMS for NoSQL or custom data layers. We'll look at:
* RRDTool
* RDBMS
* Single table RDBMS
* Single table RDBMS with multi-dimensions
* Partitioning RDBMS by time
* Partitioning RDBMS by time and dimension
* Moving to Key Value stores
* Introduction to distributed hash tables (HBase, Bigtable, Cassandra 1.x)
* OpenTSDB's schema on top of these tables
* Alternative schemas on hash tables
* Pros and Cons vs an RDBMS solution
* (Time permitting) newer time series specific data stores (Druid, InfluxDB, Gorilla, others)
Sr Software Engineer, Yahoo Inc.
Developer and manager for OpenTSDB, an open source time series database.