Percona Live 2017 Open Source Database Conference

April 24 - 27, 2017

Santa Clara, California

Time series collection and processing in the cloud: integrating OpenTSDB with Google Cloud Bigtable

 26 April - 11:10 AM - 12:00 PM @ Room 204
Experience level: 
Intermediate
Duration: 
50 minutes conference
Tracks:
Developer
Topics:
Other OSDB
Time Series
NoSQL
Data in the Cloud

Description

Data comes in different shapes, and one of these shapes is time series data. Time series is an important abstraction because it can describe many different processes: you can discover patterns in the behavior of your website's users, capture sensor metrics from industrial equipment, or track the movement of celestial bodies. The real power of this abstraction lies in providing a simple mechanism for different types of aggregation and analytics. It is easy to calculate minimum and maximum values over a given period of time, as well as averages, sums and other statistics.

OpenTSDB is a popular open source project that provides a unified way to ingest high volumes of time series data. OpenTSDB relies on HBase for scalable and reliable storage, but implements its own logic layer for storing and retrieving data on top of it. One of the challenges of using OpenTSDB at scale is the need to deploy and maintain a large HBase cluster.

With the public release of Google Cloud Bigtable, it is now possible to use the flexibility of the cloud for HBase-like deployments. Since HBase is based on Google's original work on Bigtable, both systems are compatible at the API level. Cloud Bigtable is a managed service and thus requires minimal maintenance effort, even for large installations.

We would like to introduce the result of a collaboration between Pythian and Google: an open source add-on to OpenTSDB that enables integration between OpenTSDB and Google Cloud Bigtable. We will demonstrate how to set up a powerful and scalable time series collection system in several minutes. We believe that this project will be attractive to anyone dealing with monitoring systems, sensor data capture or any other time series systems.

During this presentation we will cover the following topics:
Overview of time series data and OpenTSDB use cases
Google Cloud Bigtable properties and features
OpenTSDB and Bigtable integration details
Deployment approach
Future development

Speakers

Zburivsky Danil

Big Data Consultant / Solutions Architect, Pythian

Biography:

Danil Zburivsky, Big Data Consultant/Solutions Architect, Pythian. Danil has been working with databases and information systems since his early years in university, where he received a Master's Degree in Applied Math. Danil has 7 years of experience architecting, building and supporting large mission-critical data platforms using various flavours of MySQL, Hadoop and MongoDB. He is also the author of the book “Hadoop Cluster Deployment”. Besides databases, Danil is interested in functional programming, machine learning and rock climbing.

Christos Soulios

Big Data Architect, Pythian

Biography:

Christos is a principal architect at Pythian, creating and delivering Big Data platforms for some of the world's top tech organizations. With more than 15 years of experience in designing and implementing software, he has a strong interest in building scalable, high-throughput systems.
