Beringei: Facebook's Open Source, In-Memory Time Series Database (TSDB)
In December 2016, the health and performance monitoring team at Facebook open sourced our in-memory time series database: Beringei. Beringei is different from other in-memory systems, such as memcache, because it has been heavily optimized for storing time series data used specifically for health and performance monitoring. We optimized Beringei to have a very high write rate and low read latency, while being as efficient as possible in using RAM to store the time series data. In the end, we created a system that can store *all *the performance and monitoring data generated at Facebook for the most recent twenty-four hours, allowing for extremely fast exploration and debugging of systems and services as we encounter issues in production. In production, the system is able to hold billions of time series, respond to queries in microseconds, and sustain thousands of queries per second. Operating large-scale, globally distributed services requires accurate monitoring of the health and performance of our systems to identify and diagnose problems as they arise. Facebook uses a time series database (TSDB) to track and store system measurements such as product stats (*e.g.,* how many messages are being sent per minute), service stats (*e.g.,* the rate of queries hitting the cache tier vs the MySQL tier), and system stats (*e.g., *CPU, memory, and network usage), so that we can see the real-time load on our infrastructure and make decisions about how to allocate resources. Each system and service at Facebook writes hundreds to thousands of counters to our storage engine, and this data is available to query in real time by engineers looking at dashboards and doing ad-hoc performance analysis. In early 2013, our monitoring team realized that our existing HBase-backed TSDB would not scale to handle future read loads as our systems and company grew. Our average read latency was acceptable for looking at a small amount of data, but trying to visualize more data through interactive dashboards resulted in a poor experience. The 90th percentile query time had increased to multiple seconds, which is unacceptable for automated tools that may need to issue hundreds or thousands of queries to do an analysis. Medium-sized queries of a few thousand time series took tens of seconds to execute, and larger queries executing over sparse datasets would time out since the HBase data store was tuned to prioritize writes. In general, large-scale monitoring systems cannot handle large-scale analysis in real time because the query performance is too slow. After evaluating and rejecting several disk-based and existing in-memory cache solutions, we turned our attention to writing our own in-memory TSDB to power the health and performance monitoring system at Facebook. We presented “Gorilla: A Fast, Scalable, In-Memory Time Series Database (http://www.vldb.org/pvldb/vol8/p1816-teller.pdf)” at VLDB 2015. In December 2016, we open sourced the majority of that work with Beringei (https://github.com/facebookincubator/beringei). In this talk, we will start by presenting how we use Beringei to serve production monitoring workloads at Facebook, with an overview of how we use it as the basis for a disaster-ready, high performance distributed system. We will close by presenting some new performance analysis comparing Beringei to Prometheus. Prometheus is an open source TSDB whose time series compression was inspired by the Gorilla VLDB paper and has similar compression behavior.
Engineering Manager, Facebook
I am an accomplished software engineer with experience optimizing applications and system software for parallel architectures, including current multi-core and future many-core architectures. I have significant kernel development experience focused on thread scheduling, having written a unique OS thread scheduler focused on graphics workloads. I am currently researching future many-core hardware; I am focusing specifically on novel software architectures and tools. I am also interested in opportunities to write tools and software to utilize reconfigurable architectures and hardware (such as FPGAs).