Percona Blog Poll Results: What Database Engine Are You Using to Store Time Series Data?
Colin Charles
In this blog post, we talk about the results of Percona’s time series database poll “What Database Engine Are You Using to Store Time Series Data?”
Time series data is some of the most actionable data available when it comes to analyzing trends and making predictions. Simply put, time series data is data that is indexed not just by value, but by time as well – allowing you to view value changes over time as they occur. Obvious uses include the stock market, web traffic, user behavior, etc.
With the increasing number of smart devices in the Internet of Things (IoT), being able to track data over time is more and more important. With time series data, you can measure and make predictions on things like energy consumption, pH values, water consumption, data from environment-aware machines like smart cars, etc. The sensors used in IoT devices and systems generate huge amounts of time series data.
A couple of months back, we ran a poll on what time series databases were being used by the community. We wanted to quickly report on the results from that poll.
First, the results:
What database engine are you using to store time series data?
- Relational Database (MySQL, Percona Server for MySQL, MariaDB Server, PostgreSQL, etc) (35%, 657 Votes)
- ElasticSearch (16%, 293 Votes)
- InfluxDB (14%, 269 Votes)
- General Purpose NoSQL Engine (MongoDB, Percona Server for MongoDB, Cassandra, Couchbase, etc) (10%, 187 Votes)
- Prometheus (8%, 142 Votes)
- Graphite (7%, 140 Votes)
- Other (4%, 68 Votes)
- ClickHouse (3%, 55 Votes)
- OpenTSDB (2%, 46 Votes)
- DalmatinerDB (1%, 11 Votes)
- KairosDB (0%, 8 Votes)
- RiakTS (0%, 5 Votes)
Total Voters: 1,466
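A quick note on the arithmetic: the vote counts add up to 1,881, which is more than the 1,466 voters, so respondents could apparently pick more than one option. The published percentages look like each option's share of all votes cast rather than of voters. Here is a minimal sketch to check that reading; the variable names are ours, and the "share of total votes" interpretation is an assumption, not something the poll widget documents.

```python
# Vote counts and published percentages copied from the poll results above.
results = {
    "Relational Database": (657, 35),
    "ElasticSearch": (293, 16),
    "InfluxDB": (269, 14),
    "General Purpose NoSQL Engine": (187, 10),
    "Prometheus": (142, 8),
    "Graphite": (140, 7),
    "Other": (68, 4),
    "ClickHouse": (55, 3),
    "OpenTSDB": (46, 2),
    "DalmatinerDB": (11, 1),
    "KairosDB": (8, 0),
    "RiakTS": (5, 0),
}
total_voters = 1466
total_votes = sum(count for count, _ in results.values())

# Assumption: each percentage is that option's share of all votes cast,
# rounded to the nearest whole number.
computed = {
    name: round(100 * count / total_votes)
    for name, (count, _) in results.items()
}

# Options whose published percentage differs from the computed share.
mismatches = [
    name for name, (_, published) in results.items()
    if computed[name] != published
]

print(f"total votes: {total_votes}, total voters: {total_voters}")
print(f"options whose published percentage differs: {mismatches}")
```

Under that assumption every published percentage is reproduced, so read the figures as shares of votes, not of respondents.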
Here are some thoughts:
- The fact that this blog started as a place exclusively for MySQL information probably explains why the results skew so heavily toward MySQL and other relational databases. Still, that doesn’t mean the result doesn’t reflect reality.
- Elasticsearch is the next most common choice, possibly because it ties in with MySQL use.
- InfluxDB is the next most popular. This suggests that Paul Dix’s chosen business model is “AOK,” so to speak. It is unclear whether people stay on the open source version or outgrow it and switch to the commercial offering.
- We lumped together the “general purpose NoSQL engine” options, but some of them, Cassandra for example, are often targeted at time series workloads. Notice that KairosDB, which is itself built on top of Cassandra, is not as popular in our survey.
- Prometheus is the canonical “not a time series database,” but it is still used as one. I have a feeling that, as with Graphite, this usage is monitoring related.
- ClickHouse is a relatively new option for time series data, and it is surprising that it ranks this high. It was relatively unknown outside of its home country, Russia, but now we are seeing it used at places like Cloudflare and more.
Thanks for participating in the poll. We’re still running a poll on operating systems, so don’t forget to register your responses. We’ll report on that poll soon, with a new one on the way shortly.