I will present a selection of the most notable ClickHouse features from the last six months:
- data skipping indices, including full-text indices (with a performance evaluation and insights into the implementation);
- custom compression codecs for time series data;
- HDFS and Parquet integration;
- very fast fuzzy string search and multiple substring matching;
- sampling profiler on the query level;
- z-curve indexing;
- table and column TTL.
Moscow State University, department of Mechanics and Mathematics - specialist degree, mathematician (2003-2008).
Yandex LLC https://yandex.com/company/ (2008-now):
- head of ClickHouse development team (2015-now) (https://clickhouse.yandex/);
- head of Yandex Metrica engine development team (2012-2015) (https://metrica.yandex.com/);
- senior software developer of Yandex Metrica engine (2010-2012);
- software developer of Yandex Metrica engine (2008-2010);
Since 2008, I have worked on the data processing pipeline of Yandex Metrica, a web analytics system.
Since 2015, I have been responsible for the development of ClickHouse, an open-source column-oriented database management system used for real-time analytical reports (https://clickhouse.yandex/).
I have 11 years of experience developing specialized data structures in C++.