Columnar stores like ClickHouse let you pull insights from big data in seconds, but only if you set things up correctly. This talk will walk through how to implement a data warehouse that contains 1.3 billion rows using the famous NY Yellow Cab ride data. We'll start with basic data implementation, including clustering and table definitions, then show how to load data efficiently. Next we'll discuss important features like dictionaries and materialized views, and how they improve query efficiency. We'll end by demonstrating typical queries to illustrate the kind of inferences you can draw rapidly from a well-designed data warehouse. It should be enough to get you started--the next billion rows are up to you!
Alexander Zaitsev is a co-founder and CTO of Altinity. He has been involved in software development, alongside academic research, since 1997. Alexander's interests include distributed architecture, databases and analytics. His focus is on building analytics solutions using database management systems capable of processing petabytes of data, such as Vertica and ClickHouse. Alexander has a Master's degree in mathematics and computer science from Lomonosov Moscow State University.
Robert Hodges is currently CEO of Altinity, which offers software and services for the ClickHouse data warehouse. His experience in databases includes relational database work at Sybase, developing SaaS applications on Oracle, and over 10 years working on replication and clustering products for MySQL. He was CEO of Continuent, Inc., when it was acquired by VMware in 2014. After 4 years working on multi-tenant cloud services, disaster recovery, and security, he returned to data management at Altinity in 2019.