ClickHouse Performance: UInt32 vs UInt64 vs Float32 vs Float64

While implementing ClickHouse for query execution statistics storage in Percona Monitoring and Management (PMM), we were faced with the question of choosing the data type for the metrics we store. It came down to this: what is the difference in performance and space usage between UInt32, UInt64, Float32, and Float64 column types?

To test this, I created a test table with an abbreviated and simplified version of the main table in our ClickHouse schema.

The “number of queries” metric is stored four times in four different columns so that we can benchmark queries referencing different columns. We can do this with ClickHouse because it is a column store and reads only the columns referenced by the query. This method would not be appropriate for testing on MySQL, for example.
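The exact PMM schema isn’t reproduced here, but a minimal sketch along these lines illustrates the idea (the table and column names below are hypothetical, not the actual schema):

CREATE TABLE metrics
(
    period_start DateTime,  -- start of the aggregation period
    q_uint32     UInt32,    -- "number of queries", stored four times
    q_uint64     UInt64,    -- in four different types so that each
    q_float32    Float32,   -- type can be benchmarked independently
    q_float64    Float64
)
ENGINE = MergeTree()
ORDER BY period_start;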

When testing ClickHouse performance you need to consider compression. Highly compressible data (for example just a bunch of zeroes) will compress very well and may be processed a lot faster than incompressible data. To take this into account we will do a test with three different data sets:

  • Very compressible: the “number of queries” is mostly 1
  • Somewhat compressible: values range from 1 to 1,000
  • Poorly compressible: values range from 1 to 1,000,000

Since it’s unlikely that an application will use the full 32 bit range, we haven’t used it for this test.
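The loading step isn’t shown in this post, but generating such data sets is straightforward. A sketch for the somewhat compressible case, using the hypothetical schema above, might look like this:

INSERT INTO metrics
SELECT
    now() - number AS period_start,
    q AS q_uint32,
    q AS q_uint64,
    q AS q_float32,
    q AS q_float64
FROM
(
    -- the same value goes into all four columns;
    -- use rand() % 1000000 + 1 for the poorly compressible set
    SELECT number, rand() % 1000 + 1 AS q
    FROM numbers(100000000)
);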

Another factor that can impact ClickHouse performance is the number of “parts” the table has. After loading the data we ran OPTIMIZE TABLE FINAL to ensure there is only one part on disk. Note: ClickHouse will gradually delete old files after the optimize command has completed. To avoid these operations interfering with the benchmarks, I waited about 15 minutes to ensure all unused data was removed from the disk.
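For reference, that step and a quick check of the remaining part count look like this (table name hypothetical):

OPTIMIZE TABLE metrics FINAL;

SELECT count()
FROM system.parts
WHERE table = 'metrics' AND active;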

The amount of memory on the system was enough to cache whole columns in all tests, so this is an in-memory test.

Here is how the table with only one part looks on disk:

When you have only one part, it is very easy to see the space the different columns take:

We can see there are two files for every column (plus some extras), so, for example, the Float32-based “number of queries” metric store takes around 1.5GB.

You can also use SQL queries to get this data from the ClickHouse system tables instead.
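A query along these lines (table name hypothetical) reports the per-column on-disk footprint:

SELECT
    name,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed
FROM system.columns
WHERE table = 'metrics'
ORDER BY name;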

Now let’s look at the queries

We tested with two queries. One of them – we’ll call it Q1 – is a trivial query that simply takes the sum of all values in a column. This query needs to access only one column to return results, so it is likely to be the most impacted by a change of data type.
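The original query text isn’t included here; with the hypothetical schema above, Q1 is essentially:

SELECT sum(q_uint32) FROM metrics;
-- and likewise sum(q_uint64), sum(q_float32), sum(q_float64)
-- to benchmark the other column types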

The second query – which we’ll call Q2 – is a typical ranking query which computes the number of queries per period and then shows the periods with the highest number of queries.
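Again, the original query isn’t reproduced; a sketch of such a ranking query against the hypothetical schema would be:

SELECT
    toStartOfHour(period_start) AS period,
    sum(q_uint32)               AS queries
FROM metrics
GROUP BY period
ORDER BY queries DESC
LIMIT 10;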

This query needs to access two columns and do more complicated processing so we expect it to be less impacted by the change of data type.

Before we get to the results, I think it is worth drawing attention to the raw performance we’re getting. I did these tests on a DigitalOcean Droplet with just six virtual CPU cores, yet still I see numbers like these:

Processing more than 300M rows/sec per core and more than 1GB/sec per core is very cool!
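If you want to double-check such throughput numbers after the fact, and query logging is enabled (log_queries = 1), they can also be read from system.query_log, for example:

SELECT
    query_duration_ms,
    read_rows,
    formatReadableSize(read_bytes) AS bytes_read
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY event_time DESC
LIMIT 5;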

Query Performance

Results at different compression levels show similar differences between the column types, so let’s focus on those with the least compression:

Q1 least compression

Q2 least compression

As you can see, the width of the data type (32 bit vs 64 bit) matters a lot more than the type (float vs integer). In some cases float may even perform faster than integer. This was the most unexpected result for me.

Another metric ClickHouse reports is the processing speed in GB/sec. We see a different picture here:

Q1 GB per second

64 bit data types have a higher processing speed than their 32 bit counterparts, but the queries run slower as there is more raw data to process.

Compression

Let’s now take a closer look at compression. For this test we use the default LZ4 compression. ClickHouse has powerful support for Per Column Compression Codecs, but testing them is outside the scope of this post.

So let’s look at the size on disk for the UInt32 column:

On disk data size for UINT32

What you can see from these results is that when data is very compressible, ClickHouse can compress it to almost nothing. The compression ratio for our very compressible data set is about 200x (or a 99.5% size reduction if you prefer that metric).

The compression ratio for the somewhat compressible data is 1.4x. That’s not bad, but considering we are only storing the 1-1000 range in this column – which requires 10 bits out of 32 – I would hope for better compression. I guess LZ4 is not compressing such data very well.

Now let’s look at compression for a 64 bit integer column:

On disk data size for UINT64

We can see that while the size almost doubled for very compressible data, the increases for our somewhat compressible and poorly compressible data are not that large. The somewhat compressible data now compresses 2.5x.

Now let’s take a look at performance depending on data compressibility:

Q1 time for UINT32

Poorly compressible data, which takes more space on disk, is processed faster than somewhat compressible data? This did not make sense. I repeated the run a few times to make sure the results were correct. When I looked at the compression ratio, though, it suddenly made sense to me.

Poorly compressible data for the UInt32 data type was not compressible by LZ4, so it seems the original data was stored, significantly speeding up the “decompression” process. With somewhat compressible data, compression worked, so real decompression had to take place too, which makes things slower.
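One way to check this is to compare compressed and uncompressed bytes for the column; a ratio close to 1 means LZ4 effectively stored the data as-is (table and column names hypothetical):

SELECT
    name,
    round(data_uncompressed_bytes / data_compressed_bytes, 2) AS ratio
FROM system.columns
WHERE table = 'metrics' AND name = 'q_uint32';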

This is why we can only observe these results with UInt32 and Float32 data types.  UInt64 and Float64 show the more expected results:

Q1 time for UINT64

Summary

Here are my conclusions:

  • Even with “slower” data types, ClickHouse is very fast
  • Data type choice matters – but less than I expected
  • Width (32bit vs 64bit) impacts performance more than integer vs float data types
  • Storing a small range of values in a wider column type is likely to yield better compression, though with default compression it is not as good as theoretically possible
  • Compression is interesting. We get the best performance when data can be well compressed. Second best is when we do not have to spend a lot of time decompressing it, as long as it fits in memory.


Comments (3)

  • Andy

    > Poorly compressible data for the UInt32 data type was not compressible by LZ4 so it seems the original data was stored, significantly speeding up “decompression” process.

    Shouldn’t this be the case for UInt64 as well? Then how come Q1 for UInt64 doesn’t exhibit the same “poorly compressible data faster than somewhat compressible data” behavior we see for UInt32?

    February 15, 2019 at 8:48 pm
  • Peter Zaitsev

    Hi Andy,

    With UInt64 it is a different story. As the value range is only 1,000,000, only 3 bytes are needed to represent each value and the 5 leading bytes are zeroes, which gives a lot of room for compression even for a relatively simple algorithm such as LZ4.

    February 15, 2019 at 9:21 pm
  • Denis Zhuravlev

    There is a new feature: per-column codecs, e.g. CODEC(Delta(1), ZSTD)

    create table codec_test(t UInt64, a Int32) engine = MergeTree() order by (t);
    insert into codec_test select number, number/10000 from numbers(10000000);
    insert into codec_test select number, number/10000+1 from numbers(10000000);
    select sleep(1);
    optimize table codec_test final;
    select data_compressed_bytes from system.columns where table = 'codec_test' and name = 'a';
    ┌─data_compressed_bytes─┐
    │ 375722 │
    └───────────────────────┘

    create table codec_test1(t UInt64, a Int32 CODEC(Delta(1), ZSTD)) engine = MergeTree() order by (t);
    insert into codec_test1 select number, number/10000 from numbers(10000000);
    insert into codec_test1 select number, number/10000+1 from numbers(10000000);
    select sleep(1);
    optimize table codec_test1 final;
    select data_compressed_bytes from system.columns where table = 'codec_test1' and name = 'a';
    ┌─data_compressed_bytes─┐
    │ 98989 │
    └───────────────────────┘

    February 19, 2019 at 7:12 pm
