PostgreSQL: Access ClickHouse, One of the Fastest Column DBMSs, With clickhousedb_fdw

PREVIOUS POST
NEXT POST

Database management systems are meant to house data but, occasionally, they may need to talk with another DBMS. For example, to access an external server which may be hosting a different DBMS. With heterogeneous environments becoming more and more common, a bridge between the servers is established. We call this bridge a “Foreign Data Wrapper” (FDW). PostgreSQL completed its support of SQL/MED (SQL Management of External Data) with release 9.3 in 2013. A foreign data wrapper is a shared library that is loaded by a PostgreSQL server. It enables the creation of foreign tables in PostgreSQL that act as proxies for another data source.

When you query a foreign table, Postgres passes the request to the associated foreign data wrapper. The FDW creates the connection and retrieves or updates the data in the external data store. Since PostgreSQL planner is involved in all of this process as well, it may perform certain operations like aggregate or joins on the data when retrieved from the data source. I cover some of these later in this post.

ClickHouse Database

ClickHouse is an open source column based database management system which claims to be 100–1,000x faster than traditional approaches, capable of processing of more than a billion rows in less than a second.

clickhousedb_fdw

clickhousedb_fdw is an open source project – Apache licensed – from Percona. Here’s the link for GitHub project repository:

https://github.com/Percona-Lab/clickhousedb_fdw

It is an FDW for ClickHouse that allows you to SELECT from, and INSERT INTO, a ClickHouse database from within a PostgreSQL v11 server.

The FDW supports advanced features like aggregate pushdown and joins pushdown. These significantly improve performance by utilizing the remote server’s resources for these resource intensive operations.

If you would like to follow this post and try the FDW between Postgres and ClickHouse, you can download and set up the ontime dataset for ClickHouse.  After following the instructions, the test that you have the desired data. The ClickHouse client is a client CLI for the ClickHouse Database.

Prepare Data for ClickHouse

Now the data is ready in ClickHouse, the next step is to set up PostgreSQL. We need to create a ClickHouse foreign server, user mapping, and foreign tables.

Install the clickhousedb_fdw extension

There are manual ways to install the clickhousedb_fdw, but clickhousedb_fdw uses PostgreSQL’s coolest extension install feature. By just entering a SQL command you can use the extension:

Performance Features

PostgreSQL has improved foreign data wrapper processing by added the pushdown feature. Push down improves performance significantly, as the processing of data takes place earlier in the processing chain. Push down abilities include:

  • Operator and function Pushdown
  • Predicate Pushdown
  • Aggregate Pushdown
  • Join Pushdown

Operator and function Pushdown

The function and operators send to Clickhouse instead of calculating and filtering at the PostgreSQL end.

Predicate Pushdown

Instead of filtering the data at PostgreSQL, clickhousedb_fdw send the predicate to Clikhouse Database.

Aggregate Pushdown

Aggregate push down is a new feature of PostgreSQL FDW. There are currently very few foreign data wrappers that support aggregate push down – clickhousedb_fdw is one of them. Planner decides which aggregates are pushed down and which aren’t. Here is an example for both cases.

Join Pushdown

Again, this is a new feature in PostgreSQL FDW, and our clickhousedb_fdw also supports join push down. Here’s an example of that.

Percona’s support for PostgreSQL

As part of our commitment to being unbiased champions of the open source database eco-system, Percona offers support for PostgreSQL – you can read more about that here. And as you can see, as part of our support commitment, we’re now developing our own open source PostgreSQL projects such as the clickhousedb_fdw. Subscribe to the blog to be amongst the first to know of PostgreSQL and other open source projects from Percona.

As an author of the new clickhousdb_fdw – as well as other  FDWs – I’d be really happy to hear of your use cases and your experience of using this feature.


Photo by Hidde Rensink on Unsplash

PREVIOUS POST
NEXT POST

Share this post

Comments (8)

  • Andrey Reply

    it would be great if someone made a similar extension for MySQL

    April 5, 2019 at 10:19 am
    • Steven Lee Reply

      please google Using ClickHouse like MySQL by ProxySQL

      April 21, 2019 at 8:48 pm
  • faibistes Reply

    What functions are supported in function pushdown? IIUC, it means that it translates postgresql functions and operators to clickhouse equivalents, if there is one. Am I right?

    April 12, 2019 at 7:30 am
    • Ibrar Ahmed Reply

      I tried to translate operators, but not the function. If function exists in PostgreSQL and clickhouse then it will work.

      April 12, 2019 at 1:59 pm
      • faibistes Reply

        Does it mean that it will fail if it uses a non-translatable function in a join or aggregate, or that it will do it on the postgresql side?

        April 13, 2019 at 4:53 am
        • Ibrar Ahmed Reply

          In this current state it will fail, I am working on the translation functions. It will fail like this function does not exists in Clickhouse. You need to call a function which is available in Clickhouse, otherwise it will fail.

          April 13, 2019 at 1:16 pm
  • Magnus Falch Reply

    CREATE EXTENSION clickhousedb_fdw;
    Which distros/repos will this work with currently?

    April 21, 2019 at 3:24 pm
  • Steven Lee Reply

    Hi,
    I translated it into Chinese, and want to publish on my blog, hope you can allow it.
    and I have successfully compiled, installed and tested it, it is great.
    and I wrote a compiling method in details, http://blog.rubypdf.com/2019/04/22/how-to-compile-and-install-clickhousedb_fdw/

    April 21, 2019 at 8:45 pm

Leave a Reply