Citus is a PostgreSQL extension developed to shard and distribute data across multiple machines. It offers features like distributed tables, reference tables, schema-based sharding, and columnar storage.

We have already covered the basics of Citus and the initial setup part in some earlier blog posts:

How To Scale a Single-Host PostgreSQL Database With Citus

Scalable Solutions with Percona Distribution for PostgreSQL (Part 2): Using Citus

In this discussion, we will focus on integrating the Citus extension in the Patroni HA model and understanding its behavior and how it works.

For demo purposes, we will use the topology below, which has one coordinator node [Coordinator] and two worker nodes [Worker1 and Worker2], along with their replicas [Worker1 Replica and Worker2 Replica], all running under Patroni.


Let’s walk through the steps required to achieve the full setup.

1) First, we need to install the Citus extension on all Patroni nodes. I was using PostgreSQL 16, so I chose the package accordingly. For all other distributions, the installation steps can be found here: https://docs.citusdata.com/en/stable/installation/multi_node.html#multi-node-citus.
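For example, on Ubuntu with the Citus package repository already configured, the PostgreSQL 16 build can be installed with something like the following (the exact package name and Citus version are assumptions, so please cross-check against the link above):

# install the Citus package built for PostgreSQL 16 (package name is an assumption)
sudo apt-get install -y postgresql-16-citus-12.1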

2) Next, we need to define the Citus configuration inside the [patroni.yaml] file for each node.

Group: We have to define a separate group for the coordinator and for each worker node (group 0 for the coordinator, groups 1 and 2 for the workers).

Database: Define the Citus database that contains the distributed and other metadata tables. The Citus extension is also automatically created in the defined database.

Coordinator:
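A minimal sketch of the Citus-related settings in the coordinator’s patroni.yaml (the scope name “citus-demo” is an assumption; group 0 always denotes the coordinator):

scope: citus-demo      # must be identical on every node of the Citus cluster
citus:
  group: 0             # group 0 is reserved for the coordinator
  database: citus      # Patroni creates this database and the Citus extension in it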

Worker1:
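The same section for Worker1, assuming it forms worker group 1:

scope: citus-demo
citus:
  group: 1             # first worker group
  database: citus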

Worker1 Replica:
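Worker1 Replica joins the same Citus group as its primary, so the section is identical:

scope: citus-demo
citus:
  group: 1             # same group as Worker1
  database: citus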

Worker2:
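Worker2 gets its own group, assumed here to be group 2:

scope: citus-demo
citus:
  group: 2             # second worker group
  database: citus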

Worker2 Replica:
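And Worker2 Replica again mirrors its primary’s group:

scope: citus-demo
citus:
  group: 2             # same group as Worker2
  database: citus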

Once we restart the Patroni service on all nodes, it will automatically create the Citus extension and take care of the remaining configuration and object creation.

To add any new database, we have to create it alongside the Citus extension on all nodes (a sketch of these steps follows the list below):

  • Create the database and the Citus extension on all coordinator and worker nodes.

  • Register the worker nodes on the coordinator.

  • If we have any secondary nodes, we need to register them as well against each worker node.

  • Finally, create and distribute the table.
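A hedged sketch of these steps, assuming a new database named “appdb”, worker hosts worker1/worker2 with replicas worker1-replica/worker2-replica, and the default port 5432:

-- 1) On the coordinator and on every worker primary:
CREATE DATABASE appdb;
\c appdb
CREATE EXTENSION citus;

-- 2) On the coordinator, register the worker primaries:
SELECT citus_add_node('worker1', 5432);
SELECT citus_add_node('worker2', 5432);

-- 3) Register the secondaries against their respective primaries:
SELECT citus_add_secondary_node('worker1-replica', 5432, 'worker1', 5432);
SELECT citus_add_secondary_node('worker2-replica', 5432, 'worker2', 5432);

-- 4) Finally, create and distribute a table:
CREATE TABLE orders (id bigint, details text);
SELECT create_distributed_table('orders', 'id');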

Alternatively, we can apply the changes dynamically on each leader node and reload/restart the services accordingly.

E.g.,
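For instance, assuming the scope “citus-demo” and a configuration file at /etc/patroni/patroni.yml (both assumptions), the dynamic configuration of a given group can be edited and the services reloaded/restarted through patronictl:

patronictl -c /etc/patroni/patroni.yml edit-config citus-demo --group 0
patronictl -c /etc/patroni/patroni.yml reload citus-demo --group 0
patronictl -c /etc/patroni/patroni.yml restart citus-demo --group 0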

Patroni takes care of integrating the coordinator/worker nodes automatically, along with the additional changes below.

  • The Citus extension will automatically be added to shared_preload_libraries.
  • Patroni will set [bootstrap.dcs.synchronous_mode] to quorum.
  • Database “citus” will also automatically be created.
  • The coordinator primary node will automatically discover worker primary nodes and add them to the pg_dist_node table using the citus_add_node() function.
  • Patroni will also update the entries in pg_dist_node whenever a failover/switchover is triggered on the coordinator or a worker.

The worker nodes will reflect the “citus” database and extension installation.
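For example, connecting to the “citus” database on any worker and running a generic catalog query should show the extension in place:

SELECT extname, extversion FROM pg_extension WHERE extname = 'citus';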

Below is the final status we can see in the “patronictl list” output. 

The users required for Patroni management and replication work in the same manner as they do in native Patroni deployments, so we need to make sure the superuser “postgres” and the replication user “replicator” are accessible from the other Patroni nodes.

E.g.,

Patroni template from [coordinator] node:
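A hedged excerpt of the relevant part of the coordinator template (usernames match the text above; passwords and pg_hba ranges are placeholders):

citus:
  group: 0
  database: citus
postgresql:
  authentication:
    superuser:
      username: postgres
      password: <postgres-password>
    replication:
      username: replicator
      password: <replicator-password>
  pg_hba:
    - host all postgres 0.0.0.0/0 md5
    - host replication replicator 0.0.0.0/0 md5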

Patroni template from one of the [worker] nodes:
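The worker template carries the same authentication section; only the citus group changes, e.g., for Worker1:

citus:
  group: 1
  database: citus
postgresql:
  authentication:
    superuser:
      username: postgres
      password: <postgres-password>
    replication:
      username: replicator
      password: <replicator-password>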

3) Verifying the node details from the “Coordinator” and “Worker” nodes.

Coordinator:
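On the coordinator, the registered nodes can be checked from the Citus metadata table pg_dist_node:

SELECT nodeid, groupid, nodename, nodeport, noderole FROM pg_dist_node ORDER BY groupid;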

Worker:
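Since Citus 11, the metadata is synced to the workers as well, so the same query should return the node list there too (version-specific behavior, worth verifying on your build):

SELECT nodeid, groupid, nodename, nodeport, noderole FROM pg_dist_node ORDER BY groupid;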

For a specific coordinator or worker group, we can list the node details using the “--group” option.
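For example, with the scope and config path assumed earlier:

patronictl -c /etc/patroni/patroni.yml list citus-demo --group 1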

Data distribution

Let’s now distribute some tables and see how they are stored across the worker nodes and their replicas.

  • Creating a new table

  • Distributing the table based on the “ID” column

  • Inserting some rows to spread the data across the worker nodes (a hedged sketch of these statements follows)
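A hedged version of these statements (table name, columns, and row count are illustrative):

-- create a new table
CREATE TABLE payments (id bigint, amount numeric, created_at timestamptz DEFAULT now());

-- distribute it on the "id" column
SELECT create_distributed_table('payments', 'id');

-- insert some rows so the data spreads across the worker shards
INSERT INTO payments (id, amount)
SELECT g, random() * 100 FROM generate_series(1, 10000) AS g;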

We can see below that the distribution works fine and the shards are evenly distributed across the worker nodes.
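Shard placement can be verified from the coordinator, for example via the citus_shards view available in recent Citus releases:

SELECT table_name, shardid, nodename, shard_size FROM citus_shards ORDER BY shardid;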

Failover

The failover/switchover process will work similarly to how it works for normal Patroni nodes. The only difference is that under Citus, we can invoke the failover/switchover with respect to any single group of worker nodes.

In the example below, I am switching the leader of Worker Group “1” from “Worker1” to “Worker1 Replica”.
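A hedged form of the command, assuming the scope and config path used earlier and a Patroni member name of “Worker1-Replica” for the replica (the --group option targets the Citus worker group):

patronictl -c /etc/patroni/patroni.yml switchover citus-demo --group 1 --candidate Worker1-Replica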

Post switchover, the status will change for Worker Group “1” nodes as below. 

Other maintenance operations, such as restarting Patroni, will work the same way they do with standard Patroni nodes.

E.g.,
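For instance, restarting only the members of worker group 1 (same assumptions as before):

patronictl -c /etc/patroni/patroni.yml restart citus-demo --group 1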

Final thought

This seamless integration of Citus and Patroni has unlocked the power of both write scalability and high availability features. Citus solves the problem of managing large datasets and individual big tables on a single machine by providing different sharding/distribution methods, which basically distribute or scale data over different worker nodes. The ability to fail over individual worker nodes and read from secondary workers is just a cherry on top.

Lastly, we don’t encourage using Citus in production unless the user has fully assessed the pros and cons. Thorough testing in a non-production environment is crucial before implementing Citus in production, as the add-on extension can have its own challenges and can affect internal Postgres metadata and behavior.

Disclaimer: This blog is purely based on my evaluation of the Citus extension and does not reflect Percona’s recommendations/suggestions. I would suggest evaluating possible support coverage before planning for any production usage.
