How Patroni Addresses the Problem of the Logical Replication Slot Failover in a PostgreSQL Cluster

January 13, 2022
Author: Jobin Augustine

Failover of the logical replication slot has always been a pain point when using logical replication in PostgreSQL. The lack of this feature undermined the use of logical replication and acted as one of its biggest deterrents. The stakes and impact were so high that many organizations had to discard their plans around logical replication, and it affected many plans for migrating to PostgreSQL. It was painful to see that many had to opt for proprietary/vendor-specific solutions instead.

At Percona, we have written about this in the past: Missing Piece: Failover of the Logical Replication Slot. In that post, we discussed one of the possible approaches to solving this problem, but there was no reliable mechanism to copy the slot information to a physical standby and maintain it there.

The problem, in a nutshell, is this: the replication slot is always maintained on the primary node. If a switchover/failover promotes one of the standbys, the new primary has no idea about the replication slot maintained by the previous primary. This breaks logical replication to the downstream systems, and if a new slot is created instead, it is unsafe to use because the subscriber can miss changes committed before the new slot existed.

The good news is that the Patroni developers and maintainers addressed this problem starting with version 2.1.0 and provided a working solution without any invasive methods/extensions. For me, this is work that deserves a big round of applause from the Patroni community. This blog post is meant to offer that applause and to make a wider audience aware of the feature.

How to Set it Up

A ready-to-use Patroni package is available from the Percona repository. But you are free to use Patroni from any source.

Basic Configuration

In case you are excited about this and want to try it, the following steps might be helpful.

The entire discussion is about logical replication, so the minimum requirement is to have wal_level set to “logical”. If the existing Patroni configuration has wal_level set to “replica” and you want to use this feature, you can simply edit the Patroni configuration.
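As a rough sketch of the change itself: wal_level sits under postgresql.parameters in Patroni's dynamic configuration. The snippet below builds that fragment and can optionally send it to Patroni's REST API with PATCH /config, as an alternative to editing the configuration by hand. The address, the port 8008, and the helper names are assumptions for illustration and do not come from the original post. A REST API protected by authentication would need additional headers.

```python
import json
import urllib.request


def wal_level_patch(level="logical"):
    """Return the dynamic-configuration fragment that sets wal_level."""
    return {"postgresql": {"parameters": {"wal_level": level}}}


def patch_config(api_url, fragment):
    """Send a fragment to Patroni's REST API (PATCH /config).

    api_url is a hypothetical address such as "http://127.0.0.1:8008".
    """
    request = urllib.request.Request(
        api_url.rstrip("/") + "/config",
        data=json.dumps(fragment).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only print the payload; call patch_config() against a real cluster.
    print(json.dumps(wal_level_patch(), indent=2))
```

Keeping the payload builder separate from the HTTP call makes it easy to inspect the fragment before anything is sent to the cluster.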

(Image: Patroni configuration)

However, this change requires a PostgreSQL restart. Patroni indicates this by marking the node with * under “Pending restart”.

You may use Patroni’s “switchover” feature to restart the node and bring the change into effect, because the demoted node goes through a restart.

If there are any remaining nodes, they can be restarted later.

Creating Logical Slots

Now we can add a permanent logical replication slot to PostgreSQL which will be maintained by Patroni.

Edit the Patroni configuration and add a slot specification that names the slot and gives its type (logical), its database, and its decoding plugin.

The “slots:” section defines permanent replication slots. These slots will be preserved during switchover/failover. “pgoutput” is the decoding plugin for PostgreSQL logical replication.
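To make the shape of such an entry concrete, here is a minimal sketch that renders a permanent logical slot as a YAML fragment for the “slots:” section. The slot name and database used in the example call are placeholders, and the helper itself is hypothetical. It only validates the slot name against PostgreSQL's naming rule (lowercase letters, digits, and underscores).

```python
import re

# PostgreSQL slot names: lowercase letters, digits, underscore; max 63 chars.
SLOT_NAME = re.compile(r"^[a-z0-9_]{1,63}$")


def slot_spec_yaml(name, database, plugin="pgoutput"):
    """Render a permanent logical slot entry for Patroni's "slots:" section."""
    if not SLOT_NAME.match(name):
        raise ValueError("invalid replication slot name: %r" % (name,))
    return (
        "slots:\n"
        "  {name}:\n"
        "    type: logical\n"
        "    database: {database}\n"
        "    plugin: {plugin}\n"
    ).format(name=name, database=database, plugin=plugin)


if __name__ == "__main__":
    # "logical_slot_1" and "postgres" are placeholder values.
    print(slot_spec_yaml("logical_slot_1", "postgres"), end="")
```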


Once the change is applied, the logical replication slot will be created on the primary node, which can be verified by querying the pg_replication_slots view.

The following is a sample output:

(Image: patroni output)

Now here is the first level of magic! The same replication slot is created on the standbys as well. Yes, Patroni does it: it internally copies the replication slot information from the primary to all eligible standby nodes!

We can run the same query against pg_replication_slots on the standby and see similar information.

The following is an example showing the same replication slot reflecting on the standby side:

(Image: replication slot)

This slot can be used by a subscription by explicitly specifying the slot name when creating the subscription.

Alternatively, an existing subscription can be modified to use the new slot, which I generally prefer to do, with ALTER SUBSCRIPTION and its slot_name option.
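A sketch of both options follows, written as small helpers that build the SQL statements so the quoting is explicit. The subscription, publication, and slot names are hypothetical, and the statements still have to be run on the subscriber with psql or a database driver. Note that create_slot = false tells PostgreSQL to attach to the slot that already exists instead of creating a new one.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote an SQL string literal, doubling any embedded single quotes.

    Assumes standard_conforming_strings is on (the PostgreSQL default).
    """
    return "'" + value.replace("'", "''") + "'"


def create_subscription_sql(subscription, conninfo, publication, slot):
    """Build CREATE SUBSCRIPTION that attaches to an already existing slot."""
    return (
        "CREATE SUBSCRIPTION {sub} CONNECTION {conn} PUBLICATION {pub} "
        "WITH (slot_name = {slot}, create_slot = false);"
    ).format(
        sub=quote_ident(subscription),
        conn=quote_literal(conninfo),
        pub=quote_ident(publication),
        slot=quote_literal(slot),
    )


def alter_subscription_slot_sql(subscription, slot):
    """Build ALTER SUBSCRIPTION that switches an existing subscription's slot."""
    return "ALTER SUBSCRIPTION {sub} SET (slot_name = {slot});".format(
        sub=quote_ident(subscription), slot=quote_literal(slot)
    )


if __name__ == "__main__":
    # All names below are placeholders, not values from the original post.
    print(create_subscription_sql(
        "sub1", "host=10.0.0.1 dbname=postgres", "pub1", "logical_slot_1"))
    print(alter_subscription_slot_sql("sub1", "logical_slot_1"))
```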

The corresponding PostgreSQL log entries confirm the slot name change.

From the publisher side, we can confirm the slot usage by checking the active_pid and the advancing LSN of the slot.

The second level of surprise! The replication slot information on all the standby nodes of the Patroni cluster also advances as logical replication progresses on the primary side.

At a higher level, this is exactly what this feature is doing:

  1. Automatically creates/copies the replication slot information from the primary node of the Patroni cluster to all eligible standby nodes.
  2. Automatically advances the LSN on the slots of the standby nodes as the LSN advances on the corresponding slot on the primary.
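One way to observe the second point is to compare the slot's LSN on the primary with the LSN of the same slot on a standby. An LSN is printed as two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit WAL position. The sketch below converts that text form to an integer and reports the difference in bytes. The LSN values in the example are made up, and reading them from pg_replication_slots (for instance, the confirmed_flush_lsn column) is left to the caller.

```python
def lsn_to_int(lsn):
    """Convert a PostgreSQL LSN such as '0/3000148' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_lag_bytes(primary_lsn, standby_lsn):
    """Bytes by which the standby's slot trails the primary's slot."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn)


if __name__ == "__main__":
    # Hypothetical values, as they might be read on the primary and a standby.
    print(slot_lag_bytes("0/3000148", "0/3000110"))
```

A small, temporary difference is expected, since the standby's slot follows the primary's slot rather than moving in lockstep with it.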

After a Switchover/Failover

In the event of a switchover or failover, we do not lose any slot information because it is already maintained on the standby nodes.

After the switchover, the former primary rejoins the cluster as a standby, and any downstream logical replica can be repointed to the new primary.

This continues the replication, and the information in pg_replication_slots can confirm this.
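Repointing amounts to changing the subscription's connection string on the subscriber. The following sketch builds that statement together with a query for checking the slot on the new primary. The subscription name, host, and user are hypothetical, and the connection values are assumed to contain no spaces, which would need extra quoting in a libpq connection string.

```python
def repoint_subscription_sql(subscription, host, port=5432,
                             dbname="postgres", user="replicator"):
    """Build ALTER SUBSCRIPTION ... CONNECTION pointing at the new primary."""
    conninfo = "host={} port={} dbname={} user={}".format(
        host, port, dbname, user)
    return "ALTER SUBSCRIPTION {} CONNECTION {};".format(
        '"' + subscription.replace('"', '""') + '"',
        "'" + conninfo.replace("'", "''") + "'",
    )


# Query to run on the new primary: the slot should show active = true and
# an advancing confirmed_flush_lsn once the subscriber reconnects.
SLOT_CHECK_SQL = (
    "SELECT slot_name, active, active_pid, confirmed_flush_lsn "
    "FROM pg_replication_slots WHERE slot_type = 'logical';"
)


if __name__ == "__main__":
    # "sub1" and the address are placeholders for illustration only.
    print(repoint_subscription_sql("sub1", "10.0.0.2"))
    print(SLOT_CHECK_SQL)
```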

Summary + Key Points

The logical replication slot is conceptually possible only on the primary instance because that is where logical decoding happens. With this improvement, Patroni makes sure that the slot information is also available on the standbys, so that a promoted standby is ready to take over the connection from the subscriber.

  • This solution requires PostgreSQL 11 or above because it uses the pg_replication_slot_advance() function, available from PostgreSQL 11 onwards, to advance the slot.
  • The downstream connection can go through HAProxy so that it is automatically routed to the primary (not covered in this post).
  • No modification to PostgreSQL code or creation of any extension is required.
  • The slot is copied over the PostgreSQL protocol (libpq) rather than with any OS-specific tools/methods. Patroni connects with the rewind or superuser credentials and uses the pg_read_binary_file() function to read the slot information (see the Patroni source code for reference).
  • Once the logical slot is created on the replica side, Patroni uses pg_replication_slot_advance() to move the slot forward.
  • The permanent slot information is added to the DCS and is continuously maintained by the primary instance of Patroni. A new DCS key named “status” was introduced and is supported across all DCS options (ZooKeeper, etcd, Consul, etc.).
  • hot_standby_feedback must be enabled on all standby nodes where the logical replication slot needs to be maintained.
  • The Patroni parameter postgresql.use_slots must be enabled so that every standby node uses a slot on the primary node.
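The requirements listed above can be collected into a small pre-flight check. This is only a sketch under stated assumptions: the function is hypothetical, the configuration is assumed to be a dictionary shaped like Patroni's dynamic configuration (with wal_level and hot_standby_feedback under postgresql.parameters), and use_slots is required to be set explicitly rather than relying on any default.

```python
def check_prerequisites(patroni_version, pg_major, config):
    """Return a list of problems; an empty list means all checks passed.

    patroni_version is a tuple such as (2, 1, 0); config is a dict shaped
    like Patroni's dynamic configuration (an assumption of this sketch).
    """
    problems = []
    if tuple(patroni_version) < (2, 1, 0):
        problems.append("Patroni 2.1.0 or later is required")
    if pg_major < 11:
        problems.append("PostgreSQL 11 or later is required")

    postgresql = config.get("postgresql", {})
    parameters = postgresql.get("parameters", {})

    if str(parameters.get("wal_level", "")).lower() != "logical":
        problems.append("wal_level must be logical")

    feedback = str(parameters.get("hot_standby_feedback", "off")).lower()
    if feedback not in ("on", "true", "yes", "1"):
        problems.append("hot_standby_feedback must be on")

    # Require an explicit setting instead of relying on a default value.
    if postgresql.get("use_slots") is not True:
        problems.append("postgresql.use_slots must be enabled")
    return problems


if __name__ == "__main__":
    example = {
        "postgresql": {
            "use_slots": True,
            "parameters": {"wal_level": "replica",
                           "hot_standby_feedback": "on"},
        }
    }
    for problem in check_prerequisites((2, 1, 0), 13, example):
        print(problem)
```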

 
