PostgreSQL streaming physical replication with slots simplifies setup and maintenance procedures. Usually, you should estimate disk usage for the Write Ahead Log (WAL), limit the number of retained segments appropriately, and set up a WAL archiving procedure. In this article, you will see how to use replication with slots and understand what problems it can solve.
PostgreSQL physical replication is based on WAL. The Write Ahead Log contains all database changes, saved in 16MB segment files. Normally, Postgres keeps only the segments needed since the last checkpoint, so with default settings just 1GB of WAL segment files is available.
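You can confirm these defaults from psql; the values shown in the comments are the stock PostgreSQL 10 defaults and may differ if your configuration has been tuned:

```shell
postgres=# SHOW wal_segment_size;  -- 16MB: the size of each WAL segment file
postgres=# SHOW max_wal_size;      -- 1GB: the soft limit that triggers a checkpoint
postgres=# SHOW min_wal_size;      -- 80MB: segments kept around for recycling
```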
Replication requires all WAL files created after the backup and up until the current time. Previously, it was necessary to keep a huge archive directory (usually mounted over NFS to all slave servers). The slots feature, introduced in 9.4, allows Postgres to track the latest segment downloaded by a slave server. Now, PostgreSQL can keep all segments on disk, even without archiving, if a slave is seriously behind its master due to downtime or networking issues. The drawback: disk space can be consumed without limit in the case of a configuration error. Before continuing, if you need a better understanding of physical replication and streaming replication, I recommend you read “Streaming Replication with PostgreSQL“.
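To see which slots exist on a server and how far each one is holding back WAL removal, you can query the pg_replication_slots view (on a fresh install this returns no rows):

```shell
postgres=# SELECT slot_name, slot_type, active, restart_lsn
postgres-# FROM pg_replication_slots;
```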
To set up replication, you need at least two PostgreSQL servers. I’m using pgc (the BigSQL command-line tool) to set up both servers on the same host. It’s easy to install on Linux, Windows, and OS X, and provides the ability to download and run any version of PostgreSQL on your staging server or even on your laptop.
```shell
python -c "$(curl -fsSL https://s3.amazonaws.com/pgcentral/install.py)"
mv bigsql master
cp -r master slave
$ cd master
master$ ./pgc install pg10
master$ ./pgc start pg10
$ cd ../slave
slave$ ./pgc install pg10
slave$ ./pgc start pg10
```
First of all, you should allow the replication user to connect:
```shell
master$ echo "host replication replicator 127.0.0.1/32 md5" >> ./data/pg10/pg_hba.conf
```
If you are running master and slave on different servers, please replace 127.0.0.1 with the slave’s address.
Next, load the shell environment file that pgc created, which sets PATH and all the other variables required for PostgreSQL:
```shell
master$ source ./pg10/pg10.env
```
Allow connections from the remote host, and create a replication user and slot on master:
```shell
master$ psql
postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replicator';
CREATE ROLE
postgres=# ALTER SYSTEM SET listen_addresses TO '*';
ALTER SYSTEM
postgres=# SELECT pg_create_physical_replication_slot('slot1');
 pg_create_physical_replication_slot
-------------------------------------
 (slot1,)
```
To apply the system variable changes and the new pg_hba.conf, restart the Postgres server:
```shell
master$ ./pgc stop ; ./pgc start
pg10 stopping
pg10 starting on port 5432
```
Create a table with lots of padding on the master:
```shell
master$ psql
psql (10.6)
Type "help" for help.

postgres=# CREATE TABLE t(id INT, pad CHAR(200));
postgres=# CREATE INDEX t_id ON t (id);
postgres=# INSERT INTO t SELECT generate_series(1,1000000) AS id, md5((random()*1000000)::text) AS pad;
```
To see the benefits of slots, we should fill the WAL with some data by running transactions. Repeat the update statement below to generate a huge amount of WAL data:
```sql
UPDATE t SET pad = md5((random()*1000000)::text);
```
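If you prefer exact byte counts over du, psql can measure the WAL generated between two points with pg_wal_lsn_diff. The LSN literal below is only an example; substitute the value that pg_current_wal_lsn() returned before your updates:

```shell
postgres=# SELECT pg_current_wal_lsn();  -- note this value before running the updates
postgres=# SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), '0/1652E90') AS bytes_generated;
```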
You can check the total size of all WAL segments from the shell or from psql:
```shell
master$ du -sh data/pg10/pg_wal
17M     data/pg10/pg_wal
master$ source ./pg10/pg10.env
master$ psql
postgres=# \! du -sh data/pg10/pg_wal
17M     data/pg10/pg_wal
```
Before configuring replication, we can fill the WAL with random data and observe that once it reaches 1.1GB, the data/pg10/pg_wal directory size stops increasing, regardless of the number of update queries.
```shell
postgres=# UPDATE t SET pad = md5((random()*1000000)::text); -- repeat 4 times
postgres=# \! du -sh data/pg10/pg_wal
1.1G    data/pg10/pg_wal
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
postgres=# \! du -sh data/pg10/pg_wal
1.1G    data/pg10/pg_wal
```
Next, let’s take a base backup on the slave, using our slot1:
```shell
slave$ source ./pg10/pg10.env
slave$ ./pgc stop pg10
slave$ rm -rf data/pg10/*
# If you are running master and slave on different servers, replace 127.0.0.1 with master's IP address.
slave$ PGPASSWORD=replicator pg_basebackup -S slot1 -h 127.0.0.1 -U replicator -p 5432 -D $PGDATA -Fp -P -Xs -Rv
```
Unfortunately, pg_basebackup hangs with: "initiating base backup, waiting for checkpoint to complete".
We can wait for the next checkpoint, or force a checkpoint on the master. A checkpoint happens every checkpoint_timeout seconds; by default, that is five minutes.
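You can verify the current value from psql:

```shell
postgres=# SHOW checkpoint_timeout;  -- 5min by default
```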
Forcing a checkpoint on the master:
```shell
master$ psql
postgres=# CHECKPOINT;
```
The backup continues on the slave side:
```shell
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/92000148 on timeline 1
pg_basebackup: starting background WAL receiver
1073986/1073986 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/927FDDE8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed
```
The backup copies settings from the master, including its TCP port value. I’m running both master and slave on the same host, so I should change the port in the slave’s postgresql.conf file:
```shell
slave$ vim data/pg10/postgresql.conf
# old value: port = 5432
port = 5433
```
Now we can return to the master and run some queries:
```shell
slave$ cd ../master
master$ source pg10/pg10.env
master$ psql
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
```
After running these queries, the WAL size is now 1.4GB, already bigger than the 1.1GB ceiling we saw earlier. Repeat the update query three more times and the WAL grows to 2.8GB:
```shell
master$ du -sh data/pg10/pg_wal
2.8G    data/pg10/pg_wal
```
Clearly, the WAL can keep growing until the whole disk is consumed.
How do we find out the reason for this?
```shell
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS gb_behind
FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/2A400630 | slot1     | 0/92000000  |      2.38
```
We have one slot that is 2.38GB behind the master.
Let’s repeat the update and check again. The gap has increased:
```shell
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS gb_behind
FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/8D400238 | slot1     | 0/92000000  |      3.93
```
Wait, though: we have already used slot1 for backup! Let’s start the slave:
```shell
master$ cd ../slave
slave$ ./pgc start pg10
```
Replication started without any additional change to recovery.conf:
```shell
slave$ cat data/pg10/recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replicator password=replicator passfile=''/home/pguser/.pgpass'' host=127.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres target_session_attrs=any'
primary_slot_name = 'slot1'
```
The pg_basebackup -R option instructs the backup to write a recovery.conf file with all the required options, including primary_slot_name.
The gap shrank within several seconds of the slave starting:
```shell
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS gb_behind
FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/8D400238 | slot1     | 0/9A000000  |      3.80
```
And a few minutes later:
```shell
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS gb_behind
FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/9E5DECE0 | slot1     | 1/9EB17250  |     -0.01
postgres=# \! du -sh data/pg10/pg_wal
1.3G    data/pg10/pg_wal
```
Let’s simulate slave server maintenance by executing ./pgc stop pg10 on the slave, then push some data onto the master again (execute the UPDATE query 4 times).
Now, “slot1” is again 2.36GB behind.
By now, you might realize that the problematic slot is no longer in use. In such cases, you can drop it to allow the retained segments to be recycled:
```shell
master$ psql
postgres=# SELECT pg_drop_replication_slot('slot1');
```
Finally, the disk space is released:
```shell
master$ du -sh data/pg10/pg_wal
1.1G    data/pg10/pg_wal
```