These scripts demonstrate the command line interface, CLI, “pg_rewind” re-provisioning a formerly deprecated PRIMARY host, pg1, into a viable STANDBY. Although the scripts have been written expressly for a custom-made set of two Linux containers using CENTOS7, it has been written in a form that is hopefully clear and concise enough that they can be of use to you and can be customized to suit your particular needs.
There are five scripts:
This script starts by destroying preexisting containers and rebuilding them from an already preexisting image that was custom-made, with postgres already installed. Although not necessary, the IP addresses for each container were hard-coded. Both containers have the postgres 12 binaries installed, with an already initialized data cluster.
|
1 |
#!/bin/bash<br>set -e<br><br>echo "=== stage 1 ==="<br><br>for u in pg1 pg2<br>do<br> set +e<br> lxc stop $u 2>/dev/null<br> lxc rm $u<br> lxc launch template-CENTOS7-pg $u<br> set -e<br>done<br><br>echo "=== stage 2 ==="<br><br>for u in pg1 pg2<br>do<br> lxc exec $u bash <<_eof_<br> hostnamectl set-hostname $u<br> rm -f /var/lib/pgsql/12/data/log/*<br>_eof_<br>done<br><br>for u in pg1 pg2<br>do<br> lxc stop $u<br>done<br><br>echo "=== stage 3 ==="<br><br>for u in pg1 pg2<br>do<br> lxc network attach lxdbr0 $u eth0<br>done<br><br>echo "=== stage 4 ==="<br><br>echo "sleeping ..." && sleep 2s<br>lxc config device set pg1 eth0 ipv4.address 10.231.38.111<br>lxc config device set pg2 eth0 ipv4.address 10.231.38.112<br><br>for u in pg1 pg2<br>do<br> lxc start $u<br>done<br><br>echo "sleeping ..." && sleep 8s<br>lxc ls -cn4 pg<br><br>echo "DONE" |
The two postgres configuration files, pg_hba.conf and postgresql.conf, are configured for remote access with verbose logging providing plenty of information and is set for replication service.
While all the runtime parameters are important, pay special attention to these three runtime parameters:
The CLI pg_receivewal is useful in a script because it can test if a slot has already been created, thus pg_basebackup simply looks for the slot and is not required to create one.
|
1 |
#!/bin/bash<br>set -e<br><br>echo "=== stage 1 ==="<br><br>lxc exec pg1 bash <<_eof_<br><br>su - postgres <<_eof1_<br><br>source /etc/profile.d/sh.local<br><br>echo "<br># TYPE DATABASE USER ADDRESS METHOD<br># "local" is for Unix domain socket connections only<br>local all all trust<br># IPv4 local connections:<br>host all all 127.0.0.1/32 md5<br>host all all 0.0.0.0/0 md5<br># IPv6 local connections:<br>host all all ::1/128 md5<br>host all all ::0/0 md5<br># Allow replication connections from localhost, by a user with the<br># replication privilege.<br>local replication all trust<br>host replication all 127.0.0.1/32 md5<br>host replication all 0.0.0.0/0 md5<br>host replication all ::1/128 md5<br>host replication all ::0/0 md5<br>" > /var/lib/pgsql/12/data/pg_hba.conf<br><br>echo "<br>archive_command='/bin/true'<br>archive_mode='on'<br>autovacuum_freeze_max_age=200000000<br>autovacuum_multixact_freeze_max_age=400000000<br>deadlock_timeout='1s'<br>hot_standby_feedback='on'<br>idle_in_transaction_session_timeout=0<br>listen_addresses='*'<br>lock_timeout=0<br>log_autovacuum_min_duration=0<br>log_checkpoints='on'<br>log_connections='on'<br>log_disconnections='on'<br>log_duration='on'<br>log_filename='postgresql-%a.log'<br>logging_collector='on'<br>log_hostname='on'<br>log_line_prefix=' app=%a user=%u error=%e line=%l vtxid=%v txid=%x tstamp=%m [%p] '<br>log_lock_waits='on'<br>log_min_duration_statement=0<br>log_replication_commands='on'<br>log_rotation_size=0<br>log_statement='ddl'<br>log_temp_files=0<br>log_truncate_on_rotation='on'<br>max_standby_archive_delay='30s'<br>max_standby_streaming_delay='30s'<br>old_snapshot_threshold=-1<br>statement_timeout=0<br>track_activity_query_size=2048<br>track_commit_timestamp='on'<br>track_functions='pl'<br>track_io_timing='on'<br>vacuum_defer_cleanup_age=0<br>wal_keep_segments=60<br>wal_log_hints='on'<br>wal_receiver_timeout='60s'<br>wal_sender_timeout='60s'<br>" > /var/lib/pgsql/12/data/postgresql.auto.conf<br>_eof1_<br>systemctl start postgresql-12<br>_eof_<br><br>echo "=== stage 2 ==="<br><br>lxc exec pg2 bash <<_eof_<br>su - postgres <<_eof1_<br> rm -rf /var/lib/pgsql/12/data/<br><br> /usr/pgsql-12/bin/pg_receivewal <br> --slot=pg2 <br> --create-slot <br> --if-not-exists <br> -d 'host=pg1 user=postgres password=postgres'<br><br> /usr/pgsql-12/bin/pg_basebackup <br> -d 'host=pg1 user=postgres dbname=postgres' <br> -D /var/lib/pgsql/12/data/ <br> -l initial_backup_pg2 <br> -c fast <br> -R -P <br> -S pg2<br><br> rm -f /var/lib/pgsql/12/data/log/*.log<br>_eof1_<br>systemctl start postgresql-12 && sleep 2s<br>_eof_<br> |
It’s more accurate to call this a switchover than a failover. Notice that a new WAL is generated on pg1 before it’s stopped.
Promoting the STANDBY, pg2, is extremely easy in postgres these days. Just log in as the superuser and execute the SQL “select pg_promote()”.
The final step to this script creates and populates a table with some records.
|
1 |
#!/bin/bash<br>set -e<br><br>echo "=== stage 1 ==="<br><br>lxc exec pg1 bash <<_eof_<br><br>psql 'host=pg1 user=postgres password=postgres' <<_eof1_<br> checkpoint;<br> select * from pg_walfile_name(pg_switch_wal());<br>_eof1_<br><br>systemctl stop postgresql-12 && sleep 2s<br>_eof_<br><br>echo "=== stage 2 ==="<br><br>lxc exec pg2 bash <<_eof_<br><br>psql 'host=pg2 user=postgres password=postgres' <<_eof1_<br> select pg_promote();<br> drop database if exists db01;<br> create database db01;<br> c 'host=pg2 user=postgres password=postgres dbname=db01'<br><br> select *,'hello world'::text as comments into t1<br> from (select * from generate_series(1,1E6))t;<br>_eof1_<br>_eof_<br><br>echo "Host pg1 is shutdown and host pg2 is promoted as a read-write node"<br>echo "DONE"<br> |
And here’s where the magic takes place.
Just before rewinding pg1 begins, a slot is explicitly added to PRIMARY host pg2. Although in this particular instance it isn’t really needed to make the replication work, since runtime parameter “wal_keep_segments”, which no longer exists in postgres version 13, is configured to retain 60 WAL segments, more than enough WALS for this demonstration.
As soon as pg_rewind completes, host pg1 needs some final tuning preparing it for STANDBY service:
|
1 |
#!/bin/bash<br>set -e<br><br>lxc exec pg1 bash <<_eof_<br>/usr/pgsql-12/bin/pg_receivewal <br> --create-slot <br> --if-not-exists <br> --slot pg1 <br> -d 'host=pg2 user=postgres password=postgres'<br><br>su - postgres <<_eof1_<br>/usr/pgsql-12/bin/pg_rewind <br> --target-pgdata /var/lib/pgsql/12/data <br> --source-server='user=postgres password=postgres host=pg2'<br><br>echo "<br>hot_standby = 'on'<br>primary_conninfo = 'user=postgres password=postgres host=pg2 port=5432'<br>primary_slot_name = 'pg1'<br>recovery_target_timeline = 'latest'<br>" >> /var/lib/pgsql/12/data/postgresql.auto.conf<br><br>touch /var/lib/pgsql/12/data/standby.signal<br>_eof1_<br>systemctl start postgresql-12 && sleep 2s<br>_eof_<br><br>echo "DONE: Host pg1 is now a viable READ-ONLY STANDBY to the PRIMARY host pg2"<br> |
Validation tests a number of conditions and includes:
|
1 |
#!/bin/bash<br>set -e<br><br>lxc exec pg1 bash <<_eof_<br><br>psql 'host=pg2 user=postgres password=postgres' <<_eof1_<br> qecho ---HOST PG2, READ-WRITE PRIMARY---<br> drop database if exists db_test;<br> create database db_test;<br> select pg_is_in_recovery from pg_is_in_recovery(),pg_sleep(3);<br> l<br>_eof1_<br><br>psql 'host=pg1 user=postgres password=postgres dbname=db01' <<_eof1_<br> qecho ---HOST PG1, READ-ONLY STANDBY---<br> select * from pg_is_in_recovery();<br> select count(*) from t1;<br> l<br>_eof1_<br>_eof_<br><br>echo "DONE"<br> |
Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.