There are several ways to upgrade PostgreSQL, and the right one depends on your environment. In this blog post, we look at upgrading a PostgreSQL database using pg_dumpall. Because this tool is also used to back up PostgreSQL clusters, it is a valid option for upgrading a cluster too. We consider the advantages and disadvantages of this approach, and show you the steps needed to achieve the upgrade.
This is the first post in our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series, where we’ll be exploring different paths to accomplish a PostgreSQL upgrade or migration. The series will culminate with a practical webinar to be aired April 17th (you can register here).
We begin this journey with the most straightforward way to carry out a PostgreSQL upgrade or migration: rebuilding the entire database from a logical backup.
Defining the scope
Let’s define what we mean by upgrading or migrating PostgreSQL using pg_dumpall.
If you need to perform a PostgreSQL upgrade within the same database server, we’d call that an in-place upgrade or just an upgrade. A procedure that moves your PostgreSQL database from one server to another, combined with an upgrade from an older version (let’s say 9.3) to a newer version of PostgreSQL (say 11.2), can be considered a migration.
There are two ways to achieve this using logical backups:
- Using pg_dumpall
- Using pg_dumpall + pg_dump + pg_restore
We’ll be discussing the first option (pg_dumpall) here, and will leave the discussion of the second option for our next post.
pg_dumpall can be used to obtain a text-format dump of the whole database cluster, including all the databases in it. This is the only method that can be used to back up globals such as users and roles in PostgreSQL.
There are, of course, advantages and disadvantages in employing this approach to upgrading PostgreSQL by rebuilding the database cluster using pg_dumpall.
Advantages of using pg_dumpall for upgrading a PostgreSQL server:
- Works well for a tiny database cluster.
- Upgrade can be completed using just a few commands.
- Removes bloat from all the tables, shrinking them to the size of the live data they actually hold.
Disadvantages of using pg_dumpall for upgrading a PostgreSQL server:
- Not the best option for huge databases (several GBs or TBs), as the dump and restore can mean more downtime.
- Cannot use parallel mode. Backup/restore can use just one process.
- Requires double the space on disk as it involves temporarily creating a copy of the database cluster for an in-place upgrade.
Let’s look at the steps involved in performing an upgrade using pg_dumpall:
- Install the new PostgreSQL binaries on the target server (which could be the same one as the source database server if it is an in-place upgrade).
For a RedHat family OS:

    # yum install postgresql11*

Or, on an Ubuntu/Debian OS:

    # apt install postgresql-11

- Stop all writes to the database server to avoid data loss or a mismatch between the old and new versions after the upgrade.
- If you are doing an upgrade within the same server, create a cluster using the new binaries on a new data directory and start it on a port other than the one the source uses. For example, if the older PostgreSQL version is running on port 5432, start the new cluster on port 5433. If you are upgrading and migrating the database to a different server, create a new cluster using the new binaries on the target server; in that case the cluster does not need to run on a port other than the default, unless that’s your preference.

    $ /usr/pgsql-11/bin/initdb -D new_data_directory
    $ echo "port = 5433" >> new_data_directory/postgresql.auto.conf
    $ /usr/pgsql-11/bin/pg_ctl -D new_data_directory start

- You might have a few extensions installed in the old PostgreSQL cluster. Get the list of all the extensions created in the source database server and install them for the new version. You can exclude those you get with the contrib module by default. To see the list of extensions created and installed in a database, you can run the following command.

    $ psql -d dbname -c "\dx"

Please make sure to check all the databases in the cluster as the extensions you see in one database may not match the list of those created in another database.
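Checking every database by hand gets tedious on a cluster with many databases. Here is a minimal sketch of a loop that does it, assuming psql can connect to the old cluster without a password prompt. The function name list_extensions is hypothetical, and connecting through the postgres database is our assumption:

```shell
# Hypothetical helper: print "database|extension|version" for every
# connectable database in the cluster listening on the given port.
list_extensions() {
    port="${1:-5432}"
    # datallowconn filters out databases that refuse connections (e.g. template0).
    psql -p "$port" -d postgres -At \
         -c "SELECT datname FROM pg_database WHERE datallowconn" |
    while IFS= read -r db; do
        # </dev/null keeps psql from consuming the list of database names.
        psql -p "$port" -d "$db" -At \
             -c "SELECT current_database(), extname, extversion FROM pg_extension ORDER BY extname" </dev/null
    done
}
```

Running `list_extensions 5432` against the old cluster prints one line per extension and database, which you can then compare with what is available for the new version.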
- Prepare a postgresql.conf file for the new cluster. Do this carefully, using the existing configuration file of the older PostgreSQL server as a reference.
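One way to see which parameters the old server actually takes from its configuration files is to query the pg_settings view. This is a minimal sketch rather than a complete procedure; the function name changed_settings is hypothetical, and we assume psql can connect without a password prompt:

```shell
# Hypothetical helper: list "name|setting" for every parameter that the
# cluster on the given port reads from a configuration file.
changed_settings() {
    port="${1:-5432}"
    psql -p "$port" -d postgres -At \
         -c "SELECT name, setting FROM pg_settings WHERE source = 'configuration file' ORDER BY name"
}
```

Saving the output of `changed_settings 5432` and `changed_settings 5433` to two files and comparing them with diff gives a quick view of what still differs. Keep in mind that some parameters were renamed or removed between major versions, so review the list instead of copying it blindly.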
- Use pg_dumpall to take a cluster backup and restore it to the new cluster.
Dump the whole cluster to a file:

    $ /usr/pgsql-11/bin/pg_dumpall > /tmp/dumpall.sql

Restore the dump file to the new cluster (assuming it is running on port 5433 of the same server):

    $ /usr/pgsql-11/bin/psql -p 5433 -f /tmp/dumpall.sql

Note that I have used the pg_dumpall from the new binaries to take the backup.
Another, easier, way is to use a pipe, which avoids the time involved in creating a dump file. Just add a hostname if you are performing an upgrade and migration.

    $ pg_dumpall -p 5432 | psql -p 5433

Or:

    $ pg_dumpall -p 5432 -h source_server | psql -p 5433 -h target_server

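Because psql keeps going after a failed statement by default, errors in a long restore can scroll past unnoticed. Here is a minimal sketch of one way to keep a record, assuming both clusters run on the same server and psql can connect without a password prompt. The function names and the log path are hypothetical, and `-d postgres` is our assumption about which database to connect to first:

```shell
# Hypothetical helper: pipe a dump of the old cluster straight into the
# new one, saving everything psql prints (including errors) to a log file.
upgrade_via_pipe() {
    src_port="$1"
    dst_port="$2"
    log_file="$3"
    pg_dumpall -p "$src_port" | psql -p "$dst_port" -d postgres >"$log_file" 2>&1
}

# Hypothetical helper: print the ERROR lines recorded in a restore log.
# "|| true" keeps the exit status at 0 when grep finds nothing.
restore_errors() {
    grep 'ERROR:' "$1" || true
}
```

For example, run `upgrade_via_pipe 5432 5433 /tmp/restore.log` and then `restore_errors /tmp/restore.log`. An error saying that the role used to initialize the new cluster already exists is typically harmless; anything else deserves a closer look.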
- Run ANALYZE to update statistics of each database on the new server.
- Stop the old cluster and restart the new database server on the same port the source was using.
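The last two steps can be sketched as follows. This is only an illustration under stated assumptions: the binary locations and the function name finish_upgrade are hypothetical, the new cluster is assumed to be running on port 5433, and we use vacuumdb and ALTER SYSTEM as one possible way to run ANALYZE everywhere and change the port:

```shell
# Hypothetical locations of the old and new binaries; adjust to your system.
OLD_BIN="${OLD_BIN:-/usr/pgsql-9.3/bin}"
NEW_BIN="${NEW_BIN:-/usr/pgsql-11/bin}"

# Hypothetical helper: analyze the new cluster, then move it to port 5432.
# Each step must succeed before the next one runs.
finish_upgrade() {
    old_data="$1"
    new_data="$2"
    # Refresh planner statistics in every database of the new cluster.
    "$NEW_BIN/vacuumdb" --all --analyze-only -p 5433 || return 1
    # Record the final port; it takes effect at the next restart.
    "$NEW_BIN/psql" -p 5433 -d postgres -c "ALTER SYSTEM SET port = 5432" || return 1
    # Stop the old cluster so that port 5432 becomes free.
    "$OLD_BIN/pg_ctl" -D "$old_data" stop -m fast || return 1
    # Restart the new cluster, which now listens on port 5432.
    "$NEW_BIN/pg_ctl" -D "$new_data" restart
}
```

A call would look like `finish_upgrade old_data_directory new_data_directory`. If your installation manages PostgreSQL through a service manager, stop and start the clusters that way instead of calling pg_ctl directly.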
Our next post in this series provides a similar way of upgrading your PostgreSQL server while at the same time providing some flexibility to carry on with changes like the ones described above. Stay tuned!