This post was originally published in 2023 and was updated in 2025.
PostgreSQL is a natural fit for modern cloud-native environments, but running it on Kubernetes can be more complicated than it looks. From provisioning to backups, there’s a lot to get right before you can start building on top of it. That’s where the Percona Operator for PostgreSQL comes in. It removes much of the manual work by automating deployment and cluster management, making it easier to get a production-ready database up and running.
A common need, especially in CI/CD pipelines, is to bootstrap clusters with data so applications can run right away. In this post, we’ll look at how to use the Percona Operator’s bootstrap features to:

- initialize a new cluster with SQL (init SQL)
- clone an existing cluster through its pgBackRest repository
- restore a cluster from backups stored in object storage
You’ll need the Percona Operator for PostgreSQL deployed. Follow our installation instructions using whichever method you prefer.
You can find all examples from this post in this GitHub repository. A single command to deploy the operator would be:
```shell
kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/bootstrap-postgresql-k8s/00-bundle.yaml --server-side
```
Init SQL lets you create a database cluster with some initial data. Everything is created with the postgres admin user. The process works like this:

1. Create a ConfigMap that contains the SQL script.
2. Reference the ConfigMap in the databaseInitSQL section of the cluster custom resource.
3. The Operator executes the script once the cluster is up.

This is often paired with user creation.
Example ConfigMap from 01-demo-init.yaml:
The init.sql script:
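The ConfigMap itself is not reproduced above, so here is a minimal sketch of what 01-demo-init.yaml could look like. The ConfigMap name (demo-cluster-init) and key (init.sql) match the databaseInitSQL reference in the custom resource; the schema and table definitions are illustrative assumptions, not the original script:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-cluster-init
data:
  init.sql: |
    -- Executed once by the Operator, as the postgres admin user.
    \c demo-db
    CREATE SCHEMA demo;
    CREATE TABLE demo.items (
      id INT PRIMARY KEY NOT NULL,
      name TEXT
    );
```

Because the whole script runs as a single psql session, a syntax error in any statement aborts the remaining ones, which is why checking the Operator logs (shown below) matters.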
User and database creation in 02-deploy-cr.yaml:
```yaml
  users:
    - name: myuser
      databases:
        - demo-db
```
Reference the ConfigMap in the custom resource:
```yaml
  databaseInitSQL:
    key: init.sql
    name: demo-cluster-init
```
Applying the manifest would do the trick:
```shell
kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/bootstrap-postgresql-k8s/02-deploy-cr.yaml
```
To verify that the init SQL was executed, or to check for errors, look at the Operator’s logs and search for “init SQL”. For example, the following output shows that I had a syntax error in my SQL script for demo-cluster:
```shell
$ kubectl logs --tail=2000 percona-postgresql-operator-6f96ffd8d4-ddzth | grep 'init SQL'
time="2023-08-14T09:37:37Z" level=debug msg="applied init SQL" PostgresCluster=default/demo-cluster controller=postgrescluster controllerKind=PostgresCluster key=init.sql name=demo-cluster-init namespace=default reconcileID=1d0cfdcc-0464-459a-be6e-b25eb46ed2c9 stderr="psql:<stdin>:11: ERROR: syntax error at or near \"KEYS\"\nLINE 2: ID INT PRIMARY KEYS NOT NULL,\n ^\n" stdout="You are now connected to database \"demo-db\" as user \"postgres\".\nCREATE SCHEMA\nCREATE TABLE\n" version=
```
ConfigMaps can’t store more than 1 MB of data, so init SQL is best suited for small bootstraps. For larger datasets, you can bootstrap from:

- an existing cluster, through its pgBackRest repository
- backups stored in object storage
You’ll need a running cluster and a pgBackRest repo configured.
03-deploy-cr2.yaml provisions demo-cluster-2. The spec.databaseInitSQL section is removed, but spec.users remains. Add the dataSource section:
```yaml
  dataSource:
    postgresCluster:
      clusterName: demo-cluster
      repoName: repo1
```
The new cluster will be created once the manifest is applied:
```shell
$ kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/bootstrap-postgresql-k8s/03-deploy-cr2.yaml
$ kubectl get pg
NAME             ENDPOINT                               STATUS   POSTGRES   PGBOUNCER   AGE
demo-cluster     demo-cluster-pgbouncer.default.svc     ready    1          1           14m
demo-cluster-2   demo-cluster-2-pgbouncer.default.svc   ready    1          1           13m
```
demo-cluster-2 will have the same data as demo-cluster. Keep in mind that even though the data is the same, the user passwords will be different by default. You can change this behavior; see the users documentation.
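To see this for yourself, you can compare the generated passwords of the two clusters. The commands below assume the Operator’s default user-secret naming convention (&lt;cluster&gt;-pguser-&lt;user&gt;); adjust the names if your setup differs:

```shell
# Each cluster stores its user credentials in its own Secret,
# so the same user gets a different password in each cluster.
kubectl get secret demo-cluster-pguser-myuser \
  -o jsonpath='{.data.password}' | base64 --decode; echo

kubectl get secret demo-cluster-2-pguser-myuser \
  -o jsonpath='{.data.password}' | base64 --decode; echo
```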
If the original cluster isn’t running—or is in a different Kubernetes environment—you can bootstrap from backups stored in object storage. Please use our documentation to configure backups.
Example 04-deploy-cr.yaml config for Google Cloud Storage (GCS):
```yaml
  pgbackrest:
    configuration:
      - secret:
          name: demo-cluster-gcs
...
    repos:
      - name: repo1
        schedules:
          full: "0 0 * * 6"
        gcs:
          bucket: "my-demo-bucket"
```
Once backups are stored in object storage, you can delete the original cluster and reference the backups in a manifest at any time for bootstrapping. For example, in 05-deploy-cr3.yaml, the dataSource section looks like this:
```yaml
  dataSource:
    pgbackrest:
      stanza: db
      configuration:
        - secret:
            name: demo-cluster-gcs
      global:
        repo1-path: /pgbackrest/demo/repo1
      repo:
        name: repo1
        gcs:
          bucket: "my-demo-bucket"
```
The fields have the same structure and reference the same Secret resource where GCS configuration is stored.
When bootstrapping from pgBackRest, the Operator creates a restore pod. If it fails, inspect the pod’s logs:
```shell
$ kubectl get pods
NAME                                      READY   STATUS   RESTARTS   AGE
demo-cluster-3-pgbackrest-restore-74dg5   0/1     Error    0          27s

$ kubectl logs demo-cluster-3-pgbackrest-restore-74dg5
Defaulted container "pgbackrest-restore" out of: pgbackrest-restore, nss-wrapper-init (init)
+ pgbackrest restore --stanza=db --pg1-path=/pgdata/pg15 --repo=1 --delta --link-map=pg_wal=/pgdata/pg15_wal
WARN: unable to open log file '/pgdata/pgbackrest/log/db-restore.log': No such file or directory
      NOTE: process will continue without log file.
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/pgdata/pg15' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
WARN: repo1: [FileMissingError] unable to load info file '/pgbackrest/demo/repo1/backup/db/backup.info' or '/pgbackrest/demo/repo1/backup/db/backup.info.copy':
      FileMissingError: unable to open missing file '/pgbackrest/demo/repo1/backup/db/backup.info' for read
      FileMissingError: unable to open missing file '/pgbackrest/demo/repo1/backup/db/backup.info.copy' for read
      HINT: backup.info cannot be opened and is required to perform a backup.
      HINT: has a stanza-create been performed?
ERROR: [075]: no backup set found to restore
```
Check for missing files, misconfigured stanzas, or object storage issues.
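A quick sanity check before retrying is to ask pgBackRest what it can actually see in the repository. One way, assuming you can exec into a database pod of a running cluster (the pod name below is illustrative for your environment):

```shell
# List the stanza and available backup sets in the configured repo;
# an empty or missing stanza here explains the "no backup set found" error.
kubectl exec -it demo-cluster-instance1-abcd-0 -c database -- \
  pgbackrest info --stanza=db
```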
Bootstrapping PostgreSQL clusters with the Percona Operator saves significant time, whether you’re initializing with SQL, cloning from an existing cluster, or restoring from backups. It also fits neatly into CI/CD pipelines, letting teams automate provisioning, updates, and rollbacks with fewer risks and less downtime.
If you want to go further, we’ve put together a resource that shows how Percona makes PostgreSQL on Kubernetes easier. It covers high availability, observability, automation, and more, so you can run PostgreSQL at scale with confidence.