Taking our cue from Peter Zaitsev’s article on MySQL Performance Bottlenecks, I’m going to talk a little about the PostgreSQL perspective of removing bottlenecks caused by high traffic.
Many stacks these days are implemented by trusting the Object Relational Mapper, ORM, to do the right thing with PostgreSQL while one creates critical business logic on the application server process side of things. For the most part, this works out quite well but over time one must revisit the database server as it scales. As reliable as PostgreSQL is, it can experience slowdowns as traffic increases.
There are many ways of addressing performance bottlenecks, but for the sake of discussion here are several ways we can look at this issue:
Before we get too deep into the weeds, a word about monitoring. Yes, contrary to some people, all those metrics, values, and observations are not pulled out of thin air. Rather a rigorous approach to metrics monitoring is essential for identifying bottlenecks and their remedies. All the best advice in the world means nothing without context so in order to facilitate the process we’ve introduced a new tool called pg_stat_monitor.
Think of it as pg_stat_statements on steroids offering more granularity and, most significantly, introduces the concept of buckets, otherwise known as moving averages
A more detailed blog is forthcoming dedicated exclusively to this new technology in the meantime, here is an abridged set of instructions for installing this extension into your database.
|
1 |
apt update<br />apt upgrade -y<br /><br />apt install -y wget gnupg2 lsb-release curl<br />wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb<br />dpkg -i percona-release_latest.generic_all.deb<br /><br />percona-release setup ppg-12<br />apt install -y percona-postgresql-12 percona-pg-stat-monitor12 |
|
1 |
# the following assumes the community version of postgres is already installed<br><br>apt install -y postgresql-server-dev-12 git make gcc<br />git clone https://github.com/percona/pg_stat_monitor<br /><br />cd pg_stat_monitor<br />make USE_PGXS=1<br />make USE_PGXS=1 install |
|
1 |
alter system set shared_preload_libraries = 'pg_stat_monitor';<br />create extension pg_stat_monitor; |
|
1 |
systemctl restart postgresql |
Complexity refers to the level of difficulty in implementing a particular solution. Whereas potential impact gives you an idea of the mitigation’s ideal impact on your system’s performance. Sometimes though, because of its age, the type of system, its technical debt, etc., accurately describing the complexity and its potential impact can be problematic. At the end of it all, given highly complex environments, it’s your judgment that makes the final call.
Categories
Complexity: Low
Potential Impact: High
There was a time not too long ago when a modern version of postgres could still run on an i386. Although the default settings have since been updated the parameters are still set to use the least amount of resources upon its installation.
These settings are the easiest to set and are normally updated when the service is first installed. Not adjusting these values can result in high CPU and IO:
So let’s talk about these variables.
The effective cache recommended value, although typical, can be set precisely by referring to “top” i.e. free+cached RAM.
Setting the shared buffers is an interesting puzzle. There are two ways of looking at this setting: assuming you have a small database one can set the shared buffers high enough that one essentially has a RAM resident database system. Otherwise, one configures loading those tables and indexes, used the most often by the service, to remain in RAM (the old 80/20 rule). Setting 1/3 system RAM used to be the recommended setting but over time it was dropped to 1/4 as machines acquired more RAM because there is such a thing as having too much assigned to shared buffers. Too little RAM means more CPU work and higher IO. You’ll know when the shared buffer setting is too high when CPU load and IO performance reaches a plateau.

Another factor to consider is the OS cache; given enough RAM, Linux will cache tables and indexes in RAM and can, depending upon various settings, trick PostgreSQL into believing that it’s reading from disk rather from RAM. Performance improves at the expense of increased redundancy by often copying the same pages found in the shared buffer into the OS cache, which is another reason to avoid too large a shared buffer cache. For the perfectionists among us, take a look at the extension pg_buffercache which reads cache usage in real-time (TIP: look at this table).
Setting the work_mem too low guarantees poor performance as sorts are processed as temporary files on disk. On the other hand, although setting it high doesn’t affect performance it does risk starving the server of RAM if too many connections are active at any one time. A chunk of RAM equal to the work mem is used for each and every sort operation. You’ll need to do a bit of arithmetic counting the instances RAM is used by each query and session. TIP: use EXPLAIN ANALYZE to see where the sort operations are carried out and by varying the value in a given session one can see when it spills to disk.
One can learn quite a bit by benchmarking the system at discreet variable settings.
Complexity: Low
Potential Impact: Low-Medium-High
High Traffic is often typified as large numbers of connections made over short intervals of time. Too many connections block processes and can delay query response and can even cause session errors. Without exception handling root cause analysis may not be easily determined without digging into the postgres logs.
An easy fix is increasing the number of connections:
|
1 |
# postgresql.conf: default is set to 100<br />max_connections |

Alternatively, a more sophisticated approach is connection pooling. There are numerous solutions but the most commonly used technology is pgbouncer. Among its many capabilities pgbouncer can regulate connection sessions using one of three modes:
The Secure Socket Layer, SSL, is another consideration. When configured to use SSL certificates, PostgreSQL default behavior encourages all connecting sessions to use SSL consequently consuming more CPU processing power than an unencrypted session. One can configure the host-based authentication rules, pg_hba.conf, forcing ordinary client sessions not to use SSL and instead reserve its use for administrative tasks, by the superuser, or with streaming replication.
Complexity: Medium
Potential Impact: Low-Medium
Multi-Version Concurrency Control is one of the foundational principles making PostgreSQL such a popular DBMS solution. However, one of the unfortunate downsides is that for every updated or deleted record a dead-tuple is created which must eventually be purged. An improperly tuned autovacuum process, which is the mechanism dealing with dead-tuples, reduces performance i.e. the busier the server the more significant the effect.
One manages the autovacuum daemon, using these three (3) parameters:
Beware over-committing RAM and CPU. The higher the value initially set the greater the amount of consumed resources risk being exhausted as system loading increases. Set too high one can experience a sudden drop in performance when loading exceeds a certain point. TIP: Similar to the allocation of RAM regarding work_mem one can either perform some arithmetic or bench-mark the environment in order to set the values optimally.
Complexity: High
Potential Impact: High
Because of the amount of effort involved, one should consider this method when the database system risks pushing the host to its physical limits, and excessive bloat is identified as an issue.
Editing the autovacuum runtime parameters in postgresql.conf is the most common method used to control its behavior for a datacluster. Unfortunately, this one size fits all approach may not work well over the long term especially as a system scales.
Table Storage Parameters: Often there will be tables whose activities represent a significant amount of the total datacluster churn. Adjusting the various autovacuum parameters on a table by table basis is an excellent way of mitigating hyper-active relations without resorting to a manual invocation of VACUUM which can be significantly impactful upon the system.
Tune individual tables using this SQL COMMAND.
|
1 |
ALTER TABLE .. SET STORAGE_PARAMETER |
Complexity: Low
Potential Impact: Medium-High
Over time, even with the best of intentions, or not, performance can degrade due to inadequate vacuuming policies causing excessive bloating that even tuning the autovacuum daemon and manually invoking VACUUM will not easily resolve. For these cases, the pg_repack extension comes to the rescue.
pg_repack: rebuild and reorganize tables and indexes under production conditions
Complexity: High
Potential Impact: Low-Medium-High
Similar to MySQL HotSpots, the PostgreSQL experience, and its resolution, of hot spots relies upon one’s extensive knowledge of the data flow and can, at its most extreme mitigation, refactor the system’s architecture.
Here are a few of the more popular mitigation techniques:

Complexity: Low
Potential Impact: High
PHP+Java+Python applications: Avoid running applications and postgres on the same host. Back in the old days, one could easily combine a web-service and RDBMS on the same machine since their usage of resources was complimentary. Beware of applications based upon these languages since they can consume large amounts of RAM, especially the garbage collection, which then competes with database systems reducing its overall performance.
Complexity: Low
Potential Impact: High
async vs sync: The most recent versions of postgres support logical and streaming replication in both synchronous and asynchronous modes. Although the default replication mode is async one must consider the implications of using sync replication especially over network connections with less than ideal latency.
Last, but not least, are the considerations of the most basic sort i.e. make the host bigger and better. Let’s review what each of the following resources brings by way of performance enhancements to PostgreSQL:


Our white paper “Why Choose PostgreSQL?” looks at the features and benefits of PostgreSQL and presents some practical usage examples. We also examine how PostgreSQL can be useful for companies looking to migrate from Oracle.
Resources
RELATED POSTS