This is the last post in our series on building an enterprise-grade PostgreSQL setup using open source tools, and we’ll be covering monitoring.
The previous posts in this series discussed aspects such as security, backup strategy, high availability, connection pooling and load balancing, extensions, and detailed logging in PostgreSQL. Tomorrow, Wednesday, October 10 at 10AM EST, we will be reviewing these topics together and showcasing them in practice in a webinar format: we hope you can join us!
The importance of monitoring the activity and health of production systems is unquestionable. When it comes to the database, with its high number of customizable settings, the ability to track its various metrics (status counters and gauges) allows for the maintenance of a historical record of its performance over time. This can be used for capacity planning, troubleshooting and validation.
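To make the distinction between counters and gauges concrete: a gauge (such as the number of active connections) can be recorded as-is, while a cumulative counter (such as `xact_commit` in `pg_stat_database`) only becomes meaningful over time once consecutive samples are turned into a rate. The following is a minimal sketch of that conversion; the function name and the sample values are hypothetical and not tied to any particular monitoring tool.

```python
def counter_rate(prev, curr):
    """Per-second rate from two (timestamp_seconds, counter_value) samples.

    Returns None when the counter went backwards, which usually means the
    statistics were reset between the two samples.
    """
    (t0, v0), (t1, v1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        return None  # counter reset: no meaningful rate for this interval
    return (v1 - v0) / (t1 - t0)


# Hypothetical samples of a cumulative commit counter taken 60 seconds apart.
commits_per_second = counter_rate((0, 1000), (60, 1600))
```

Storing the rate alongside the timestamp is what builds up the historical record discussed above.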
When it comes to capacity planning, a monitoring solution helps you assess how the current setup is faring. At the same time, it can help predict future needs based on trends, such as the increase of active connections, queries, and CPU usage. For example, an increase in CPU usage might be due to a genuine increase in workload, but it could also be a sign of unoptimized queries growing in popularity. In that case, comparing CPU usage with disk access might provide a more complete view of what is going on.
Being able to easily correlate metrics such as CPU usage and disk access helps you catch minor issues early and plan accordingly, sometimes allowing you to avoid the easier but more costly option of scaling up to mitigate such problems. Having the right monitoring solution is also invaluable when it comes to investigative work and root cause analysis. Trying to understand a problem that has already taken place is a complicated, and often unenviable, task unless you have kept a continuous, watchful eye on the setup the whole time.
Finally, a monitoring solution can help you validate changes made in the business logic in general or in the database configuration in particular. By comparing results from before and after a change, for a given metric or for overall performance, you can observe the impact of such changes in practice.
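As a rough illustration of trend-based prediction, the sketch below fits a straight line to daily peak connection counts and estimates how many days remain before a limit is reached. It assumes that growth is roughly linear, which real workloads often are not, and the data points and the limit of 100 are made up for the example.

```python
def linear_trend(samples):
    """Return (slope, intercept) of the least-squares line through (x, y) samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_limit(samples, limit):
    """Estimate days after the last sample until the fitted line reaches `limit`.

    Returns None when the trend is flat or decreasing.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    last_day = samples[-1][0]
    return max(0.0, (limit - intercept) / slope - last_day)


# Hypothetical data: (day, peak active connections observed that day).
peaks = [(0, 40), (1, 42), (2, 44), (3, 46), (4, 48)]
remaining = days_until_limit(peaks, limit=100)
```

In practice, a projection like this would be read next to other metrics before deciding to scale up, since a rising trend can come from inefficient queries as well as from real growth.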
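A simple way to express such a comparison is the percentage change in a metric's average between two observation windows. The sketch below assumes hypothetical latency samples in milliseconds collected before and after a change; a real validation would also want enough samples, and comparable workload in both windows, to rule out noise.

```python
from statistics import mean


def percent_change(before, after):
    """Percentage change of the mean of `after` relative to the mean of `before`."""
    base = mean(before)
    if base == 0:
        raise ValueError("baseline mean is zero; percentage change is undefined")
    return (mean(after) - base) / base * 100.0


# Hypothetical average query latencies (ms) sampled before and after a change.
latency_before = [120, 130, 110]
latency_after = [90, 100, 80]
change = percent_change(latency_before, latency_after)
```

A negative value here means the metric went down after the change, which is an improvement for latency but could be a regression for a metric such as throughput.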
Monitoring PostgreSQL with open source solutions
There are a number of monitoring solutions for PostgreSQL, and postgresql.org’s Wiki provides an extensive list, albeit a slightly outdated one. It divides the main monitoring solutions into two distinct categories: those that can be identified as generic solutions, which can be extended to cover different technologies through custom plugins, and those labeled as Postgres-centric, which are specific to PostgreSQL.
In the first group, we find venerated open source monitoring tools such as Munin, Zabbix, and Cacti. Nagios could also have been added to this group, but it was instead indirectly included in the “Checkers” group. That category includes monitoring scripts that can be used either in stand-alone mode or as feeders (plugins) for “Nagios like software”. Examples of these are check_pgactivity and check_postgres.
One omission from this list is Grafana, a modern time series analytics platform conceived to display metrics from a number of different data sources. Grafana includes a solution packaged as a PostgreSQL native plugin. Percona has built its Percona Monitoring and Management (PMM) platform around Grafana, using Prometheus as its data source. Since version 1.14.0, PMM supports PostgreSQL. Query Analytics (QAN) integration is coming soon.
An important factor that all these generic solutions have in common is that they are widely used for the monitoring of a diverse collection of services, like you’d normally find in enterprise-like environments. It’s common for a given company to adopt one, or sometimes two, such solutions with the aim of monitoring their entire infrastructure. This infrastructure often includes a heterogeneous combination of databases and application servers.
Nevertheless, there is a place for complementary Postgres-centric monitoring solutions in such enterprise environments too. These solutions are usually implemented with a specific goal in mind. Two examples we can mention in this context are PGObserver, which has a focus on monitoring stored procedures, and pgCluu, with its focus on auditing.
Monitoring PostgreSQL with PMM
We built an enterprise-grade PostgreSQL setup for the webinar, and we use PMM to monitor it. During our demo, we will showcase some of PMM’s main features and highlight some of the most important metrics to watch. You may want to have a look at this demo setup to get a feel for how our PostgreSQL Overview dashboard looks: