PostgreSQL Extensions for an Enterprise-Grade SystemJobin Augustine
In this current series of blog posts we have been discussing various relevant aspects when building an enterprise-grade PostgreSQL setup, such as security, back up strategy, high availability, and different methods to scale PostgreSQL. In this blog post, we’ll get to review some of the most popular open source extensions for PostgreSQL, used to expand its capabilities and address specific needs. We’ll cover some of them during a demo in our upcoming webinar on October 10.
Expanding with PostgreSQL Extensions
PostgreSQL is one of the world’s most feature-rich and advanced open source RDBMSs. Its features are not just limited to those released by the community through major/minor releases. There are hundreds of additional features developed using the extensions capabilities in PostgreSQL, which can cater to needs of specific users. Some of these extensions are very popular and useful to build an enterprise-grade PostgreSQL environment. We previously blogged about a couple of FDW extensions (mysql_fdw and postgres_fdw ) which will allow PostgreSQL databases to talk to remote homogeneous/heterogeneous databases like PostgreSQL and MySQL, MongoDB, etc. We will now cover a few other additional extensions that can expand your PostgreSQL server capabilities.
The pg_stat_statements module provides a means for tracking execution statistics of all SQL statements executed by a server. The statistics gathered by the module are made available via a view named pg_stat_statements. This extension must be installed in each of the databases you want to track, and like many of the extensions in this list, it is available in the contrib package from the PostgreSQL PGDG repository.
Tables in PostgreSQL may end up with fragmentation and bloat due to the specific MVCC implementation in PostgreSQL, or simply due to a high number of rows being naturally removed. This could lead to not only unused space being held inside the table but also to sub-optimal execution of SQL statements. pg_repack is the most popular way to address this problem by reorganizing and repacking the table. It can reorganize the table’s content without placing an exclusive lock on it during the process. DMLs and queries can continue while repacking is happening. Version 1.2 of pg_repack introduces further new features of parallel index builds, and the ability to rebuild just the indexes. Please refer to the official documentation for more details.
PostgreSQL has a basic statement logging feature. It can be implemented using the standard logging facility with log_statement = all . But this is not sufficient for many audit requirements. One of the essential features for enterprise deployments is the capability for fine-grained auditing the user interactions/statements issued to the database. This is a major compliance requirement for many security standards. The pgaudit extension caters to these requirements.
The PostgreSQL Audit Extension (pgaudit) provides detailed session and/or object audit logging via the standard PostgreSQL logging facility. Please refer to the settings section of its official documentation for more details.
This is a must-have extension for developers who work on stored functions written in PL/pgSQL. This extension is well integrated with GUI tools like pgadmin, which allows developers to step through their code and debug it. Packages for pldebugger are also available in the PGDG repository and installation is straightforward.Once it is set up, we can step though and debug the code remotely.
The official git repo is available here
This is a wonderful extension for finding out where the code is slowing down. This is very helpful, particularly during complex migrations from proprietary databases, like from Oracle to PostgreSQL, which affect application performance. This extension can prepare a report on the overall execution time and tables representation, including flamegraphs, with clear information about each line of code. This extension is not, however, available from the PGDG repo: you will need to build it from source. Details on building and installing plprofiler will be covered in a future blog post. Meanwhile, the official repository and documentation is available here
PostGIS is arguably the most versatile implementation of the specifications of the Open Geospatial Consortium. We can see a large list of features in PostGIS that are rarely available in any other RDBMSs.
There are many users who have primarily opted to use PostgreSQL because of the features supported by PostGIS. In fact, all these features are not implemented as a single extension, but are instead delivered by a collection of extensions. This makes PostGIS one of the most complex extensions to build from source. Luckily, everything is available from the PGDG repository:
$ sudo yum install postgis24_10.x86_64
Once the postgis package is installed, we are able to create the extensions on our target database:
postgres=# CREATE EXTENSION postgis;
postgres=# CREATE EXTENSION postgis_topology;
postgres=# CREATE EXTENSION postgis_sfcgal;
postgres=# CREATE EXTENSION fuzzystrmatch;
postgres=# CREATE EXTENSION postgis_tiger_geocoder;
postgres=# CREATE EXTENSION address_standardizer;
Language Extensions : PL/Python, PL/Perl, PL/V8,PL/R etc.
Another powerful feature of PostgreSQL is its programming languages support. You can code database functions/procedures in pretty much every popular language.
cstore_fdw is an open source columnar store extension for PostgreSQL. Columnar stores provide notable benefits for analytics use cases where data is loaded in batches. Cstore_fdw’s columnar nature delivers performance by only reading relevant data from disk. It may compress data by 6 to 10 times to reduce space requirements for data archive. The official repository and documentation is available here
HypoPG is an extension for adding support for hypothetical indexes – that is, without actually adding the index. This helps us to answer questions such as “how will the execution plan change if there is an index on column X?”. Installation and setup instructions are part of its official documentation
Mongo_fdw presents collections from mongodb as tables in PostgreSQL. This is a case where the NoSQL world meets the SQL world and features combine. We will be covering this extension in a future blog post. The official repository is available here
Another important FDW (foreign data wrapper) extension in the PostgreSQL world is tds_fdw. Both Microsoft SQL Server and Sybase uses TDS (Tabular Data Stream) format. This fdw allows PostgreSQL to use tables stored in remote SQL Server or Sybase database as local tables. This FDW make use of FreeTDS libraries.
As previously mentioned, there are lot of migrations underway from Oracle to PostgreSQL. Incompatible functions in PostgreSQL are often painful for those who are migrating server code. The “orafce” project implements some of the functions from the Oracle database. The functionality was verified on Oracle 10g and the module is useful for production work. Please refer to the list in its official documentation about the Oracle functions implemented in PostgreSQL
In this new world of IOT and connected devices, there is a growing need of time-series data. Timescale can convert PostgreSQL into a scalable time-series data store. The official site is available here with all relevant links.
Is loading a large volume of data into database in a very efficient and faster way a challenge for you? If so pg_bulkload may help you solve that problem. Official documentation is available here
PostgreSQL 10 introduced declarative partitions. But creating new partitions and maintaining existing ones, including purging unwanted partitions, requires a good dose of manual effort. If you are looking to automate part of this maintenance you should have a look at what pg_partman offers. The repository with documentation is available here.
PostgreSQL has feature related to logical replication built-in. Extra information is recorded in WALs which will facilitate logical decoding. wal2json is a popular output plugin for logical decoding. This can be utilized for different purposes including change data capture. In addition to wal2json, there are other output plugins: a concise list is available in the PostgreSQL wiki.
There are many more extensions that help us build an enterprise-grade PostgreSQL set up using open source solutions. Please feel free to comment and ask us if we know about one that satisfies your particular needs. Or, if there’s still time, sign up for our October webinar and ask us in person!