Writing PostgreSQL Extensions is Fun – C Language

PostgreSQL is a powerful open source relational database management system. It extends the SQL language with additional features. A DBMS is not only defined by its performance and out of the box features, but also its ability to support bespoke/additional user-specific functionality. Some of these functionalities may be in the form of database constructs or modules, like stored procedures or functions, but their scope is generally limited to the functionality being exposed by the DBMS. For instance, how will you write a custom query-analyzing application that resides within your DBMS?

To support such options, PostgreSQL provides a pluggable architecture that allows you to install extensions. Extensions may consist of a configuration (control) file, a combination of SQL files, and dynamically loadable libraries.

This means you can write your own code as per the defined guidelines of an extension and plug it in a PostgreSQL instance without changing the actual PostgreSQL code tree. An extension by very definition extends what PostgreSQL can do, but more than that, it gives you the ability to interact with external entities. These external entities can be other database management systems like ClickHouse, Mongo or HDFs (normally these are called foreign data wrappers), or other interpreters or compilers (thus allowing us to write database functions in another language like Java, Python, Perl or TCL, etc.). Another potential use case of an extension can be for code obfuscation which allows you to protect your super secret code from prying eyes.

Build your own

To build your own extension, you don’t need a complete PostgreSQL code base. You can build and install an extension using installed PostgreSQL (it may require you to install a devel RPM or Debian package). Details about extensions can be found in PostgreSQL’s official documentation[1]. There many extensions available for different features in the contrib directory of PostgreSQL source. Other than the contrib directory, people are also writing extensions readily available on the internet but currently not part of the PostgreSQL source tree. The pg_stat_statements, PL/pgSQL, and PostGIS are examples of the best known or most widely used extensions.

Generally available PostgreSQL extensions may be classified into four main categories:

  • Add support of a new language extension (PL/pgSQL, PL/Python, and PL/Java)
  • Data type extensions where you can introduce new ((Hstore, cube, and hstore)
  • Miscellaneous extensions (contrib folder has many miscellaneous extensions)
  • Foreign Data Wrapper extension (postgres_fdw, mysqldb_fdw, clickhousedb_fdw)

There are four basic file types that are required for building an extension:

  • Makefile: Which uses PGXS PostgreSQL’s build infrastructure for extensions.
  • Control File: Carries information about the extension.
  • SQL File(s): If the extension has any SQL code, it may reside in form SQL files (optional)
  • C Code: The shared object that we want to build (optional).

Extension Makefile

To compile the C code, we need a makefile. It’s a very simple makefile with the exception of “PGXS”, which is PostgreSQL’s infrastructural makefile for creating extensions. The inclusion of “PGXS” is done by invoking pg_config binary with “–pgxs” flag. The detail of that file can be found at GitHub[2].

This is a sample makefile, which can be used to compile the C code.

Extension Control File

This file must be named as [EXTENSION NAME]. control file. The control file can contain many options which can be found at official documentation [3]. But in this example, I have used some basic options.

comments: Comments about the extension.

default_version: This is the extension SQL version. The name of the SQL file contains this information in the file name.

relocatable:  Tells PostgreSQL if it is possible to move contained objects into a different schema.

module_pathname:  This information is replaced with the actual lib file path.

The contents of the file can be seen using the psql command \dx psql.

Extension SQL File

This is a mapping file, which I used to map the PostgreSQL function with the corresponding C function. Whenever you call the SQL function then the corresponding C function is called. The name of the file must be [EXTENSION NAME]–[default-version].sql. This is the same default_version as defined in the control file.

Extension C Code

There are three kinds of functions you can write in c code.

The first is where you call your c code function using SQL function written in SQL file of the extension.

The second type of function is callbacks. You register that callback by assigning the pointer of the function. There is no need for an SQL function here. You call this function automatically when a specific event occurs.

The third type of function is called automatically, even without registering. These functions are called on events such as extension load/unload time etc.

This is the C file containing the definition of the C code. There is no restriction for the name, or of the number of C files.

You need to include postgres.h for the extension. You can include other PostgreSQL as needed. PG_MODULE_MAGIC is macro which you need to include in the C file for the extension. _PG_init and _PG_fini are functions that are called when the extension is loaded or unloaded, respectively.

Here is an example of the load and unload functions of an extension.

Here is an example of a callback function that you can invoke whenever you call a utility statement e.g. any DDL statement. The “queryString” variable contains the actual query text.

Finally, here’s an example of a C function that is invoked through a user-defined SQL function. This internally calls the C function contained in our shared object.

Compile and Install

Before compilation, you need to set the PATH for PostgreSQL’s bin directory if there is no pg_config available in the system paths.

Output

We can now use our extension through a simple SQL query. Following is the output that is coming straight out of extension written in C programming language.

I hope this example can serve as a starting point for you to create more useful extensions that will not only help you and your company, but will also give you an opportunity to share and help the PostgreSQL community grow.

The complete example can be found at Github[4].

[1]: https://www.postgresql.org/docs/current/external-extensions.html

[2]: https://github.com/postgres/postgres/blob/master/src/makefiles/pgxs.mk

[3]: https://www.postgresql.org/docs/9.1/extend-extensions.html

[4]: https://github.com/ibrarahmad/Blog-Examples/tree/master/log


Photo from Pexels

Share this post

Comments (5)

  • RhodiumToad (@RhodiumToad) Reply

    Note that _PG_fini is never actually called; there is no provision for unloading modules. (Way back in the day there used to be, but it was removed as unsafe – the code is still there in the backend but #ifdef’d out.) So it’s best not to mention it in examples.

    If a module has a _PG_init, then also bear in mind that it won’t normally be loaded until the first use of (or definition of) some function defined in the module, unless you also add it to an appropriate _preload_libraries config setting.

    April 5, 2019 at 8:51 am
    • Ibrar Ahmed Reply

      I think you are right about _PG_fini, but it is still there in the code. But _PG_init called in case of CREATE EXTENSION too.

      Example:
      postgres=# create extension log;
      2019-04-05 13:05:42.572 UTC [17456] WARNING: _PG_init
      psql: WARNING: _PG_init
      CREATE EXTENSION

      April 5, 2019 at 9:06 am
      • Ibrar Ahmed Reply

        I have not explained the _preload_libraries too. I have also not explained the complete loading concept of the module. As this is the basic blog about extesnion, so lets keep that simple.

        April 5, 2019 at 9:10 am
      • RhodiumToad (@RhodiumToad) Reply

        As I said above, defining a function that’s in some module does also load the module (in that session, only). That’s why CREATE EXTENSION ends up loading it, which calls _PG_init. If you disconnect and reconnect after doing the CREATE, you’ll find that the extension exists but that the module is not loaded until you call the function.

        April 5, 2019 at 10:53 am
        • Ibrar Ahmed Reply

          Yes, thats right, I am not disagreeing that.

          April 5, 2019 at 10:56 am

Leave a Reply