Database administrators and developers frequently need profiling tools. While it is easy to see where a SQL statement spends most of its time using EXPLAIN or EXPLAIN ANALYZE in PostgreSQL, the same approach does not work for functions. Recently, Jobin published a blog post demonstrating how plprofiler can be useful for profiling functions. plprofiler builds call graphs and creates flame graphs, which make its reports very easy to understand. There is another interesting project, plpgsql_check, which serves a similar purpose to plprofiler, while it also examines the code and points out compilation errors. Let us see all of that in action in this blog post.
You can use yum on RedHat/CentOS to install this extension from the PGDG repository. The steps to build it from source on Ubuntu/Debian are also shown in the following logs.
On RedHat/CentOS
$ sudo yum install plpgsql_check_11
On Ubuntu/Debian
$ sudo apt-get install postgresql-server-dev-11 libicu-dev gcc make
$ git clone https://github.com/okbob/plpgsql_check.git
$ cd plpgsql_check/
$ make && make install
There are three main uses of plpgsql_check: (1) checking functions for compilation errors without executing them, (2) finding the objects a function depends on, and (3) profiling functions.
When using plpgsql_check for the first two purposes, you do not need to add any entry to shared_preload_libraries. However, if you want to use it for profiling functions (3), you must add the appropriate entries to shared_preload_libraries so that it loads both plpgsql and plpgsql_check. Because of this dependency, plpgsql must appear before plpgsql_check in the shared_preload_libraries config, as in the following example:
shared_preload_libraries = 'plpgsql, plpgsql_check'
Any change to shared_preload_libraries requires a restart. If plpgsql is not listed before plpgsql_check in the shared_preload_libraries config, the server fails to start with the following error.
$ grep "shared_preload_libraries" $PGDATA/postgresql.auto.conf
shared_preload_libraries = 'pg_qualstats, pg_stat_statements, plpgsql_check'

$ pg_ctl -D /var/lib/pgsql/11/data restart -mf
waiting for server to shut down.... done
server stopped
waiting for server to start....2019-07-07 02:07:10.104 EDT [6423]
FATAL: could not load library "/usr/pgsql-11/lib/plpgsql_check.so": /usr/pgsql-11/lib/plpgsql_check.so: undefined symbol: plpgsql_parser_setup
2019-07-07 02:07:10.104 EDT [6423] LOG: database system is shut down
stopped waiting
pg_ctl: could not start server
Examine the log output.

$ grep "shared_preload_libraries" $PGDATA/postgresql.auto.conf
shared_preload_libraries = 'pg_qualstats, pg_stat_statements, plpgsql, plpgsql_check'

$ pg_ctl -D /var/lib/pgsql/11/data start
.....
server started
Once the PostgreSQL instance has started, create the extension in each database where you need to perform any of the three tasks discussed above.
$ psql -d percona -c "CREATE EXTENSION plpgsql_check"
CREATE EXTENSION
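Before looking at a larger real-world function, here is a minimal sketch of what this kind of checking catches (the table and function names here are illustrative, not from the original post): a function referencing a non-existent column is created without complaint, but plpgsql_check flags the problem immediately.

```sql
-- Hypothetical example: column "c" does not exist, yet CREATE FUNCTION succeeds
-- because PL/pgSQL does not validate embedded SQL at creation time.
CREATE TABLE t1 (a int, b int);

CREATE OR REPLACE FUNCTION buggy_fn() RETURNS int AS $$
DECLARE
  l_result int;
BEGIN
  SELECT c INTO l_result FROM t1;   -- "c" is not a column of t1
  RETURN l_result;
END;
$$ LANGUAGE plpgsql;

-- plpgsql_check reports the error without ever executing the function:
SELECT * FROM plpgsql_check_function('buggy_fn()');
```

plpgsql_check_function() returns the report as text lines; plpgsql_check_function_tb(), used in the rest of this post, returns the same information as a table.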
As discussed earlier, this extension can help developers and admins detect compilation errors. But why is that needed? Consider the following example, where we see no errors while creating the function. I have taken this example from my previous blog post about automatic index recommendations using hypopg and pg_qualstats. You may want to read that blog post to understand the logic of the following function.
percona=# CREATE OR REPLACE FUNCTION find_usable_indexes()
percona-# RETURNS VOID AS
percona-# $$
percona$# DECLARE
percona$# l_queries record;
percona$# l_querytext text;
percona$# l_idx_def text;
percona$# l_bef_exp text;
percona$# l_after_exp text;
percona$# hypo_idx record;
percona$# l_attr record;
percona$# /* l_err int; */
percona$# BEGIN
percona$# CREATE TABLE IF NOT EXISTS public.idx_recommendations (queryid bigint,
percona$# query text, current_plan jsonb, recmnded_index text, hypo_plan jsonb);
percona$# FOR l_queries IN
percona$# SELECT t.relid, t.relname, t.queryid, t.attnames, t.attnums,
percona$# pg_qualstats_example_query(t.queryid) as query
percona$# FROM
percona$# (
percona$# SELECT qs.relid::regclass AS relname, qs.relid AS relid, qs.queryid,
percona$# string_agg(DISTINCT attnames.attnames,',') AS attnames, qs.attnums
percona$# FROM pg_qualstats_all qs
percona$# JOIN pg_qualstats q ON q.queryid = qs.queryid
percona$# JOIN pg_stat_statements ps ON q.queryid = ps.queryid
percona$# JOIN pg_amop amop ON amop.amopopr = qs.opno
percona$# JOIN pg_am ON amop.amopmethod = pg_am.oid,
percona$# LATERAL
percona$# (
percona$# SELECT pg_attribute.attname AS attnames
percona$# FROM pg_attribute
percona$# JOIN unnest(qs.attnums) a(a) ON a.a = pg_attribute.attnum
percona$# AND pg_attribute.attrelid = qs.relid
percona$# ORDER BY pg_attribute.attnum) attnames,
percona$# LATERAL unnest(qs.attnums) attnum(attnum)
percona$# WHERE NOT
percona$# (
percona$# EXISTS
percona$# (
percona$# SELECT 1
percona$# FROM pg_index i
percona$# WHERE i.indrelid = qs.relid AND
percona$# (arraycontains((i.indkey::integer[])[0:array_length(qs.attnums, 1) - 1],
percona$# qs.attnums::integer[]) OR arraycontains(qs.attnums::integer[],
percona$# (i.indkey::integer[])[0:array_length(i.indkey, 1) + 1]) AND i.indisunique)))
percona$# GROUP BY qs.relid, qs.queryid, qs.qualnodeid, qs.attnums) t
percona$# GROUP BY t.relid, t.relname, t.queryid, t.attnames, t.attnums
percona$# LOOP
percona$# /* RAISE NOTICE '% : is queryid',l_queries.queryid; */
percona$# execute 'explain (FORMAT JSON) '||l_queries.query INTO l_bef_exp;
percona$# execute 'select hypopg_reset()';
percona$# execute 'SELECT indexrelid,indexname FROM hypopg_create_index(''CREATE INDEX on '||l_queries.relname||'('||l_queries.attnames||')'')' INTO hypo_idx;
percona$# execute 'explain (FORMAT JSON) '||l_queries.query INTO l_after_exp;
percona$# execute 'select hypopg_get_indexdef('||hypo_idx.indexrelid||')' INTO l_idx_def;
percona$# INSERT INTO public.idx_recommendations (queryid,query,current_plan,recmnded_index,hypo_plan)
percona$# VALUES (l_queries.queryid,l_querytext,l_bef_exp::jsonb,l_idx_def,l_after_exp::jsonb);
percona$# END LOOP;
percona$# execute 'select hypopg_reset()';
percona$# END;
percona$# $$ LANGUAGE plpgsql;
CREATE FUNCTION
As the above log shows, the function was created with no errors; yet until we actually call it, we do not know whether it contains compilation errors. With this extension, we can use the plpgsql_check_function_tb() function to find such errors without ever executing the function.
percona=# SELECT functionid, lineno, statement, sqlstate, message, detail, hint, level, position,
left (query,60),context FROM plpgsql_check_function_tb('find_usable_indexes()');
-[ RECORD 1 ]------------------------------------------------------------
functionid | find_usable_indexes
lineno     | 14
statement  | FOR over SELECT rows
sqlstate   | 42P01
message    | relation "pg_qualstats_all" does not exist
detail     |
hint       |
level      | error
position   | 306
left       | SELECT t.relid, t.relname, t.queryid, t.attnames, t.attnums,
context    |
From the above log, it is clear that a relation used in the function does not exist. However, if the function uses dynamic SQL that is assembled at runtime, false positives are possible, as you can see in the following example. So you may want to avoid checking functions that use dynamic SQL, or comment out the code containing those statements before checking for other compilation errors.
percona=# select * from plpgsql_check_function_tb('find_usable_indexes()');
-[ RECORD 1 ]------------------------------------------------------------------------------
functionid | find_usable_indexes
lineno     | 50
statement  | EXECUTE
sqlstate   | 00000
message    | cannot determinate a result of dynamic SQL
detail     | There is a risk of related false alarms.
hint       | Don't use dynamic SQL and record type together, when you would check function.
level      | warning
position   |
query      |
context    |
-[ RECORD 2 ]------------------------------------------------------------------------------
functionid | find_usable_indexes
lineno     | 52
statement  | EXECUTE
sqlstate   | 55000
message    | record "hypo_idx" is not assigned yet
detail     | The tuple structure of a not-yet-assigned record is indeterminate.
hint       |
level      | error
position   |
query      |
context    | SQL statement "SELECT 'select hypopg_get_indexdef('||hypo_idx.indexrelid||')'"
This extension can be used to find dependent objects in a function. This way, it becomes easy for administrators to understand the objects being used in a function without spending a huge amount of time reading the code. The trick is to simply pass your function as a parameter to plpgsql_show_dependency_tb() as you see in the following example.
percona=# select * from plpgsql_show_dependency_tb('find_usable_indexes()');
   type   |  oid  |   schema   |            name            |  params
----------+-------+------------+----------------------------+-----------
 FUNCTION | 16547 | public     | pg_qualstats               | ()
 FUNCTION | 16545 | public     | pg_qualstats_example_query | (bigint)
 FUNCTION | 16588 | public     | pg_stat_statements         | (boolean)
 RELATION |  2601 | pg_catalog | pg_am                      |
 RELATION |  2602 | pg_catalog | pg_amop                    |
 RELATION |  1249 | pg_catalog | pg_attribute               |
 RELATION |  1262 | pg_catalog | pg_database                |
 RELATION |  2610 | pg_catalog | pg_index                   |
 RELATION | 16480 | public     | idx_recommendations        |
 RELATION | 16549 | public     | pg_qualstats               |
 RELATION | 16559 | public     | pg_qualstats_all           |
 RELATION | 16589 | public     | pg_stat_statements         |
(12 rows)
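Since plpgsql_show_dependency_tb() returns an ordinary result set, you can filter it like any other query. For example, a small sketch based on the output above, listing only the tables the function touches:

```sql
-- Show only the relations (tables and views) a function depends on:
SELECT schema, name
FROM plpgsql_show_dependency_tb('find_usable_indexes()')
WHERE type = 'RELATION';
```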
This is one of the most interesting features. Once you have added the appropriate entries to shared_preload_libraries as discussed earlier, you can easily enable or disable profiling through a GUC: plpgsql_check.profiler. This parameter can be set either globally or for a single session. Here is an example to understand it better.
percona=# ALTER SYSTEM SET plpgsql_check.profiler TO 'ON';
ALTER SYSTEM
percona=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)
When you set it globally, all functions running in the database are automatically profiled and the statistics are stored. This may be fine in a development or testing environment, but you can also enable profiling only for functions called in a single session through a session-level setting, as you see in the following example.
percona=# BEGIN;
BEGIN
percona=# SET plpgsql_check.profiler TO 'ON';
SET
percona=# select salary_update(1000);
 salary_update
---------------

(1 row)

percona=# select lineno, avg_time, source from plpgsql_profiler_function_tb('salary_update(int)');
 lineno | avg_time | source
--------+----------+--------------------------------------------------------------------------------------------------------------
      1 |          |
      2 |          | DECLARE
      3 |          | l_abc record;
      4 |          | new_sal INT;
      5 |    0.005 | BEGIN
      6 |   144.96 | FOR l_abc IN EXECUTE 'SELECT emp_id, salary FROM employee where emp_id between 1 and 10000 and dept_id = 2'
      7 |          | LOOP
      8 |      NaN | new_sal := l_abc.salary + sal_raise;
      9 |      NaN | UPDATE employee SET salary = new_sal WHERE emp_id = l_abc.emp_id;
     10 |          | END LOOP;
     11 |          | END;
(11 rows)

--- Create an index and check if it improves the execution time of the FOR loop.

percona=# CREATE INDEX idx_1 ON employee (emp_id, dept_id);
CREATE INDEX
percona=# select salary_update(1000);
 salary_update
---------------

(1 row)

percona=# select lineno, avg_time, source from plpgsql_profiler_function_tb('salary_update(int)');
 lineno | avg_time | source
--------+----------+--------------------------------------------------------------------------------------------------------------
      1 |          |
      2 |          | DECLARE
      3 |          | l_abc record;
      4 |          | new_sal INT;
      5 |    0.007 | BEGIN
      6 |   73.074 | FOR l_abc IN EXECUTE 'SELECT emp_id, salary FROM employee where emp_id between 1 and 10000 and dept_id = 2'
      7 |          | LOOP
      8 |      NaN | new_sal := l_abc.salary + sal_raise;
      9 |      NaN | UPDATE employee SET salary = new_sal WHERE emp_id = l_abc.emp_id;
     10 |          | END LOOP;
     11 |          | END;
(11 rows)

percona=# END;
COMMIT
percona=# show plpgsql_check.profiler;
 plpgsql_check.profiler
------------------------
 on
(1 row)
In the above log, I opened a transaction block and enabled plpgsql_check.profiler inside it, so every function executed in that session is profiled, and the results can be extracted using plpgsql_profiler_function_tb(). Note that a plain SET persists for the rest of the session even after COMMIT, as the final SHOW confirms; use SET LOCAL if you want the setting to revert when the transaction ends. Once we have identified the area where the execution time is high, the immediate action is to tune that piece of code. After creating the index, I called the function again, and it performed noticeably faster than before.
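If you want to clear previously collected profile data between experiments like the one above, plpgsql_check also provides reset functions. A small sketch, assuming a plpgsql_check version that includes them (check your installed version):

```sql
-- Reset stored profile statistics for one function:
SELECT plpgsql_profiler_reset('salary_update(int)'::regprocedure);

-- Or reset statistics for all profiled functions:
SELECT plpgsql_profiler_reset_all();
```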
Special thanks to Pavel Stehule, who authored this extension, and to the contributors who have brought it to its current, usable state. It is one of the simplest extensions for checking compilation errors and dependencies, and it can be a handy profiling tool as well; a developer may find either plprofiler or plpgsql_check helpful for profiling.