Job scheduling has always played an important role in database management. For PostgreSQL, pgAgent and pg_cron are already popular job schedulers. However, there is another job scheduler, pg_timetable, which is completely database-driven and provides a couple of advanced concepts. In this blog, we highlight some of its main features, its installation, and a use case.
There are currently two options on how you can install and run pg_timetable.
2.1. Download and install Go on your system.
2.2. Clone the pg_timetable repository:
```
$ git clone https://github.com/cybertec-postgresql/pg_timetable.git
$ cd pg_timetable
```
2.3. Run pg_timetable:
```
$ go run main.go --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpassword
```
2.4. Alternatively, build a binary and run it:
```
$ go build
$ ./pg_timetable --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpassword
```
As a use case, we will use pg_timetable to schedule a job that refreshes a materialized view every day at 12:00 (noon), PostgreSQL server time.
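To see why such a refresh job is needed at all: a materialized view stores the result of its query and does not follow later changes to the base table until it is refreshed. The sketch below mimics that behaviour in plain Python, without touching PostgreSQL. The names t_demo and mat_view mirror the ones used in the walkthrough, and the aggregate() helper is a hypothetical stand-in for the view's GROUP BY query.

```python
from collections import defaultdict


def aggregate(rows):
    """Mimic SELECT grp, avg(data), count(*) FROM t_demo GROUP BY 1."""
    totals = defaultdict(lambda: [0.0, 0])
    for grp, data in rows:
        totals[grp][0] += data
        totals[grp][1] += 1
    return {grp: (total / count, count) for grp, (total, count) in totals.items()}


t_demo = [(1, 0.25), (1, 0.75)]   # base table with one group
mat_view = aggregate(t_demo)      # like CREATE MATERIALIZED VIEW: the result is stored

t_demo += [(2, 0.5), (2, 0.5)]    # a later INSERT adds a second group
print(sorted(mat_view))           # [1]    the stored result is stale

mat_view = aggregate(t_demo)      # like REFRESH MATERIALIZED VIEW: recompute
print(sorted(mat_view))           # [1, 2]
```

The scheduled job in the steps that follow plays the role of the second aggregate() call.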
1. Get the pg_timetable executable (follow step 2 in the installation section above).
2. Make sure the PostgreSQL server is up and running and has a login role with the CREATE privilege on the target database, e.g.:
```
postgres=# CREATE ROLE scheduler LOGIN PASSWORD '***********';
postgres=# GRANT CREATE ON DATABASE postgres TO scheduler;

postgres=# CREATE TABLE t_demo (grp int, data numeric);
CREATE TABLE
postgres=# INSERT INTO t_demo SELECT 1, random()
FROM generate_series(1, 5000000);
INSERT 0 5000000

postgres=# CREATE MATERIALIZED VIEW mat_view AS
SELECT grp, avg(data), count(*)
FROM t_demo
GROUP BY 1;
SELECT 1

postgres=# ALTER MATERIALIZED VIEW mat_view OWNER TO scheduler;
postgres=# GRANT SELECT ON mat_view TO scheduler;

-bash-4.2$ psql
psql (12.16)
Type "help" for help.

postgres=# SELECT * FROM mat_view;
 grp |           avg            |  count
-----+--------------------------+---------
   1 | 0.5001807958659610956005 | 5000000
(1 row)

postgres=# INSERT INTO t_demo SELECT 2, random()
postgres-# FROM generate_series(1, 5000000);
INSERT 0 5000000
```
3. Create a new job that refreshes the materialized view every day at 12:00 (noon) in the PostgreSQL server's time zone. The cron expression '0 12 * * *' means minute 0 of hour 12; to run at midnight, use '0 0 * * *' instead.
```
postgres=# SELECT timetable.add_job('refresh-matview', '0 12 * * *', 'REFRESH MATERIALIZED VIEW public.mat_view');
 add_job
---------
       1
(1 row)
```
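The schedule passed to add_job is a five-field cron expression: minute, hour, day of month, month, and day of week. As a rough illustration of how '0 12 * * *' is read, here is a minimal Python sketch of a cron matcher. It is an assumption-laden simplification, not pg_timetable's own parser: the cron_matches helper is hypothetical and only understands '*' and single integers, not ranges, lists, or steps.

```python
from datetime import datetime


def cron_matches(expr: str, moment: datetime) -> bool:
    """Check a five-field cron expression against a timestamp.

    Simplified: each field is either '*' or a single integer.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    # Cron counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    actual = [moment.minute, moment.hour, moment.day, moment.month, cron_weekday]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


noon = datetime(2023, 9, 10, 12, 0)
midnight = datetime(2023, 9, 10, 0, 0)

print(cron_matches("0 12 * * *", noon))      # True
print(cron_matches("0 12 * * *", midnight))  # False
print(cron_matches("0 0 * * *", midnight))   # True
```

In other words, the hour field decides between a noon and a midnight run: 12 selects noon and 0 selects midnight.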
4. Run pg_timetable:
```
[centos@ip-172-31-32-10 pg_timetable]$ ./pg_timetable --dbname=postgres --clientname=worker001 --user=scheduler --password=********
2023-09-09 11:59:20.929 [INFO] [sid:697146069] Starting new session...
2023-09-09 11:59:20.941 [INFO] Database connection established
2023-09-10 12:00:00.961 [INFO] Accepting asynchronous chains execution requests...
2023-09-10 12:00:00.970 [INFO] [count:0] Retrieve scheduled chains to run @reboot
2023-09-10 12:00:00.991 [INFO] [count:3] Retrieve scheduled chains to run
2023-09-10 12:00:00.994 [INFO] [count:0] Retrieve interval chains to run
2023-09-10 12:00:00.019 [INFO] [chain:1] Starting chain
2023-09-10 12:00:00.722 [INFO] [chain:1] [task:1] [txid:2613] Starting task
2023-09-10 12:00:00.074 [INFO] [chain:1] [task:1] [txid:2613] Starting task
2023-09-10 12:00:00.141 [INFO] [chain:1] [task:1] [txid:2613] Closing remote session
2023-09-10 12:00:00.141 [INFO] [chain:1] [task:1] [txid:2613] Task executed successfully
2023-09-10 12:00:00.185 [INFO] [chain:1] [txid:2613] Chain executed successfully
```
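Each line of this output carries a timestamp, a level, optional [key:number] tags such as chain, task, and txid, and a message. If you want to filter such output by chain or task, a small parser can help. The following Python sketch assumes exactly the layout seen in the sample above; other pg_timetable versions or log settings may print something different, and the parse_line helper is purely illustrative.

```python
import re

# Layout assumed from the sample output: timestamp, [LEVEL], optional [key:number] tags, message.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<level>[A-Z]+)\] "
    r"(?P<tags>(?:\[[a-z]+:\d+\] )*)"
    r"(?P<message>.*)$"
)
TAG = re.compile(r"\[([a-z]+):(\d+)\]")


def parse_line(line: str) -> dict:
    """Split one log line into timestamp, level, tags, and message."""
    match = LOG_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    return {
        "timestamp": match["ts"],
        "level": match["level"],
        "tags": {key: int(value) for key, value in TAG.findall(match["tags"])},
        "message": match["message"],
    }


sample = (
    "2023-09-10 12:00:00.141 [INFO] [chain:1] [task:1] [txid:2613] "
    "Task executed successfully"
)
record = parse_line(sample)
print(record["tags"])     # {'chain': 1, 'task': 1, 'txid': 2613}
print(record["message"])  # Task executed successfully
```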
On its first start, pg_timetable creates the schema it needs, named timetable. For reference, its catalog structure is shown below.
```
postgres=# \dn
  List of schemas
   Name    |   Owner
-----------+-----------
 public    | postgres
 timetable | scheduler
(2 rows)

postgres=# set search_path to timetable ;
SET
postgres=# \dt
              List of relations
  Schema   |      Name      | Type  |   Owner
-----------+----------------+-------+-----------
 timetable | active_chain   | table | scheduler
 timetable | active_session | table | scheduler
 timetable | chain          | table | scheduler
 timetable | execution_log  | table | scheduler
 timetable | log            | table | scheduler
 timetable | migration      | table | scheduler
 timetable | parameter      | table | scheduler
 timetable | task           | table | scheduler
(8 rows)
```
5. The database logs confirm that the materialized view is refreshed on schedule.
Output from DB logs:
```
2023-09-10 12:00:00 UTC [14334] LOG: statement: REFRESH MATERIALIZED VIEW public.mat_view
```
From the psql prompt:
```
-bash-4.2$ psql
psql (12.16)
Type "help" for help.

postgres=# SELECT * FROM mat_view;
 grp |            avg            |  count
-----+---------------------------+---------
   1 |  0.5001807958659610956005 | 5000000
   2 | 0.50000009110202547559215 | 5000000
(2 rows)
```
Below is the output from the pg_timetable catalog tables.
```
postgres=# select * from active_session ;
 client_pid | server_pid | client_name |          started_at
------------+------------+-------------+------------------------------
  697146069 |      20137 | worker001   | 2023-09-10 11:59:205672+00
(1 row)
```
```
postgres=# select * from chain;
-[ RECORD 1 ]-------+--------------------
chain_id            | 1
chain_name          | refresh-matview
run_at              | 0 12 * * *
max_instances       |
timeout             | 0
live                | t
self_destruct       | f
exclusive_execution | f
client_name         |
on_error            |
```
```
postgres=# select * from execution_log where chain_id=1;
-[ RECORD 1 ]-------------------------------------------------------------------
chain_id    | 1
task_id     | 1
txid        | 2613
last_run    | 2023-09-10 12:00:00.137404+00
finished    | 2023-09-10 12:00:00.586543+00
pid         | 697146069
returncode  | 0
kind        | SQL
command     | REFRESH MATERIALIZED VIEW public.mat_view
output      | REFRESH MATERIALIZED VIEW
client_name | worker001
```
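The last_run and finished columns of execution_log together tell you how long a task took. As a small worked example, the Python sketch below computes that duration from the two values shown above. The helper names are hypothetical, and the timestamp format is assumed to be the one psql printed here.

```python
import re
from datetime import datetime


def parse_pg_timestamp(value: str) -> datetime:
    """Parse a timestamp such as '2023-09-10 12:00:00.137404+00'."""
    # PostgreSQL prints a whole-hour offset as "+00"; pad it to "+0000" for %z.
    if re.search(r"[+-]\d{2}$", value):
        value += "00"
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f%z")


def run_duration(last_run: str, finished: str) -> float:
    """Return the task run time in seconds."""
    delta = parse_pg_timestamp(finished) - parse_pg_timestamp(last_run)
    return delta.total_seconds()


seconds = run_duration(
    "2023-09-10 12:00:00.137404+00",
    "2023-09-10 12:00:00.586543+00",
)
print(f"{seconds:.6f} s")  # 0.449139 s
```

So the refresh in this run took a little under half a second.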
In conclusion, pg_timetable is open source and free for everyone to use. Its main advantage is that it runs as an independent process, written in Go, which connects to PostgreSQL just like any other client program, so a scheduler crash will not harm your database server. pg_timetable also provides a variety of built-in tasks that can be flexibly combined. And because it is implemented in Go, it ships as a single executable that can be started directly, with no libraries or dependencies to worry about during installation.
Please refer to the links below to learn more about pg_timetable.
https://github.com/cybertec-postgresql/pg_timetable
https://www.cybertec-postgresql.com/en/products/pg_timetable/
https://pg-timetable.readthedocs.io/en/master/README.html
