In PostgreSQL, an “orphan large object” refers to a large object (LOB) that exists in the pg_largeobject system table but is not associated with any specific row in a user-defined table. This situation can occur due to various reasons, such as when the large object was created, but its reference in the user table was subsequently deleted without properly removing the large object itself.
Identifying orphan large objects
Orphan large objects in PostgreSQL can be identified by examining entries in the pg_largeobject system table. The object identifier (loid) does not have a corresponding reference in any user-defined tables that typically store or reference large objects. This discrepancy indicates that the large object exists in the database system but is not linked to any specific data in user tables where it should logically reside.
1 2 3 4 5 6 7 8 9 |
postgres=# d pg_catalog.pg_largeobject Table "pg_catalog.pg_largeobject" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- loid | oid | | not null | pageno | integer | | not null | data | bytea | | not null | Indexes: "pg_largeobject_loid_pn_index" UNIQUE, btree (loid, pageno) |
Removing orphan large objects
In PostgreSQL, the management of large objects does not include automatic cleanup. Database administrators (DBAs) are responsible for manually removing orphaned large objects, ensuring efficient storage utilization and database performance. This article will guide you through the process of identifying and removing orphaned large objects, showcasing the use of PostgreSQL’s vacuumlo utility for this specific task.
Let’s create some tables with large objects and populate them with data.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
postgres=# create table appointment_acptd (name text, image oid,apt_date date not null); CREATE TABLE postgres=# insert into appointment_acptd values('A.Deb',lo_import ('/tmp/signature.jpeg'),'2024-06-24'); INSERT 0 1 postgres=# insert into appointment_acptd values('S.Patil',lo_import ('/tmp/signature.jpeg'),'2024-06-25'); INSERT 0 1 postgres=# insert into appointment_acptd values('S.Kaliyal',lo_import ('/tmp/signature.jpeg'),'2024-06-26'); INSERT 0 1 postgres=# select * from appointment_acptd; name | image | apt_date -----------+-------+------------ A.Deb | 16461 | 2024-06-24 S.Patil | 16462 | 2024-06-25 S.Kaliyal | 16463 | 2024-06-26 (3 rows) postgres=# create table appointment (name text, image oid, apt_date date not null); CREATE TABLE postgres=# insert into appointment values('A.Danial',lo_import ('/tmp/signature.jpeg'),'2024-06-25'); INSERT 0 1 postgres=# select * from appointment; name | image | apt_date ----------+-------+------------ A.Danial | 16470 | 2024-06-25 (1 rows) postgres=# create table appointment_rej (name text, image oid,apt_date date not null); CREATE TABLE postgres=# insert into appointment_rej values('A.Deb',lo_import ('/tmp/signature.jpeg'),'2024-06-24'); INSERT 0 1 postgres=#insert into appointment_rej values('S.Patil',lo_import ('/tmp/signature.jpeg'),'2024-06-25'); INSERT 0 1 postgres=# insert into appointment_rej values('S.Kaliyal',lo_import ('/tmp/signature.jpeg'),'2024-06-26'); INSERT 0 1 test=# select * from appointment_rej; name | image | apt_date -----------+-------+------------ A.Deb | 16526 | 2024-06-24 S.Patil | 16527 | 2024-06-25 S.Kaliyal | 16528 | 2024-06-2 (3 rows) |
One can identify orphan large objects by performing a join operation between the pg_largeobject system table and all user-defined tables that store large objects. From the below query output it can be seen that there is no orphan larger object as of now.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
postgres=# with t as (select distinct attrelid::regclass::varchar as table_name,attname as column_name, atttypid::regtype as type from pg_attribute a join pg_class c on a.attrelid=c.oid WHERE a.attnum > 0 AND NOT a.attisdropped AND a.attrelid = c.oid AND a.atttypid::regtype::varchar in ('oid', 'lo') AND c.relkind in ('r', 'm') AND c.relnamespace::regnamespace::varchar !~ '^pg_' limit 10), t2 as (select string_agg(distinct 'SELECT distinct ' || column_name || ' from ' || table_name,' UNION ') as v_sql from t) SELECT ' select distinct loid from pg_largeobject EXCEPT (' || v_sql || ');' from t2; ?column? ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ select distinct loid from pg_largeobject EXCEPT (SELECT distinct image from appointment UNION SELECT distinct image from appointment_acptd UNION SELECT distinct image from appointment_rej); (1 row) postgres=# select distinct loid from pg_largeobject EXCEPT (SELECT distinct image from appointment UNION SELECT distinct image from appointment_acptd UNION SELECT distinct image from appointment_rej); loid ------ (0 rows) |
Additionally, PostgreSQL’s vacuumlo utility offers a dry run feature that allows administrators to inspect orphaned large objects within the database without actually performing the cleanup. This functionality enables proactive checking of the database environment to identify orphaned large objects.
1 2 3 4 5 6 7 |
postgres=# ! vacuumlo -v -n postgres Password: Connected to database "postgres" Test run: no large objects will be removed! Checking image in public.appointment_acptd Checking image in public.appointment Would remove 0 large objects from database "postgres". |
Now to intentionally create an orphaned large object, we will drop one table and subsequently recreate it.
1 2 3 4 5 6 |
postgres=# drop table appointment; DROP TABLE postgres=# create table appointment (name text, image oid, apt_date date not null); CREATE TABLE postgres=# insert into appointment values('A.Danial',lo_import ('/tmp/signature.jpeg'),'2024-06-25'); INSERT 0 1 |
Let’s begin by running a dry run with vacuumlo to verify the presence of any orphaned large objects.
1 2 3 4 5 6 |
postgres=# ! vacuumlo -v -n postgres Connected to database "postgres" Test run: no large objects will be removed! Checking image in public.appointment_acptd Checking image in public.appointment Would remove 1 large objects from database "postgres". |
Additionally, we’ll query pg_largeobject to confirm the presence of orphaned large objects.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
postgres=# with t as (select distinct attrelid::regclass::varchar as table_name,attname as column_name, atttypid::regtype as type from pg_attribute a join pg_class c on a.attrelid=c.oid WHERE a.attnum > 0 AND NOT a.attisdropped AND a.attrelid = c.oid AND a.atttypid::regtype::varchar in ('oid', 'lo') AND c.relkind in ('r', 'm') AND c.relnamespace::regnamespace::varchar !~ '^pg_' limit 10), t2 as (select string_agg(distinct 'SELECT distinct ' || column_name || ' from ' || table_name,' UNION ') as v_sql from t) SELECT ' select distinct loid from pg_largeobject EXCEPT (' || v_sql || ');' from t2; ?column? ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ select distinct loid from pg_largeobject EXCEPT (SELECT distinct image from appointment UNION SELECT distinct image from appointment_acptd UNION SELECT distinct image from appointment_rej); postgres=# select distinct loid from pg_largeobject EXCEPT (SELECT distinct image from appointment UNION SELECT distinct image from appointment_acptd UNION SELECT distinct image from appointment_rej); loid ------- 16470 (1 row) |
We have identified one orphaned large object. To resolve this, we need to execute vacuumlo.
1 2 3 4 5 6 |
postgres=# ! vacuumlo -v postgres Password: Connected to database "test" Checking image in public.appointment_acptd Checking image in public.appointment Successfully removed 1 large objects from database "postgres". |
Verify that there are no longer any orphaned objects.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
postgres=# with t as (select distinct attrelid::regclass::varchar as table_name,attname as column_name, atttypid::regtype as type from pg_attribute a join pg_class c on a.attrelid=c.oid WHERE a.attnum > 0 AND NOT a.attisdropped AND a.attrelid = c.oid AND a.atttypid::regtype::varchar in ('oid', 'lo') AND c.relkind in ('r', 'm') AND c.relnamespace::regnamespace::varchar !~ '^pg_' limit 10), t2 as (select string_agg(distinct 'SELECT distinct ' || column_name || ' from ' || table_name,' UNION ') as v_sql from t) SELECT ' select distinct loid from pg_largeobject EXCEPT (' || v_sql || ');' from t2; ?column? ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ select distinct loid from pg_largeobject EXCEPT (SELECT distinct image from appointment UNION SELECT distinct image from appointment_acptd UNION SELECT distinct image from appointment_rej); (1 row) postgres=# select distinct loid from pg_largeobject EXCEPT (SELECT distinct image from appointment UNION SELECT distinct image from appointment_acptd UNION SELECT distinct image from appointment_rej); loid ------ (0 rows) |
1 2 3 4 5 6 |
postgres=# ! vacuumlo -v -n postgres Connected to database "test" Test run: no large objects will be removed! Checking image in public.appointment_acptd Checking image in public.appointment Would remove 0 large objects from database "postgres". |
Conclusion
PostgreSQL’s vacuumlo utility plays a crucial role in maintaining database efficiency by assisting in the cleanup of orphaned large objects. This utility enables administrators to reclaim storage space and optimize performance by systematically removing large objects that are no longer associated with any user-defined tables. By regularly employing vacuumlo, DBAs can ensure that their PostgreSQL databases remain streamlined and responsive.
Our PostgreSQL Performance Tuning eBook condenses years of database expertise into a practical guide for optimizing your PostgreSQL databases. Inside, you’ll discover our most effective PostgreSQL performance strategies derived from real-world experience.
Download now: Elevate your PostgreSQL Performance