SQL Optimizations in PostgreSQL: IN vs EXISTS vs ANY/ALL vs JOIN

April 16, 2020

Author

Jobin Augustine

Whether to use IN, EXISTS, ANY/ALL, or JOIN is one of the most common questions from developers writing SQL queries against PostgreSQL. There are multiple ways to structure subqueries or lookups, and PostgreSQL’s optimizer is quite effective at transforming them into equivalent, efficient plans.

Let’s walk through an example using the pgbench schema.

Note: pgbench is a benchmarking tool included with PostgreSQL. You can initialize sample data with:

pgbench -i -s 10

Update some sample data:

UPDATE pgbench_branches SET bbalance = 4500000 WHERE bid IN (4, 7);
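
A quick sanity check can confirm the sample data before running the comparisons. This is a minimal sketch that assumes a freshly initialized pgbench schema at scale factor 10, where every branch starts with a zero balance, so only branches 4 and 7 should show a non-zero value afterwards.

```sql
-- List every branch with its balance.
-- Assumption: a fresh "pgbench -i -s 10" run, so after the UPDATE only
-- bid 4 and bid 7 are expected to have a non-zero bbalance.
SELECT bid, bbalance
FROM pgbench_branches
ORDER BY bid;
```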

Inclusion Queries

Goal: Find the number of accounts per branch where branch balance is greater than zero.

1. Using IN

SELECT count(aid),bid FROM pgbench_accounts
WHERE bid IN (SELECT bid FROM pgbench_branches WHERE bbalance > 0)
GROUP BY bid;

2. Using ANY

SELECT count(aid),bid FROM pgbench_accounts
WHERE bid = ANY(SELECT bid FROM pgbench_branches WHERE bbalance > 0)
GROUP BY bid;

3. Using EXISTS

SELECT count(aid),bid
FROM pgbench_accounts
WHERE EXISTS (
  SELECT bid FROM pgbench_branches
  WHERE bbalance > 0
  AND pgbench_accounts.bid = pgbench_branches.bid
)
GROUP BY bid;

4. Using INNER JOIN

SELECT count(aid),a.bid
FROM pgbench_accounts a
JOIN pgbench_branches b ON a.bid = b.bid
WHERE b.bbalance > 0
GROUP BY a.bid;

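On this data, PostgreSQL produces the same execution plan for all four approaches. Note that the INNER JOIN form returns the same result as the others only because bid is unique in pgbench_branches (it is the primary key); if the joined column had duplicates, the join would multiply rows.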

HashAggregate
  -> Hash Join
       -> Seq Scan on pgbench_accounts
       -> Seq Scan on pgbench_branches (Filter: bbalance > 0)

This means you can typically choose the syntax you prefer.
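
One way to check this on your own system is to compare the plan shapes directly. The sketch below uses EXPLAIN with COSTS OFF for two of the four variants; the exact nodes you see may differ from the plan above depending on PostgreSQL version, table statistics, and settings such as parallel query.

```sql
-- COSTS OFF hides cost and row estimates so the plan shapes are easy to compare.
EXPLAIN (COSTS OFF)
SELECT count(aid), bid
FROM pgbench_accounts
WHERE bid IN (SELECT bid FROM pgbench_branches WHERE bbalance > 0)
GROUP BY bid;

-- The join form of the same question; compare its plan with the one above.
EXPLAIN (COSTS OFF)
SELECT count(aid), a.bid
FROM pgbench_accounts a
JOIN pgbench_branches b ON a.bid = b.bid
WHERE b.bbalance > 0
GROUP BY a.bid;
```

If the two outputs differ, that is a signal to look closer before assuming the syntaxes are interchangeable for your data.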

Exclusion Queries

Goal: Find the number of accounts per branch, excluding branches with a positive balance.

1. Using NOT IN

SELECT count(aid),bid FROM pgbench_accounts
WHERE bid NOT IN (SELECT bid FROM pgbench_branches WHERE bbalance > 0)
GROUP BY bid;

2. Using <> ALL

SELECT count(aid),bid FROM pgbench_accounts
WHERE bid <> ALL(SELECT bid FROM pgbench_branches WHERE bbalance > 0)
GROUP BY bid;

3. Using NOT EXISTS

SELECT count(aid),bid
FROM pgbench_accounts
WHERE NOT EXISTS (
  SELECT bid FROM pgbench_branches
  WHERE bbalance > 0
  AND pgbench_accounts.bid = pgbench_branches.bid
)
GROUP BY bid;

4. Using LEFT JOIN

SELECT count(aid),a.bid
FROM pgbench_accounts a
LEFT JOIN pgbench_branches b
  ON a.bid = b.bid AND b.bbalance > 0
WHERE b.bid IS NULL
GROUP BY a.bid;

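NOT EXISTS and LEFT JOIN produce better execution plans (hash anti-joins), while NOT IN and <> ALL are typically evaluated as subplans. The reason is NULL handling: NOT IN and <> ALL can never be true if the subquery returns a NULL, so the planner cannot treat them as plain anti-joins.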
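
The NOT IN and NOT EXISTS forms are not always interchangeable, because they treat NULLs differently. The following self-contained sketch uses an inline VALUES list (a made-up three-row set, not the pgbench tables) to show the difference.

```sql
-- 3 is not in the list, but the list contains a NULL.
-- "3 <> NULL" is unknown, so the whole NOT IN condition is unknown
-- and the WHERE clause filters the row out: this returns no rows.
SELECT 'not in matched' AS result
WHERE 3 NOT IN (SELECT x FROM (VALUES (1), (2), (NULL)) AS v(x));

-- NOT EXISTS only asks whether a matching row exists.
-- No row has x = 3, so the condition is true: this returns one row.
SELECT 'not exists matched' AS result
WHERE NOT EXISTS (
  SELECT 1 FROM (VALUES (1), (2), (NULL)) AS v(x) WHERE v.x = 3
);
```

If the subquery column can contain NULLs, decide which behavior you actually want before rewriting one form into the other.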

Large Subquery Considerations

When the subquery result is small enough to fit in work_mem, PostgreSQL handles NOT IN reasonably well with a hashed subplan. When it does not fit, the planner falls back to a plain subplan and performance degrades significantly:

CREATE TABLE t1 AS SELECT * FROM generate_series(0, 500000) id;
CREATE TABLE t2 AS SELECT (random() * 4000000)::integer id FROM generate_series(0, 4000000);

EXPLAIN SELECT id FROM t1 WHERE id NOT IN (SELECT id FROM t2);

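Without a hashed subplan, the subquery result is materialized and scanned again for each row of t1, which results in a very high estimated cost and poor runtime performance.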
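
For comparison, the same question can be asked with NOT EXISTS, which gives the planner the option of an anti-join. This is a sketch against the t1 and t2 tables created above; it returns the same rows as the NOT IN query only because t2.id contains no NULLs in this test data.

```sql
-- Refresh statistics so the planner sees the real table sizes.
ANALYZE t1;
ANALYZE t2;

-- Same intent as the NOT IN query, written so an anti-join is possible.
-- Equivalent here only because t2.id has no NULL values.
EXPLAIN
SELECT id FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.id = t1.id);
```

Comparing the estimated cost of this plan with the NOT IN plan on the same tables makes the difference easy to see.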

Datatype Conversion Considerations

Different syntax can introduce implicit casts:

EXPLAIN ANALYZE SELECT * FROM emp WHERE gen = ANY(ARRAY['M','F']);

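Here the array literal resolves to text[]. If gen is a char(1) column, for example, PostgreSQL casts the column to text for every row, which adds overhead and can keep an index on gen from being used.

Using IN lets PostgreSQL resolve the literals to the column's own type, which avoids the unnecessary cast: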

SELECT * FROM emp WHERE gen IN ('M','F');
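
To observe the cast yourself, you can compare the Filter lines in the plans. The sketch below uses a hypothetical temporary table, emp_demo, with gen declared as char(1); the definition of the emp table is not shown here, so treat the column type as an assumption.

```sql
-- Hypothetical stand-in for the emp table (assumed: gen is char(1)).
CREATE TEMP TABLE emp_demo (id integer, gen char(1));

-- An ARRAY[...] constructor over untyped string literals resolves to text[].
SELECT pg_typeof(ARRAY['M','F']);

-- With a text[] array, the Filter line should show gen being cast to text.
EXPLAIN (COSTS OFF)
SELECT * FROM emp_demo WHERE gen = ANY(ARRAY['M','F']);

-- Casting the array to the column's type keeps the comparison in char(1).
EXPLAIN (COSTS OFF)
SELECT * FROM emp_demo WHERE gen = ANY(ARRAY['M','F']::char(1)[]);
```

The explicit array cast is one option when ANY with an array is required, for example when the list arrives as a single bound parameter.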

Summary

PostgreSQL often optimizes different query styles into the same plan
NOT EXISTS and LEFT JOIN are generally safer for exclusion queries
NOT IN works well for small subqueries but can degrade badly when the subquery result is large
Be aware of implicit datatype conversions
Always validate with EXPLAIN

General approach:

Identify required tables
Determine joins
Minimize rows in joins

Never assume performance based on small datasets; test at scale.
