Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

PostgreSQL Sharding: An Overview and MongoDB Comparison

January 24, 2023

Author

Fernando Laudares Camargos

Fernando Laudares Camargos| Jobin Augustine| Avinash Vallarapu

MongoDB

PostgreSQL

Share this Post:

This post was originally published in 2019 and was updated in 2023.

I presented at Percona University São Paulo about the new features in PostgreSQL that allow the deployment of simple shards. I’ve tried to summarize the main points in this post, as well as provide an introductory overview of sharding itself. Please note I haven’t included any third-party extensions that provide PostgreSQL sharding in my discussion below.

PostgreSQL Partitioning

In a nutshell, until not long ago, there wasn’t a dedicated, native feature in PostgreSQL for table partitioning. Not that that prevented people from doing it anyway: the PostgreSQL community is very creative. There’s a table inheritance feature in PostgreSQL that allows the creation of child tables with the same structure as a parent table. That, combined with the employment of proper constraints in each child table and the right set of triggers in the parent table, has provided practical “table partitioning” in PostgreSQL for years (and still works). Here’s an example:

Using table inheritance

CREATE TABLE temperature (
  id BIGSERIAL PRIMARY KEY NOT NULL,
  city_id INT NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  temp DECIMAL(5,2) NOT NULL
);

CREATE TABLE temperature (

id BIGSERIAL PRIMARY KEY NOT NULL,

city_id INT NOT NULL,

timestamp TIMESTAMP NOT NULL,

temp DECIMAL(5,2) NOT NULL

);

Figure 1a. Main (or parent) table

CREATE TABLE temperature_201901 (CHECK (timestamp >= DATE '2019-01-01' AND timestamp <= DATE '2019-01-31')) INHERITS (temperature);
CREATE TABLE temperature_201902 (CHECK (timestamp >= DATE '2019-02-01' AND timestamp <= DATE '2019-02-28')) INHERITS (temperature);
CREATE TABLE temperature_201903 (CHECK (timestamp >= DATE '2019-03-01' AND timestamp <= DATE '2019-03-31')) INHERITS (temperature);

CREATE TABLE temperature_201901 (CHECK (timestamp >= DATE '2019-01-01' AND timestamp <= DATE '2019-01-31')) INHERITS (temperature);

CREATE TABLE temperature_201902 (CHECK (timestamp >= DATE '2019-02-01' AND timestamp <= DATE '2019-02-28')) INHERITS (temperature);

CREATE TABLE temperature_201903 (CHECK (timestamp >= DATE '2019-03-01' AND timestamp <= DATE '2019-03-31')) INHERITS (temperature);

Figure 1b. Child tables inherit the structure of the parent table and are limited by constraints

CREATE OR REPLACE FUNCTION temperature_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.timestamp >= DATE '2019-01-01' AND NEW.timestamp <= DATE '2019-01-31' ) THEN INSERT INTO temperature_201901 VALUES (NEW.*);
    ELSIF ( NEW.timestamp >= DATE '2019-02-01' AND NEW.timestamp <= DATE '2019-02-28' ) THEN INSERT INTO temperature_201902 VALUES (NEW.*);
    ELSIF ( NEW.timestamp >= DATE '2019-03-01' AND NEW.timestamp <= DATE '2019-03-31' ) THEN INSERT INTO temperature_201903 VALUES (NEW.*);
    ELSE RAISE EXCEPTION 'Date out of range!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION temperature_insert_trigger()

RETURNS TRIGGER AS $$

BEGIN

IF ( NEW.timestamp >= DATE '2019-01-01' AND NEW.timestamp <= DATE '2019-01-31' ) THEN INSERT INTO temperature_201901 VALUES (NEW.*);

ELSIF ( NEW.timestamp >= DATE '2019-02-01' AND NEW.timestamp <= DATE '2019-02-28' ) THEN INSERT INTO temperature_201902 VALUES (NEW.*);

ELSIF ( NEW.timestamp >= DATE '2019-03-01' AND NEW.timestamp <= DATE '2019-03-31' ) THEN INSERT INTO temperature_201903 VALUES (NEW.*);

ELSE RAISE EXCEPTION 'Date out of range!';

END IF;

RETURN NULL;

END;

LANGUAGE plpgsql;

Figure 1c. A function that controls in which child table a new entry should be added according to the timestamp field

CREATE TRIGGER insert_temperature_trigger
    BEFORE INSERT ON temperature
    FOR EACH ROW EXECUTE PROCEDURE temperature_insert_trigger();

CREATE TRIGGER insert_temperature_trigger

BEFORE INSERT ON temperature

FOR EACH ROW EXECUTE PROCEDURE temperature_insert_trigger();

Figure 1d. A trigger is added to the parent table that calls the function above when an INSERT is performed.

The biggest drawbacks for such an implementation were related to the amount of manual work needed to maintain such an environment (even though a certain level of automation could be achieved through the use of 3rd party extensions such as pg_partman) and the lack of optimization/support for “distributed” queries. The PostgreSQL optimizer wasn’t advanced enough to have a good understanding of partitions at the time, though there were workarounds that could be used, such as employing constraint exclusion.

Declarative partitioning

About 1.5 years ago, PostgreSQL 10 was released with many new features, including native support for table partitioning through the new declarative partitioning feature. Here’s how we could partition the same temperature table using this new method:

CREATE TABLE temperature (
  id BIGSERIAL NOT NULL,
  city_id INT NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  temp DECIMAL(5,2) NOT NULL
) PARTITION BY RANGE (timestamp);

CREATE TABLE temperature (

id BIGSERIAL NOT NULL,

city_id INT NOT NULL,

timestamp TIMESTAMP NOT NULL,

temp DECIMAL(5,2) NOT NULL

) PARTITION BY RANGE (timestamp);

Figure 2a. Main table structure for a partitioned table

CREATE TABLE temperature_201901 PARTITION OF temperature FOR VALUES FROM ('2019-01-01') TO ('2019-02-01');
CREATE TABLE temperature_201902 PARTITION OF temperature FOR VALUES FROM ('2019-02-01') TO ('2019-03-01');
CREATE TABLE temperature_201903 PARTITION OF temperature FOR VALUES FROM ('2019-03-01') TO ('2019-04-01');

CREATE TABLE temperature_201901 PARTITION OF temperature FOR VALUES FROM ('2019-01-01') TO ('2019-02-01');

CREATE TABLE temperature_201902 PARTITION OF temperature FOR VALUES FROM ('2019-02-01') TO ('2019-03-01');

CREATE TABLE temperature_201903 PARTITION OF temperature FOR VALUES FROM ('2019-03-01') TO ('2019-04-01');

Figure 2b. Tables defined as partitions of the main table; with declarative partitioning, there was no need for triggers anymore.

It still missed the greater optimization and flexibility needed to consider it a complete partitioning solution. It wasn’t possible, for example, to perform an UPDATE that would result in moving a row from one partition to a different one, but the foundation had been laid. Fast forward another year, and PostgreSQL 11 builds on top of this, delivering additional features like:

- the possibility to define a default partition, to which any entry that wouldn’t fit a corresponding partition would be added to.

- having indexes added to the main table “replicated” to the underlying partitions, which improved declarative partitioning usability.

- support for Foreign Keys

These are just a few of the features that led to a more mature partitioning solution.

PostgreSQL Sharding

By now, you might be reasonably questioning my premise and that partitioning is not sharding, at least not in the sense and context you would have expected this post to cover. In fact, PostgreSQL has implemented sharding on top of partitioning by allowing any given partition of a partitioned table to be hosted by a remote server. The basis for this is in PostgreSQL’s Foreign Data Wrapper (FDW) support, which has been a part of the core of PostgreSQL for a long time. While technically possible to implement, we couldn’t make practical use of it for sharding using the table inheritance + triggers approach. Declarative partitioning allowed for much better integration of these pieces making sharding – partitioned tables hosted by remote servers – more of a reality in PostgreSQL.

CREATE TABLE temperature_201904 (
  id BIGSERIAL NOT NULL,
  city_id INT NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  temp DECIMAL(5,2) NOT NULL
);

CREATE TABLE temperature_201904 (

id BIGSERIAL NOT NULL,

city_id INT NOT NULL,

timestamp TIMESTAMP NOT NULL,

temp DECIMAL(5,2) NOT NULL

);

Figure 3a. On the remote server, we create a “partition” – nothing but a simple table.

CREATE EXTENSION postgres_fdw;
GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw to app_user;
CREATE SERVER shard02 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'postgres', host 'shard02', port 
    '5432');
CREATE USER MAPPING for app_user SERVER shard02 
    OPTIONS (user 'fdw_user', password 'secret');

CREATE EXTENSION postgres_fdw;

GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw to app_user;

CREATE SERVER shard02 FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (dbname 'postgres', host 'shard02', port

'5432');

CREATE USER MAPPING for app_user SERVER shard02

OPTIONS (user 'fdw_user', password 'secret');

Figure 3b. On the local server, the preparatory steps involve loading the postgres_fdw extension, allowing our local application user to use that extension, creating an entry to access the remote server, and finally mapping that user with a user in the remote server (fdw_user) that has local access to the table we’ll use as a remote partition.

CREATE FOREIGN TABLE temperature_201904 PARTITION OF temperature
    FOR VALUES FROM ('2019-04-01') TO ('2019-05-01')
    SERVER remoteserver01;

CREATE FOREIGN TABLE temperature_201904 PARTITION OF temperature

FOR VALUES FROM ('2019-04-01') TO ('2019-05-01')

SERVER remoteserver01;

Figure 3c. Now it’s simply a matter of creating a proper partition of our main table in the local server that will be linked to the table of the same name in the remote server.

You can read more about postgres_fdw in Foreign Data Wrappers in PostgreSQL and a closer look at postgres_fdw.

When does it make sense to partition a table?

There are several principal reasons to partition a table:

1. When a table grows so big that searching it becomes impractical even with the help of indexes (which will invariably become too big as well).

1. When data management is such that the target data is often the most recently added and/or older data is constantly being purged/archived, or even not being searched anymore (at least not as often).

1. If you are loading data from different sources and maintaining it as a data warehousing for reporting and analytics.

1. For a less expensive archiving or purging of massive data that avoids exclusive locks on the entire table.

When should we resort to sharding?

Here are a couple of classic cases:

1. To scale out (horizontally), when even after partitioning a table, the amount of data is too great or too complex to be processed by a single server.

1. Use cases where the data in a big table can be divided into two or more segments that would benefit the majority of the search patterns. A common example of a scenario like this is that of a company whose customers are evenly spread across the United States and whose searches to a target table involves the customer ZIP code. A shard could then be used to host entries of customers located on the East Coast and another for customers on the West Coast.

1. Sharding can also be used to reduce the complexity of queries by dividing a large table into smaller, more manageable pieces. This can lead to faster query response times and improved performance for the end user.

1. In some cases, sharding can be used to address regulatory or compliance requirements by physically separating data that has different levels of sensitivity or access requirements. This can help organizations meet their legal obligations and ensure that data is stored and accessed appropriately.

Note, though, this is by no means an extensive list.

Ensure your databases are performing their best — today and tomorrow — with proactive database optimization and query tuning. Book a database assessment

How should we shard the data?

With sharding (in this context) being “distributed” partitioning, the essence of a successful (performant) sharded environment lies in choosing the right shard key – and by “right,” I mean one that will distribute your data across the shards in a way that will benefit most of your queries. In the example above, using the customer ZIP code as shard key makes sense if an application will more often be issuing queries that will hit one shard (East) or the other (West). However, if most queries would filter by, say, birth date, then all queries would need to be run through all shards to recover the full result set. This could easily backfire on performance with the shard approach, by not selecting the right shard key or simply by having such a heterogeneous workload that no shard key would be able to satisfy it.

It only ever makes sense to shard if the nature of the queries involving the target table(s) is such that distributed processing will be the norm and constitute an advantage far greater than any overhead caused by a minority of queries that rely on JOINs involving multiple shards. Due to the distributed nature of sharding, such queries will necessarily perform worse if compared to having them all hosted on the same server.

Learn more about sharding keys in our blog, Choosing a good sharding key in MongoDB (and MySQL).

Why not simply rely on replication or clustering?

Sharding should be considered when you can’t efficiently break down a big table through data normalization or use an alternative approach. Maintaining it on a single server is too demanding. The table is then partitioned, and the partitions distributed across different servers to spread the load across many servers. It doesn’t need to be one partition per shard; often, a single shard will host a number of partitions.

Note how sharding differs from traditional “share all” database replication and clustering environments: you may use, for instance, a dedicated PostgreSQL server to host a single partition from a single table and nothing else. However, these data scaling technologies may well complement each other: a PostgreSQL database may host a shard with part of a big table as well as replicate smaller tables that are often used for some sort of consultation (read-only), such as a price list, through logical replication.

What are some best practices for implementing PostgreSQL sharding?

Implementing sharding in PostgreSQL can be a complex process, but some best practices can help ensure success. Here are some important considerations and practices to follow:

1. Evaluate the need for sharding: Before implementing sharding in PostgreSQL, evaluate whether it is necessary for your application. Sharding can help improve performance and scalability, but it also adds complexity to your database architecture. Consider your application’s workload and growth projections to determine whether sharding is necessary.

1. Choose a sharding strategy: There are several sharding strategies to choose from, including horizontal, vertical, and functional sharding. Each strategy has its own advantages and disadvantages, so choose the one that best fits your application’s needs. Consider factors such as data distribution, query patterns, and hardware requirements when choosing a sharding strategy.

1. Plan for data migration: Migrating data to a sharded database can be a complex and time-consuming process, so develop a plan for migrating your data that includes testing and validation to ensure data integrity. Consider using tools such as pg_dump and pg_restore to simplify the migration process.

1. Backup and recovery strategies: Sharding can add complexity to backup and recovery processes, so it’s important to plan for these processes in advance. Consider using a backup tool that supports sharded databases. Develop a recovery plan to ensure that data can be recovered in the event of a disaster.

1. Test thoroughly: Before deploying a sharded database into production, it is essential to test it thoroughly. Test the sharding strategy, the application architecture, and disaster recovery procedures, as mentioned above.

1. Monitor and optimize performance: Sharding can help improve performance, but it also introduces new performance challenges. Monitor your database performance regularly and optimize your database as necessary. Consider using tools such as pg_stat_statements and pgAdmin to help identify performance issues and optimize your database.

Optimize your PostgreSQL database performance and scalability by implementing sharding with Percona Distribution for PostgreSQL.

How does PostgreSQL sharding relate to MongoDB sharding?

MongoDB® tackles the matter of managing big collections straight through sharding: there is no concept of local partitioning of collections in MongoDB. In fact, the whole MongoDB scaling strategy is based on sharding, which takes a central place in the database architecture. As such, the MongoDB sharding process has been made as transparent to the application as possible: all a DBA has to do is define the shard key.

Instead of connecting to a reference database server, the application will connect to an auxiliary router server named mongos, which will process the queries and request the necessary information to the respective shard. It knows which shard contains what because they maintain a copy of the metadata that maps chunks of data to shards, which they get from a config server, another important and independent component of a MongoDB sharded cluster. Together, they also play a role in maintaining good data distribution across the shards, actively splitting and migrating chunks of data between servers as needed.

In PostgreSQL, the application will connect and query the main database server. There isn’t an intermediary router like the mongos, but PostgreSQL’s query planner will process the query and create an execution plan. When data requested from a partitioned table is found on a remote server, PostgreSQL will request the data from it, as shown in the EXPLAIN output below:

…
   Remote SQL: UPDATE public.emp SET sal = $2 WHERE ctid = $1
   ->  Nested Loop  (cost=100.00..300.71 rows=669 width=118)
     	Output: emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, (emp.sal * '1.1'::double precision), emp.comm, emp.deptno, emp.ctid, salgrade.ctid
     	Join Filter: ((emp.sal > (salgrade.losal)::double precision) AND (emp.sal < (salgrade.hisal)::double precision)) ->  Foreign Scan on public.emp  (cost=100.00..128.06 rows=602 width=112)
           	Output: emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, emp.sal, emp.comm, emp.deptno, emp.ctid
…

…

Remote SQL: UPDATE public.emp SET sal = $2 WHERE ctid = $1

-> Nested Loop (cost=100.00..300.71 rows=669 width=118)

Output: emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, (emp.sal * '1.1'::double precision), emp.comm, emp.deptno, emp.ctid, salgrade.ctid

Join Filter: ((emp.sal > (salgrade.losal)::double precision) AND (emp.sal < (salgrade.hisal)::double precision)) -> Foreign Scan on public.emp (cost=100.00..128.06 rows=602 width=112)

Output: emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, emp.sal, emp.comm, emp.deptno, emp.ctid

…

Figure 4: excerpt of an EXPLAIN plan that involves processing a query in a remote server.

Note in the above query the mention of “Remote SQL.” Many optimizations have been made in the execution of remote queries in PostgreSQL 10 and 11, which contributed to maturing and improving the sharding solution. Among them is support for having grouping and aggregation operations executed on the remote server itself (“push down”) rather than recovering all rows and processing them locally.

PostgreSQL Performance Tuning: Optimizing Database Parameters for Maximum Efficiency

What is missing in PostgreSQL implementation?

There is, however, still room for improvement. Regarding remote execution, reports from the community indicate not all queries are performing as they should. For example, in some cases, the PostgreSQL planner is not performing a full push-down, resulting in shards transferring more data than required. Parallel scheduling of queries that touch multiple shards is not yet implemented: for now, the execution is taking place sequentially, one shard at a time, which takes longer to complete. When it comes to maintaining partitioned and sharded environments, changes in the structure of partitions are still complicated and impractical. For example, when you add a new partition to a partitioned table with an appointed default partition, you may need to detach the default partition first if it contains rows that would now fit in the new partition, manually move those to the new partition, and finally, re-attach the default partition back in place.

But that is all part of a maturing technology. We’re looking forward to PostgreSQL 12 and what it will bring in the partitioning and sharding fronts.

Avoid costly mistakes in your PostgreSQL strategy

Running PostgreSQL at scale isn’t just about sharding, partitioning, or performance tweaks; it’s about making choices that won’t trap you with hidden costs or vendor lock-in down the line. Too many organizations discover these lessons the hard way.

That’s why we created the Enterprise PostgreSQL Buyer’s Guide. It captures the top mistakes IT leaders make, with clear steps to avoid them, so you can build a PostgreSQL environment that’s truly enterprise-ready.

Get the Enterprise PostgreSQL Buyer’s Guide

0 0 votes

Article Rating

5 Comments

Oldest

Newest Most Voted

Kay Agahd

7 years ago

Well written and very interesting, thank you! However, you write:
“It only ever makes sense to shard if the nature of the queries involving the target table(s) is such that distributed processing will be the norm […] Due to the distributed nature of sharding such queries will necessarily perform worse if compared to having them all hosted on the same server.”
While I fully understand your point, I wonder why it shouldn’t be beneficial to have less data on each shard. So even if the query hits every shard, each shard has to work through fewer data (for 10 shards only one-tenth). Of course, to be beneficial, it requires that the query is sent to all shards in parallel, which is not yet implemented in PostgreSQL as you wrote at the very end of your article – but I think it will be implemented in a near future and already is the case for mongoDB since its early days.

Anthony

7 years ago

Do you known the extension Citus ? https://www.citusdata.com/

6 years ago

Thanks for this info. Quite useful.

Jeremy Holovacs

3 years ago

Very interesting. I hope the future of sharding will allow more dynamic assignments. In the SaaS space, regulatory commitments require most services that process PII or other customer data to remove such data from their system completely in a timely manner, and I don’t see this business need being well addressed in the market today.

Consider a multi-tenant SaaS solution where you can include a tenant_id column on every table. We should be able to set “defaults” for db storage that allows for the creation of a new partition in a new remote file to store this data. Likewise, should the customer terminate the relationship, it should be a simple matter of dropping that remote file… but we’re still pretty far from that, it looks like.