So far we have discussed scalability, what is scalability, why and when you need and what are the different types of scalability. Now we are starting to get into the meat of this topic and will discuss declarative partitioning and sharding in PostgreSQL. The sharding functionality is being laid on top of declarative partitioning functionality in PostgreSQL.
Declarative partitioning was released in PostgreSQL 10, prior to declarative partitioning PostgreSQL was using table inheritance and plpgsql triggers for providing table partitioning. The example below shows how a table can be partitioned using the declarative partitioning syntax introduced in PG 10 :
Declarative partitioning basically provides the native support for partitioning in PostgresSQL, using the syntax used in the example above, the user can create partitioned tables. This would divide a table into pieces called partitions, the pieces are called partitioned tables. All rows inserted into a partitioned table will be routed to one of the partition based on the partition key.
Lot of performance improvement for declarative partitioning was added in PostgreSQL 11 that improved the code for partition pruning, partition pruning is the ability of eliminating certain partitions from the search based on the quals provided in the WHERE predicate.
Sharding in PostgreSQL
Sharding is the ability to partition a table across one or more foreign servers, with declarative partitioning as show above the table can partitioned into multiple partitioned tables living on the same database server. Sharding allows the table to be partitioned in a way that the partitions live on external foreign servers and the parent table lives on the primary node where the user is creating the sharded table. All the foreign servers that are being used in the sharded tables are PostgreSQL foreign servers, other foreign server i.e. MongoDB, MySQL etc are not supported.
The example below shows how a sharded table can be created in PostgreSQL today, we will then talk about the approach/architecture that community is following in order to add sharding. We will discuss what’s already done for built-in sharding and what the important pieces remaining and also highlight the challenges.
This is the main parent table that is created on the main server.
On the remote server, you simply create a partitioned table. This is corresponding to the parent partition table that was created in the primary server as shown below. In this example the foreign server is shard1.
On the main server the above steps of creating the postgres_fdw extension with appropriate permissions, creating the foreign server where the partitioned table will be created and creating the user mapping needs to be carried out.
On the main server the partitioned table is created as shown above, the difference between a normal partitioned table and this one is that we are specifying the foreign server. In this case the foreign server is shard_1 where we have created the partitioned table.
Using the example above the user can create a sharded table where the partitions are living on a foreign server. Please note that the partitions need to created manually on the foreign servers. Once this is setup, all the queries will routed to there specific partitions using the partition pruning logic.