So far we have discussed scalability, what is scalability, why and when you need and what are the different types of scalability. Now we are starting to get into the meat of this topic and will discuss declarative partitioning and sharding in PostgreSQL. The sharding functionality is being laid on top of declarative partitioning functionality in PostgreSQL.
Declarative partitioning was released in PostgreSQL 10, prior to declarative partitioning PostgreSQL was using table inheritance and plpgsql triggers for providing table partitioning. The example below shows how a table can be partitioned using the declarative partitioning syntax introduced in PG 10 :
Declarative partitioning basically provides the native support for partitioning in PostgresSQL, using the syntax used in the example above, the user can create partitioned tables. This would divide a table into pieces called partitions, the pieces are called partitioned tables. All rows inserted into a partitioned table will be routed to one of the partition based on the partition key.
Lot of performance improvement for declarative partitioning was added in PostgreSQL 11 that improved the code for partition pruning, partition pruning is the ability of eliminating certain partitions from the search based on the quals provided in the WHERE predicate.
Sharding in PostgreSQL
Sharding is the ability to partition a table across one or more foreign servers, with declarative partitioning as show above the table can partitioned into multiple partitioned tables living on the same database server. Sharding allows the table to be partitioned in a way that the partitions live on external foreign servers and the parent table lives on the primary node where the user is creating the sharded table. All the foreign servers that are being used in the sharded tables are PostgreSQL foreign servers, other foreign server i.e. MongoDB, MySQL etc are not supported.
The example below shows how a sharded table can be created in PostgreSQL today, we will then talk about the approach/architecture that community is following in order to add sharding. We will discuss what’s already done for built-in sharding and what the important pieces remaining and also highlight the challenges.
This is the main parent table that is created on the main server.
On the remote server, you simply create a partitioned table. This is corresponding to the parent partition table that was created in the primary server as shown below. In this example the foreign server is shard1.
On the main server the above steps of creating the postgres_fdw extension with appropriate permissions, creating the foreign server where the partitioned table will be created and creating the user mapping needs to be carried out.
On the main server the partitioned table is created as shown above, the difference between a normal partitioned table and this one is that we are specifying the foreign server. In this case the foreign server is shard_1 where we have created the partitioned table.
Using the example above the user can create a sharded table where the partitions are living on a foreign server. Please note that the partitions need to created manually on the foreign servers. Once this is setup, all the queries will routed to there specific partitions using the partition pruning logic.
Ahsan Hadi is a VP of Development with HighGo Software Inc. Prior to coming to HighGo Software, Ahsan had worked at EnterpriseDB as a Senior Director of Product Development, Ahsan worked with EnterpriseDB for 15 years. The flagship product of EnterpriseDB is Postgres Plus Advanced server which is based on Open source PostgreSQL. Ahsan has vast experience with Postgres and has lead the development team at EnterpriseDB for building the core compatibility of adding Oracle compatible layer to EDB’s Postgres Plus Advanced Server. Ahsan has also spent number of years working with development team for adding Horizontal scalability and sharding to Postgres. Initially, he worked with postgres-xc which is multi-master sharded cluster and later worked on managing the development of adding horizontal scalability/sharding to Postgres. Ahsan has also worked a great deal with Postgres foreign data wrapper technology and worked on developing and maintaining FDW’s for several sql and nosql databases like MongoDB, Hadoop and MySQL.
Prior to EnterpriseDB, Ahsan worked for Fusion Technologies as a Senior Project Manager. Fusion Tech was a US based consultancy company, Ahsan lead the team that developed java based job factory responsible for placing items on shelfs at big stores like Walmart. Prior to Fusion technologies, Ahsan worked at British Telecom as a Analyst/Programmer and developed web based database application for network fault monitoring.
Ahsan joined HighGo Software Inc (Canada) in April 2019 and is leading the development teams based in multiple Geo’s, the primary responsibility is community based Postgres development and also developing HighGo Postgres server.