The table partitioning feature in PostgreSQL has come a long way after the declarative partitioning syntax added to PostgreSQL 10. The partitioning feature in PostgreSQL was first added by PG 8.1 by Simon Rigs, it has based on the concept of table inheritance and using constraint exclusion to exclude inherited tables (not needed) from a query scan. The exclusion constraint will basically do the pruning of tables based on the query filter. Before declarative partitioning https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175f was added to PG, it meant that the child partitions, constraints and triggers etc needed to be created manually which can be cumbersome and could lead to errors. Thanks to the great work from Amit Langote and others, the declarative partitioning feature on PG-10 meant that the user don’t need to create the partitions manually or create the constraints and triggers for routing the rows to the correct partition. All the artefacts required for setting up partitioning would be done by simply creating the partition table and specify the partitions using a standard syntax, this is was great step forward and the one that makes this feature very user friendly.
This blog is about number of enhancements for partitions added to PG-13. I am going to list down all the partitioning enhancements in the blog and and will demonstrate some of them with examples.
Quick introduction to partitioning and timeline of adding partitioning features to PG before we get into the enhancements done in PG-13. Partitioning is way of splitting or dividing a large table into smaller pieces, the table can be divided using the List, Range or Hash partitioning techniques offered by PG. The parent partition table doesn’t store any data, the data is stored in the partitions defined when creating the partitioned table. While the partitions can be accessed directly, the queries are typically directed at the logical parent relation and the tuples are routed to the correct partition for inserts and update and in-case of a read query the desired partitions are scanned for executing the client query.
Here are some key partitioning feature made it to PG over the years…
Benefits of Partitioning
The are several benefits of splitting a large table into smaller pieces, the major benefits are below :
- Query performance can be increased dramatically in situations where only a single partition or small number of partitions need to access to execute a query instead of querying the entire relation.
- If insert and updates are routed to a single partition, the performance can increase significantly due to sequential scan on single partition instead of index or random scan scattered across the table.
- Bulk loads and deletion can be done by adding or removing partitions
- Large tables can be easily managed when the data is logically divided into smaller pieces
Here is complete list of partitioning enhancements added to PG-13, taken from https://www.postgresql.org/docs/13/release-13.html
- Improve cases where pruning of partitions can happen (Yuzuko Hosoya, Amit Langote, Álvaro Herrera)
- Allow partitionwise joins to happen in more cases (Ashutosh Bapat, Etsuro Fujita, Amit Langote, Tom Lane) – For example, partitionwise joins can now happen between partitioned tables even when their partition bounds do not match exactly.
- Allow BEFORE row-level triggers on partitioned tables (Álvaro Herrera) – These triggers cannot change which partition is the destination.
- Allow partitioned tables to be logically replicated via publications (Amit Langote)
- Allow logical replication into partitioned tables on subscribers (Amit Langote) – Previously, subscribers could only receive rows into non-partitioned tables.
- Allow ROW values to be used as partitioning expressions (Amit Langote)
Now lets dig into some of the enhancements from the above list :
Improve Partition-wise join to handle more cases
The first batch of partition wise join feature was added to PG in PG-11, it provides the ability to join two partition tables with matching partition boundary. PG-13 has added modifications in this area by handling more cases, for example when the partition boundary’s don’t match. Here is an example to show this behaviour…
Create the partition tables and set enable_partitionwise_join TO true, these tables will be used in all the examples shown in this blog.
postgres=# drop table foo; DROP TABLE postgres=# CREATE TABLE foo (x int, y date, z int) PARTITION BY RANGE(y); CREATE TABLE postgres=# CREATE TABLE foo_p1 PARTITION OF foo FOR VALUES FROM ('2006-02-01') TO ('2006-03-01'); CREATE TABLE postgres=# CREATE TABLE foo_p2 PARTITION OF foo FOR VALUES FROM ('2006-03-01') TO ('2006-04-01'); CREATE TABLE postgres=# CREATE TABLE foo_p3 PARTITION OF foo FOR VALUES FROM ('2006-04-01') TO ('2006-05-01'); CREATE TABLE postgres=# drop table foo2; DROP TABLE postgres=# CREATE TABLE foo2 (x int, y int, z date) PARTITION BY RANGE(z); CREATE TABLE postgres=# CREATE TABLE foo2_p1 PARTITION OF foo2 FOR VALUES FROM ('2006-02-01') TO ('2006-03-01'); CREATE TABLE postgres=# CREATE TABLE foo2_p2 PARTITION OF foo2 FOR VALUES FROM ('2006-03-01') TO ('2006-04-01'); CREATE TABLE postgres=# postgres=# set enable_partitionwise_join TO true; SET
postgres=# explain (costs off) postgres-# select * from foo f1, foo2 f2 where f1.y = f2.z and f2.z between ('2006-02-01') and ('2006-05-01'); QUERY PLAN --------------------------------------------------------------------------------------- Hash Join Hash Cond: (f1.y = f2.z) -> Append -> Seq Scan on foo_p1 f1 -> Seq Scan on foo_p2 f1_1 -> Seq Scan on foo_p3 f1_2 -> Hash -> Append -> Seq Scan on foo2_p1 f2 Filter: ((z >= '2006-02-01'::date) AND (z <= '2006-05-01'::date)) -> Seq Scan on foo2_p2 f2_1 Filter: ((z >= '2006-02-01'::date) AND (z <= '2006-05-01'::date)) (12 rows)
postgres=# explain (costs off) postgres-# select * from foo f1, foo2 f2 where f1.y = f2.z and f2.z between ('2006-02-01') and ('2006-05-01'); QUERY PLAN --------------------------------------------------------------------------------------- Append -> Hash Join Hash Cond: (f1_1.y = f2_1.z) -> Seq Scan on foo_p1 f1_1 -> Hash -> Seq Scan on foo2_p1 f2_1 Filter: ((z >= '2006-02-01'::date) AND (z <= '2006-05-01'::date)) -> Hash Join Hash Cond: (f1_2.y = f2_2.z) -> Seq Scan on foo_p2 f1_2 -> Hash -> Seq Scan on foo2_p2 f2_2 Filter: ((z >= '2006-02-01'::date) AND (z <= '2006-05-01'::date)) (13 rows)
Before ROW trigger in partitioned table
The support for before ROW tigger on partitioned table is added in PG-13, the user would get an error when trying to create a before ROW trigger on partitioned table in PG-12. The example below shows that a BEFORE ROW trigger can now be created
on partitioned tables. It also illustrates the limitation that such triggers can’t modify the row such that its partition constraint is violated, which could otherwise have been re-routed to the correct partition.
Create the trigger and trigger function.
postgres=# create or replace function trigfunc() returns trigger language plpgsql as $$ begin new.y := new.y + 1; return new; end $$; CREATE FUNCTION postgres=# postgres=# create trigger brtrig before insert on foo for each row execute function trigfunc(); ERROR: "foo" is a partitioned table DETAIL: Partitioned tables cannot have BEFORE / FOR EACH ROW triggers. postgres=#
postgres=# create or replace function trigfunc() returns trigger language plpgsql as $$ begin new.y := new.y + 1; return new; end $$; CREATE FUNCTION postgres=# create trigger brtrig before insert on foo for each row execute function trigfunc(); CREATE TRIGGER The statement below shows that the trigger causes the row to move to another partition and it results in an error. The before ROW trigger function is called on partition table which causes moving the row to another partition and results in an error. postgres=# insert into foo(y) values ('2006-02-28'); ERROR: moving row to another partition during a BEFORE FOR EACH ROW trigger is not supported DETAIL: Before executing trigger "brtrig", the row was to be in partition "public.foo_p1". postgres=#
Allow partitioned table to be replication via Publication
Prior to PG-13, partitions had to be replicated individually. Now partitioned tables can be published explicitly causing all partitions to be automatically published. Addition/removal of partitions from partitioned tables are automatically added/removed from publications. The CREATE PUBLICATION option publish_via_partition_root controls whether changes to partitions are published as their own or their ancestor’s.
“The publish_via_partition_root (boolean) parameter added to CREATE PUBLICATION determines if the changes of partition tables are replicated using their own schema or of its root partitioned table’s. The default is false meaning that the partition changes are published using their own schema. When set true, TRUNCATE operations on the individual partitions are not replicated because doing so as TRUNCATE of the root partitioned table would be wrong.
This would result in an error in an error in PG-12 :
postgres=# create publication pub for table foo; ERROR: "foo" is a partitioned table DETAIL: Adding partitioned tables to publications is not supported. HINT: You can add the table partitions individually. The publication would be created on the individual partitions : postgres=# create publication pub for table foo_p1, foo_p2; CREATE PUBLICATION postgres=# \dRp+ pub Publication pub Owner | All tables | Inserts | Updates | Deletes | Truncates -----------+------------+---------+---------+---------+----------- ahsanhadi | f | t | t | t | t Tables: "public.foo_p1" “public.foo_p2"
postgres=# create publication pub_root for table foo with (publish_via_partition_root = true); CREATE PUBLICATION postgres=# \dRp+ pub_root Publication pub_root Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root -----------+------------+---------+---------+---------+-----------+---------- ahsanhadi | f | t | t | t | t | t Tables: "public.foo"
This blog provided some insight into table partitioning in PG and timeline of how this important feature was added to PostgreSQL. The declarative partitioning syntax added in PG-10 made the feature easy to use and less error prone and also welcomed important performance and other enhancements added to partitioning in later releases. I believe that partitioning syntax in PG is pretty user friendly but it will be even more easier if we allow the partition root and partitioned to be specified in a single statement. This would make it easier to create large partitioned tables and make it even less error prone.
PG-13 added a bunch of important enhancements for partitioning, surely more enhancements will be done in future releases so stay tuned.