Some Interesting statistics about PG-14 contributions

I have spent a few days trawling through the features added to PostgreSQL 14, and in this blog I want to share some statistics about PG-14 contributions that I hope you will find interesting. Please note that this data is based on my own research into PG-14 contributions: going through the git log, commitfest entries, hackers mailing list threads, the PG-14 release notes on the PostgreSQL website, etc. So please allow a margin for human error in case the numbers aren’t 100% accurate.

The release notes for PG-14 are here:

https://www.postgresql.org/docs/14/release-14.html#id-1.11.6.5.3

PG-14 features/contributions are divided into the following categories; this doesn’t include the changes committed to support migration from previous releases to PG-14. Overall, PG-14 has around 167 small, medium and large features. This is a major PostgreSQL release with an overwhelming number of new features. Compared to PG-13, which had around 137 features, I believe PG-14 is well ahead of previous PG releases in terms of new functionality.

Some of my top picks

PG-14 has lots of features that can be highlighted. The beta release page https://www.postgresql.org/about/news/postgresql-14-beta-1-released-2213/ contains the highlights of the release in the areas of performance, data types, replication, security and administration.

It is difficult to select the top features, since the release contains a number of important and large features. PG-14 is a feature-centric release, with some features that required major overhauling/refactoring of the PG codebase.

Here are my top 7 picks from the PG-14 feature set. For each, I have provided the commitfest entry or mailing list thread and part of the commit message to introduce the feature.

1- Asynchronous execution of postgres_fdw Append node (https://commitfest.postgresql.org/32/2491/) This implements asynchronous execution, which runs multiple parts of an Append node concurrently rather than serially to improve performance when possible.

2- Improving connection scalability: GetSnapshotData() (https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de, https://commitfest.postgresql.org/29/2500/) This feature deals with performance issues that occur with workloads that have to handle a large number of connections. Some of these issues can be mitigated by connection poolers, but that is not enough in most cases.

3- Overhaul UPDATE/DELETE processing (making update/delete of inheritance trees scale better) (https://commitfest.postgresql.org/32/2575/) This is a major feature in PG-14 that overhauls update/delete processing for partitioned tables. The feature makes the plan that produces the tuples to be updated emit only the columns that are actually updated.

4- Logical streaming for large in-progress transactions (https://commitfest.postgresql.org/29/1927/) This feature adds support for streaming in-progress transactions to the built-in logical replication.

5- Add support for multirange data types (https://commitfest.postgresql.org/31/2112/) Support for multirange data types is added in this release. Multiranges are basically sorted arrays of non-overlapping ranges with set-theoretic operations defined over them.

6- Allow btree index additions to remove expired index entries to prevent page splits (https://www.postgresql.org/message-id/flat/CAH2-Wzm+maE3apHB8NOtmM=p-DO65j2V5GzAWCOEEuy3JZgb2g@mail.gmail.com) The main benefit of this feature is reduced index bloat on tables whose indexed columns are frequently updated. It also teaches btree to delete old duplicate versions from unique indexes.

7- postgres_fdw batching (https://commitfest.postgresql.org/31/2620/) One of the issues regularly raised by users/customers is that inserting into tables sharded using FDWs is rather slow. Some of the slowness/overhead is expected, due to the latency between machines in the sharded setup. Batching sends multiple rows to the remote server per round trip, which reduces that overhead.
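To make the multirange description in pick 5 a bit more concrete, here is a minimal Python sketch of the idea: a multirange modeled as a sorted list of non-overlapping ranges, with union and intersection defined over it. This is only an illustration of the concept, not how PostgreSQL implements it. The function names, the use of integer bounds, and the half-open [lower, upper) convention are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumption for this sketch: a range is a half-open integer interval [lower, upper).
Range = Tuple[int, int]


def normalize(ranges: List[Range]) -> List[Range]:
    """Return a sorted list of non-overlapping ranges.

    Empty ranges are dropped; overlapping or touching ranges are merged.
    """
    merged: List[Range] = []
    for lo, hi in sorted(r for r in ranges if r[0] < r[1]):
        if merged and lo <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def union(a: List[Range], b: List[Range]) -> List[Range]:
    """Set union of two multiranges."""
    return normalize(a + b)


def intersection(a: List[Range], b: List[Range]) -> List[Range]:
    """Set intersection of two multiranges, using a two-pointer sweep."""
    a, b = normalize(a), normalize(b)
    out: List[Range] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    m = normalize([(5, 8), (1, 3), (2, 4)])
    print(m)
    print(union(m, [(4, 5)]))
    print(intersection(m, [(3, 6)]))
```

Keeping the list normalized (sorted and non-overlapping) is what lets the set operations run as simple linear sweeps.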

PostgreSQL Contributions for PG-14 (Breakdown by Author)

This chart contains the breakdown of PG-14 contributions by author. The data for the chart is collected from the PG-14 release notes, which mention the main author(s) of every contribution. It doesn’t include the fixes added to PG-14 for backward compatibility and to aid migration from the previous release.

It is important to note that while this chart shows the number of PG-14 contributions per author, it does not reflect the amount of effort put in by each contributor. For example, some of the features that have gone into PG-14 are really heavy lifting, like the overhaul of update/delete processing, asynchronous execution, and improving connection scalability, whereas someone else might have contributed several features that amount to less code and less complexity.
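For readers who want to reproduce this kind of tally, the counting step can be sketched roughly as follows. This is not the exact process I used; it assumes each release-note entry is a single line of text ending with a parenthesized, comma-separated list of authors, and the sample entries and author names below are placeholders, not real data.

```python
import re
from collections import Counter
from typing import Iterable, List


def authors_of(entry: str) -> List[str]:
    """Extract author names from a trailing '(Name, Name)' group.

    Assumption: the authors are the last parenthesized group on the line.
    """
    match = re.search(r"\(([^()]*)\)\s*$", entry)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def tally(entries: Iterable[str]) -> Counter:
    """Count one contribution per listed author per entry."""
    counts: Counter = Counter()
    for entry in entries:
        counts.update(authors_of(entry))
    return counts


if __name__ == "__main__":
    # Placeholder entries for illustration only.
    sample = [
        "Hypothetical feature one (Author A)",
        "Hypothetical feature two (Author B, Author A)",
        "Hypothetical feature three (Author C)",
    ]
    for name, count in tally(sample).most_common():
        print(name, count)
```

Note that this counts every listed author equally, which is exactly why the raw numbers say nothing about the size or difficulty of each contribution.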

PostgreSQL Contributions for PG-14 (Breakdown by Company)

This chart contains the breakdown of PG-14 contributions by the author’s company. The data for the chart is collected from the PG-14 release notes, and the company information for each author is gathered from various sources. EnterpriseDB tops the chart for PG-14 contributions; the merger with 2ndQuadrant certainly helps there. It is followed by Crunchy Data, NTT, Fujitsu and other companies.

The same principle applies here: a company may have only a few contributions, but those contributions may be major, large features.

PostgreSQL Contributions for PG-14 (Breakdown by Country)

This chart contains the breakdown of PG-14 contributions by country. The data for the chart is collected from the PG-14 release notes. The country is in most cases the country of the author, or the country where the author’s company resides.

Conclusion

In this blog, I have tried to share some knowledge that I gained while researching the PG-14 contributions. I want to reiterate that PG-14 is a great PG release, as it brings a wide variety of important features for performance, monitoring, administration, replication, etc. I am sure PG customers will be excited to start using this awesome release.

I am sure there are many more exciting and robust features to come in the upcoming releases.