There is a lot of interest and discussions lately in the PostgreSQL world to make it a scale-out solution. Among other possible solutions, one of the most promising ones is to implement the sharding using FDW and table partitioning for distributing the data on multiple servers. As of now, PostgreSQL can only fetch the data from FDW in serial and that is one thing that needs to be improved to increase the performance. Other than that performance improvement currently we are missing two major features in PostgreSQL to make it full fledge distributed database. These are related to the ACID property, atomic commit, and atomic visibility.
This article is to explain what is atomic commit and atomic visibility and why we need it in PostgreSQL to make it a full fledge scale-out database solution.
When we talk about database system that consists of multiple servers the first thing that comes in mind is two-phase commit protocol (2PC). Using a 2PC protocol we can achieve an atomic commit for transactions involving multiple servers (distributed transactions). As the name of the two-phase commit protocol suggests that, 2PC commits the transaction in two steps rather that one. In the first step, the transaction is prepared across all the participating servers and depending upon the result of prepare the second step performs the commit prepared or rollback prepared. Dividing the commit into two steps helps in building the consensus among all nodes if every node can commit the transaction or not, before performing the actual commit. But unlike simple transactions, the prepared transactions have a very interesting property that they are not bound to the database session and keep persisting even when the session that created that transaction dies.
This very property of the prepared transactions makes them very powerful and usable for 2PC. As to implement 2PC we don’t want to see a prepared transaction vanished (because of session disconnection or even a server crash) once after we get confirmation from the server that it is good to commit the transaction ( transaction prepared successfully). At the same time, this session disassociation property of prepared transactions comes with its own management overheads. As we all know what kind of disasters the long-running transactions can cause so we want to make sure in case of session disconnection or server crash we don’t leave any prepared transaction unattended we always require a transaction manager or let’s call it a global transaction manager when working with 2PC and prepared transactions. Now in a distributed database system, we can hook an external global transaction manager and it can also be built inside the database core.
Although the two-phase commit (2PC) can provide the write consistency of transactions involving multiple servers, it does not ensure the atomic visibility that is also a must-have for a distributed ACID compliance database system.
What is atomic visibility?
Let us first look at what actually is the atomic visibility. The purpose of atomic visibility is to ensure that when a global transaction is started, The changes done by the other transactions committing at the same point time are either fully visible to this transaction or are totally invisible to it. Meaning that if a global transaction T1 makes changes on
SERVER-2 gets committed at the exact time when a global transaction
T2 was starting up. The transaction
T2 should either see the changes done by
T1 on both servers (
SERVER-2) or otherwise see none of the changes made by
T1 on any of the servers.
In MVCC (Multiversion concurrency control) methodology (used by PostgreSQL) the database system keeps the multiple copies of each data item, and each user connected to the database sees a snapshot of the database at a particular instant in time. With the snapshot isolation technique, a transaction observes a state of the data as when the transaction started and all the changes made by other concurrently running transactions are invisible. With a single node database implementing a snapshot is as simple as getting the list of in-progress-transactions, and a list of committed transactions at the time of the snapshot. But when we want to implement the snapshot isolation and atomic visibility in a distributed database system we require global snapshots. A synchronized database snapshot among all participating database nodes (In case of cross-node transactions), which ensures visibility and isolation with respect to other concurrent global and local transactions.
In an ideal world if we have all the clock of all the participating nodes synchronized at the atomic level, getting global snapshots would be as simple as local snapshots. All we require to do would be to broadcast the transaction start and end times to all participating nodes and that would be enough to calculate the visibility and provide proper data isolation. But unfortunately, the world we live in is not perfect and it’s hard to get the clock synchronization at the atomic level. So implementing global transactions is a little harder than they actually appear to be. And if we talk about PostgreSQL it is even more difficult to implement the global snapshot functionality using the currently implemented transaction and snapshot algorithms without hurting the performance.
There are various ways and methods we can use to implement global snapshots in PostgreSQL. One way would be to use a centralized transaction ID (XID) generator, as done by Postgres-XC. But that is not very robust in terms of performance.
Another approach could be to use commit sequence numbers (CSN) based snapshots and implement clock-SI for catering the time skew. The latter one is more promising in terms of adoption and performance but requires lots of effort to implement.
Where are we on these in the community?
There have been some patches submitted to the PostgreSQL hackers in recent times for implementing the CSN based snapshots, Global snapshots using the clock-SI approach and also for the built-in global transaction manager.
GTM and distributed transactions
The need and interest of companies working with PostgreSQL are growing each day to make PostgreSQL a fully working scale-out database system. We here at HighGo are also very keen to get these two features committed in the PostgreSQL core and are working on both atomic commit and atomic visibility features leveraging the work done by the authors.
Muhammad Usama is a database architect / PostgreSQL consultant at HighGo Software and also Pgpool-II core committer. Usama has been involved with database development (PostgreSQL) since 2006, he is the core committer for open source middleware project Pgpool-II and has played a pivotal role in driving and enhancing the product. Prior to coming to open source development, Usama was doing software design and development with the main focus on system-level embedded development. After joining the EnterpriseDB, an Enterprise PostgreSQL’s company in 2006 he started his career in open source development specifically in PostgreSQL and Pgpool-II. He is a major contributor to the Pgpool-II project and has contributed to many performance and high availability related features.