PostgreSQL’s MultiVersion Concurrency Control (MVCC) is an “advanced technique for improving database performance in a multi-user environment” according to Vadim Mikheev. This technique requires multiple “versions” of the same data tuple exist in the system governed by snapshots taken during different time periods. In other words, under such technique, it is PG’s responsibility to figure out how many tuples are
visible to the user and how many are not, according to multiple parameters such as snapshots taken, current transaction ID…etc. This is also known as
visibility check rules in PG and today in this blog, I will talk about the basic principles of visibility check so you get an idea how PG performs this task internally and I hope it would be helpful in your development work.
2. Parameters Involved?
- The tuple itself, which contains:
- xmin (the transaction ID that inserts this tuple)
- xmax (the transaction ID that deletes this tuple if > 0, otherwise it’s not deleted)
- cid (command ID)
- Global top transaction ID if available
- current snapshot, which also contains xmin, xmax and cid
- Commit Log Data (CLOG)
3. The Process
3.1 Check the Hintbit
The visibility process starts by checking the hintbit. If hintbit has a state of
COMMITTED, then we can skip most of the other visibility checking rules for better efficiency. With the tuple committed, PG then take this tuple’s xmin value to check with the current snapshot to ensure this tuple is not currently
in progress. (See 3.3 below for formula to determine
in progress). This check is required because there is a chance that this tuple, though committed, is still being updated by other backend processes at this very moment. If the tuple is not currently
in progress by other backends, it will finally check its
xmax value to ensure that it is invalid. Invalid means it has not been deleted. When all of the above are true, this tuple is considered visible.
If hintbit indicates
ABORTED, then this tuple is considered invisible to user. If hintbit has no value, then PG should continue with the rest of visibility check rules.
3.2 Check If Tuple’s xmin Equal to Global Top Transaction ID
The next check involves taking tuple’s xmin and compare with global top transaction ID to see if they are equal. This global top transaction ID will only be set when user starts the transaction manually with
begin clause. If user does not starts the transaction this way (no
begin clause is issued), then this global top transaction ID will not be set and therefore this checking will be skipped.
If tuple xmin equals global top transaction ID, this means the tuple is currently
in progress by the current backend, not others. This is where the command ID (cid) will be used to determine the visibility. Within a transaction block, each command issued has an associated command ID to indicate what command comes first and what comes later. For example, if a SELECT comes after an UPDATE, within a transaction, then the SELECT must see the new tuple updated by the previous UPDATE clause and cid plays a part in this determination. This behavior is also governed by the
isolation level which is unfortunately out of scope of this blog. The default isolation level is
READ COMMITTED, which makes the SELECT sees the data changed by previous UPDATE; but if the isolation is set to
REPEATABLE READ then the SELECT will not see the data changed by previous UPDATE. Please keep this in mind.
If tuple’s xmin does not equal to global top transaction ID, then PG should continue with the rest of visibility
3.3 Check the Current Snapshot
The next check involves the PG to check if this tuple is currently
in progress by other backend processes. This is done by comparing the tuple’s xmin value against current snapshot’s xmin and xmax values according to this formula.
- tuple xmin < snapshot xmin = not in progress
- tuple xmin >= snapshot xmax = in progress
If this tuple is considered
in progress by the snapshot, then this tuple must not be visible to the current user because other backends are still working on it. If this tuple is NOT considered
in progress, then PG should continue with the rest of visibility check rules.
3.4 Check the Commit Log (CLOG)
This next check involves the PG to take tuple’s xmin to check against the CLOG to see if this tuple has been committed or aborted. CLOG is like an array of transaction IDs and each array element stores a commit status. PG has a formula to convert a transaction ID into a CLOG block plus an offset to precisely access the right CLOG element. This CLOG data structure is regularly flushed to disk with checkpoint process and it is located in these 3 directories:
If CLOG says the given tuple is committed, then PG will continue to check the tuple’s
xmax value to ensure that it is invalid. Invalid means it has not been deleted and therefore it is visible to user. At the same time, PG will also update the hintbit value of this tuple to
COMMITTED such that in the next visibility checking, PG will not access CLOG again, which is rather expensive.
If CLOG says the given tuple is aborted or invalid, then this tuple is not visible to the current user. At the same time, it will also update the hintbit value of this tuple to
The above processes are the very basic visibility rules within PG according to my understanding. Of course there are other complex checking not mentioned here, such as the involvement of sub transactions and multi transactions, but they do follow somewhat similar pattern to what’s mentioned today. I hope this blog would be helpful for you.
Cary is a Senior Software Developer in HighGo Software Canada with 8 years of industrial experience developing innovative software solutions in C/C++ in the field of smart grid & metering prior to joining HighGo. He holds a bachelor degree in Electrical Engineering from University of British Columnbia (UBC) in Vancouver in 2012 and has extensive hands-on experience in technologies such as: Advanced Networking, Network & Data security, Smart Metering Innovations, deployment management with Docker, Software Engineering Lifecycle, scalability, authentication, cryptography, PostgreSQL & non-relational database, web services, firewalls, embedded systems, RTOS, ARM, PKI, Cisco equipment, functional and Architecture Design.