Transactions in PostgreSQL and their mechanism

Enterprise PostgreSQL Solutions

Comments are off

Transactions in PostgreSQL and their mechanism

Transaction is the most basic concept of a database. Using begin and end command in PostgreSQL can start and commit a transaction. Of course, this is the most common PostgreSQL transaction. In addition, there are sub transaction, multi transaction, 2pc transaction concepts in PostgreSQL. In this blog, I will demonstrate the emergence scenario and kernel implementation of these PostgreSQL transactions.

Normal Transaction

Using the PostgreSQL client to connect to the PostgreSQL server, transaction auto-commit is enabled by default, that is to say, every DML executed will automatically complete the commit process. You can use the \set AUTOCOMMIT off command to turn off auto-commit, or you can use the begin command to turn on a transaction block.

\echo :AUTOCOMMIT
create table t1(i int);
insert into t1 values(1);
rollback;
select * from t1;

\set AUTOCOMMIT off
\echo :AUTOCOMMIT
insert into t1 values(2);
rollback;
select * from t1;
insert into t1 values(22);
commit;
select * from t1;
commit;

\set AUTOCOMMIT on
\echo :AUTOCOMMIT
begin;
insert into t1 values(3);
rollback;
select * from t1;
begin;
insert into t1 values(33);
commit;
select * from t1;
postgres=# \echo :AUTOCOMMIT
on
postgres=# create table t1(i int);
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# rollback;
2020-06-08 15:15:50.261 CST [29689] WARNING:  there is no transaction in progress
WARNING:  there is no transaction in progress
ROLLBACK
postgres=# select * from t1;
 i 
---
 1
(1 row)

postgres=# 
-- When auto commit is on, the DML statement we executed has been successfully committed, and rollback has no impact on us
postgres=# \set AUTOCOMMIT off
postgres=# \echo :AUTOCOMMIT
off
postgres=# insert into t1 values(2);
INSERT 0 1
postgres=# rollback;
ROLLBACK
postgres=# select * from t1;
 i 
---
 1
(1 row)

postgres=# insert into t1 values(22);
INSERT 0 1
postgres=# commit;
COMMIT
postgres=# select * from t1;
 i  
----
  1
 22
(2 rows)

postgres=#
-- When auto commit is closed, all DML statements executed by us are rolled back or committed
postgres=# commit;
COMMIT
postgres=# 
postgres=# \set AUTOCOMMIT on
postgres=# \echo :AUTOCOMMIT
on
postgres=# begin;
BEGIN
postgres=# insert into t1 values(3);
INSERT 0 1
postgres=# rollback;
ROLLBACK
postgres=# select * from t1;
 i  
----
  1
 22
(2 rows)

postgres=# begin;
BEGIN
postgres=# insert into t1 values(33);
INSERT 0 1
postgres=# commit;
COMMIT
postgres=# select * from t1;
 i  
----
  1
 22
 33
(3 rows)

postgres=#
-- In the begin transaction block, the DML statement we execute can be rolled back or committed

Now we use the pageinspact tool to look at the data in the t1 table

postgres=# select lp,t_xmin,t_xmax,t_ctid,t_infomask,t_data from heap_page_items(get_raw_page('t1',0));
 lp | t_xmin | t_xmax | t_ctid | t_infomask |   t_data   
----+--------+--------+--------+------------+------------
  1 |    625 |      0 | (0,1)  |       2304 | \x01000000
  2 |    626 |      0 | (0,2)  |       2560 | \x02000000
  3 |    627 |      0 | (0,3)  |       2304 | \x16000000
  4 |    628 |      0 | (0,4)  |       2560 | \x03000000
  5 |    629 |      0 | (0,5)  |       2304 | \x21000000
(5 rows)
postgres=# select ctid,xmin,* from t1;
 ctid  | xmin | i  
-------+------+----
 (0,1) |  625 |  1
 (0,3) |  627 | 22
 (0,5) |  629 | 33
(3 rows)

postgres=#

Through the ‘pageinspact’ tool, you can see that the t1 table has 5 pieces of data, but the query result shows that t1 has 3 pieces of data. According to the above operations, transaction 626 and transaction 628 are rolled back. Therefore, the MVCC mechanism of PostgreSQL judges the commit status of transaction 626 and transaction 628, so the two records with t_ctid = (0,2) and t_ctid= (0,4) are not visible. We will not discuss the huge mechanism of MVCC here. Our focus is on how to obtain the commit status of each transaction in PostgreSQL.

There be pg_xact folderin pgdata directory of PostgreSQL:

movead@movead-PC:/h2/data/pg_xact$ ll
-rw-------+ 1 movead movead 8192 6月   8 15:23 0000
movead@movead-PC:/h2/data/pg_xact$

The file here records the commit status of the transaction. 2bit can record the status of a transaction, so a byte can record four transactions. Here, the size of each file is specified as 32 * BLCKSZ in the kernel. As shown above, the ‘0000’ file can hold transaction bewteen 1 and (32*BLCKSZ*4). Therefore, PostgreSQL can get the commit status of a transaction from pg_xact directory.

Subtransaction

The subtransaction is accompanied by a savepoint or the function with an exception. Let’s use savepoint as an example:

postgres=# truncate t1;
TRUNCATE TABLE
postgres=# select txid_current();
 txid_current 
--------------
        10495
(1 row)

postgres=# begin;
BEGIN
postgres=# select txid_current();
 txid_current 
--------------
        10496
(1 row)

postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# savepoint s1;
SAVEPOINT
postgres=# insert into t1 values(2);
INSERT 0 1
postgres=# savepoint s2;
SAVEPOINT
postgres=# insert into t1 values(3);
INSERT 0 1
postgres=# savepoint s3;
SAVEPOINT
postgres=# insert into t1 values(4);
INSERT 0 1
postgres=# select txid_current();
 txid_current 
--------------
        10496
(1 row)

postgres=# commit;
COMMIT
postgres=# select ctid,xmin,* from t1;
 ctid  | xmin  | i 
-------+-------+---
 (0,1) | 10496 | 1
 (0,2) | 10497 | 2
 (0,3) | 10498 | 3
 (0,4) | 10499 | 4
(4 rows)

postgres=#

We can see that the four records we inserted in the same transaction block have different transaction ids, and the first transaction 10496 is parent transaction and 10497,10498,10499 are subtransactions. Subtransaction is also a special common transaction,they also have commit status and will write the status to ‘pg_xact’ directory. After parent transaction commits or aborts, the behavior of these subtransactions are same with normal transactions.

The difference is the commit process, all parent and child transactions should be marked as commit or abort (the transaction in the middle of rollback can not be regarded as child transaction). To ensure the atomicity of the whole transaction, PostgreSQL designs a commit mechanism for the commit of the subtransaction.

subtrans_commit

1.Mark subtransaction status as sub-committed(TRANSACTION_STATUS_SUB_COMMITTED)

2.Mark parent transaction status as committed(TRANSACTION_STATUS_COMMITTED)

3.Mark subtransaction status as committed(TRANSACTION_STATUS_COMMITTED)

The method to determine whether a subtransaction is committed as below:

If the subtransaction status showed in pg_xact directory is sub-committed, then you need to go to find the parent transaction ID in pg_subtrans directory, and judge the commit status of the child transaction according to the commit status of parent transaction; If the subtransaction status showed in the pg_xact directory is ‘committed’, then the subtransaction is in committed state; In addition, subtransaction is uncommitted.

Finally, let’s take a look at the pg_subtrans directory that records the correspondence between subtransactions and parent transactions:

movead@movead-PC:/h2/data/pg_subtrans$ ll
-rw-------+ 1 movead movead 49152 6月   8 17:33 0000
movead@movead-PC:/h2/data/pg_subtrans$

In this file, it uses 4byte to record the parent transaction ID for every subtransaction. Here, the size of the files is specified as 32*BLCKSZ in the kernel, Each such file can hold 32**BLCKSZ/ 4 transactions

Multi-transaction

Multiple transactions occur because of row-level locks. When multiple sessions add row-level locks to the same row, multiple transactions occur. Below we use an example to demonstrate the emergence of multiple transactions.

-- Create a test table and insert data, and use the pageinspact tool to view the distribution of data on the disk
postgres=# create table t1(i int, j text);
CREATE TABLE
postgres=# insert into t1 values(1,'PostgreSQL');
INSERT 0 1
postgres=# insert into t1 values(2,'Postgres');
INSERT 0 1
postgres=# insert into t1 values(3,'pg');
INSERT 0 1
postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data              
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2050 |    500 |      0 | \x0100000017506f737467726553514c
 (0,2)  |           2 |       2050 |    501 |      0 | \x0200000013506f737467726573
 (0,3)  |           2 |       2050 |    502 |      0 | \x03000000077067
(3 rows)

postgres=#
-- Open a transaction block in a session, and then lock a row
postgres=# begin;
BEGIN
postgres=# select * from t1 where i = 2 for share;
 i |    j     
---+----------
 2 | Postgres
(1 row)

postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data              
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2306 |    500 |      0 | \x0100000017506f737467726553514c
 (0,2)  |           2 |        466 |    501 |    504 | \x0200000013506f737467726573
 (0,3)  |           2 |       2306 |    502 |      0 | \x03000000077067
(3 rows)
postgres=# select txid_current();
 txid_current 
--------------
          504
postgres=#

Here we can see that the t_xmax of t_ctid=(0,2) has changed.

-- In another session, do the same
postgres=# begin;
BEGIN
postgres=# select * from t1 where i = 2 for share;
 i |    j     
---+----------
 2 | Postgres
(1 row)

postgres=# select txid_current();
 txid_current 
--------------
          505
(1 row)

postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data              
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2306 |    500 |      0 | \x0100000017506f737467726553514c
 (0,2)  |           2 |       4562 |    501 |      2 | \x0200000013506f737467726573
 (0,3)  |           2 |       2306 |    502 |      0 | \x03000000077067
(3 rows)

postgres=#

We found that the t_xmax with t_ctid=(0, 2) becomes 2, this 2 is a multi-transaction, indicating that there are multiple transactions (504, 505) locked this line, whether multi-transaction 2 commit depends on t_infomask Value and (504,505) commit status. Let’s explore how the 2 and (504,505) in PostgreSQL are connected.

movead@movead-PC:/h2/data/pg_multixact$ tree .
.
├── members
│   └── 0000
└── offsets
    └── 0000

2 directories, 2 files
movead@movead-PC:/h2/data/pg_multixact$

There are files as shown above in the data directory, where the mapping relationship between multiple transactions and their corresponding transaction lists is stored. The attributes of transactions in the transaction list determine the commit status of multiple transactions.

mutitransaction

2PC Transaction

The two-phase commit (2PC) transaction is a necessary condition for implementing distributed commits. Similarly, the following will demonstrate what a 2PC transaction is and how PostgreSQL implements a 2PC transaction.

postgres=# begin;
BEGIN
postgres=# insert into t1 values(1,'PostgreSQL');
INSERT 0 1
postgres=# 
postgres=# select * from t1;
 i |     j      
---+------------
 1 | PostgreSQL
(1 row)

postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data              
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2050 |    537 |      0 | \x0100000017506f737467726553514c
(1 row)

postgres=# prepare transaction 'test_2pc_trans';
PREPARE TRANSACTION
postgres=# select * from t1;
 i | j 
---+---
(0 rows)

postgres=# commit prepared 'test_2pc_trans';
COMMIT PREPARED
postgres=# select * from t1;
 i |     j      
---+------------
 1 | PostgreSQL
(1 row)

postgres=#

Execute the above SQL in the same session, prepare transaction ‘test_2pc_trans’ is the first stage commit, commit prepared’test_2pc_trans’ is the second stage commit.

After the first-phase commit, we have exited the transaction block, and the transaction has not been fully committed, so we cannot query the data inserted in the transaction block. After the second-phase commit, the 2PC transaction has been completed, so the data is retrieved again.

2pc_commit

The 2PC transaction is also a special ordinary transaction. The judgment of the commit status of the 2PC transaction is the same as the normal transaction, except that the 2PC transaction has a ‘self-protection mechanism’: the 2PC transaction can survive independent of the session connection, even a database restart can not affect the 2PC transaction. In order to implement this protection mechanism, PostgreSQL will create a file to save transaction data for long-term 2PC transactions:

movead@movead-PC:/h2/data/pg_twophase$ ll
-rw-------+ 1 movead movead 252 6月   9 16:17 00000219
movead@movead-PC:/h2/data/pg_twophase$

The above file 00000219 is the storage file created for the 2PC transaction 537. If PostgreSQL restarts, it will load all 2PC transactions from the pg_twophase directory into memory.

Summary

In this blog, the use of PostgreSQL transactions, sub-transactions, multi-transactions, 2PC transactions and the characteristics of each transaction and its storage method are recorded from a relatively simple perspective.