Recently, some people are asking why the wal of the database is consumed so quickly. This blog gives a simple way to analyze where the wal log is consumed, and you can get the reason for wal expansion.
Note: There are two understandings about wal expansion: 1. Too many LSNs are consumed in a period; 2. There are too many wal files in the $DATA/pg_wal directory. This blog is only to illustrate the first situation.
The pg_waldump is a built-in PostgreSQL tool, and we generally use it to simply view the content of wal. This tool can also be used to count the length of various wal types, as shown in the figure below:
Let’s take a brief look at how to understand this statistical result:
1.1 Command interpretation
./pg_waldump -p ../walback -s 0/A000000 -e 0/58000000 -stats=record
-p: Specify the directory that wal saves
-s: Specifies the starting LSN
-e: Specifies the end LSN
-stats=record: Specifies how the resolution results are rendered.
Of course, pg_waldump tool also has other command combinations, which can be studied if you are interested.
1.2 Interpretation of output results
The type column is divided into red and green boxes. The red box part is the classification of wal type, and the green box part is the specific wal log type. For example, under the category of heap, there are wal types such as insert, update, delete, etc. For details, please refer to the source code
Represents the number of times a wal type occurs, as shown in the figure, 18474 times of insert. The percentage column after n column indicates the proportion of this type of wal, such as 18474 / 486623 ≈ 3.8%.
③Record size column、FPI size column、Combined size column
The total length occupied by a certain wal type is displayed in the Combined size column, where the length of FPI data is in the FPI size column, and the length of non-FPI data is in the Record size column. Then follow the percentage of each of them.
For example, the wal type INSERT has a total length of 1503132, in which the length of FPI data is 40364, the length of non-FPI data is 1462768.
Note: FPI refers to the backup of a page in a wal record.
2-Statistical information analysis
From the above pictures, we can see that
MULTI_INSERT+INIT wal type occupied the most length of wal (955429108 ≈ 911m). Those who are familiar with this type of wal will know that using COPY to import a large amount of data will produce. Now we know the main composition data of pointed wal logs.
However, if you want to use this analysis method, you need to understand the meaning of various wal types. The most direct way to understand these is to learn the source code
rmgrlist.h. Here we introduce some wal types that may consume a lot of wal logs.
checksum for pages or
wal_log_hints parameter, a large number of such wal types will be generated and a lot of wal logs will be consumed.
When the index page is initialized and the table is reorganized, such as vacuum full or cluster, this wal type will be generated in large quantities.
This type usually appears when a large amount of data is imported, XLOG_HEAP2_MULTI_INSERT+INIT means that the page is initialized when data is inserted.
This type is generated when vacuum cleans up table garbage tuples.
Wal log generated when VM file of data page changes.
The most common type of wal, this is needless to say.
Index related wal logs can also lead to wal expansion.
The wal log generated by the sequence does not appear to cause the wal to expand.
RM_XACT_ID、RM_SMGR_ID、RM_CLOG_ID、RM_DBASE_ID、RM_TBLSPC_ID、RM_MULTIXACT_ID、RM_RELMAP_ID、RM_REPLORIGIN_ID、RM_GENERIC_ID、RM_LOGICALMSG_ID RM_COMMIT_TS_ID classification
Generally, it will not cause wal expansion too much.
Let’s run the pg_waldump and check the items.
Also, you may find that the statistical total size, such as 1302829755 in the above figure, may be very different from the total size of the wal section file you entered. I think, pg_waldump do not count wal consume caused by
select pg_switch_wal(), maybe a bug of pg_waldump. Therefore, if
archive_timeout is configured may cause some wal consumption not to be counted by pg_waldump tool.
In this blog, we propose a statistical method of wal log consumption, and propose some cases that will generate a lot of wal. I believe that after you understand the wal type and its proportion, you will know what kind of operations your database is carrying out in large quantities, and you can even adjust the database parameters according to these data to seek better performance.
Movead.Li is kernel development of Highgo Software. Since joining into Highgo Software in 2016, Movead takes the most time on researching the code of Postgres and is good at ‘Write Ahead Log’ and ‘Database Backup And Recovery’. Base on the experience Movead has two open-source software on the Postgres database. One is Walminer which can analyze history wal file to SQL. The other one is pg_lightool which can do a single table or block recovery base on base backup and walfiles or walfiles only.
Now he has joined the HighGo community team and hopes to make more contributions to the community in the future.