When using logical replication with PostgreSQL, the wal level needs to be set to ‘logical’, so the logical level wal contains more data to support logical replication than the replicate wal level. I think it’s unclear to many users or DBAs about the difference between logical and replicate level. This blog is about the difference of logical level and replicate level, which the reader can use to evaluate or analyze the impact of changing to logical level on wal log bloat.
1. Principle of Logical replication – the mission of Logical
The principle of logical replication is introduced in detail in this . In short words logical replication is to read the wal generated by DML statement, parse it into a tuple, and send it to the subscriber. The subscriber receives a tuple and updates the tuple accordingly. For example, parsing the wal record generated by an UPDATE statement produces a tuple-old and a tuple-new. The subscriber looks for the same row in the target table as the tuple-old and updates it as the tuple-new. Of course there’s only tuple-new for INSERT statement , and tuple-old for DELETE statement.
So the mission og logical level is help logical replicate to acquire tuple-new and tuple-old.
2. The way DML statements are recorded at different Wal levels
2.1 For tuple-new
INSERT, UPDATE, or COPY statements INSERT new rows on the data page. The data for the new row is typically recorded directly in the wal record generated by the statement. However, there is a special case where if the wal record also has an FPI, then the FPI is used in the wal record to record the data for the new row instead of recording it separately.
The current logical decode code does not support reading data from the FPI (only reading data from separate storage areas). This may be an existing improvement point, and perhaps the original author had more to think about. So in order for logical replication to get new data from wal, a new piece of data is redundantly stored under logical level.
When updating a record, only a few fields are typically updated. Therefore, PG prefers to record only the updated data when it is logging tuple-new at the REPLICATE level. The strategy for PG is to skip data that hasn’t changed before or after tuple-new. Assuming that we update a row (1,2,3,4) to (1,20,3,4), the replicate level is 4(20)8 for the newly generated row in wal. The meaning is that 20 is the changed data, and there are 4 bytes before 20 that are unchanged compared to tuple-old, and there are 8 bytes after 20 that are unchanged compared to tuple-old. This design can greatly reduce the size of the wal, thus reducing the expansion of the wal.
However, for logical replication, it may not be able to obtain an old-tuple, so it is impossible to concoct a complete tuple new from data such as 4(20)8, so the logical level needs to change. As shown in the figure below, complete tuple- New data is directly recorded.
2.2 For tuple-old
The REPLICATE level wal log doesn’t care what the exact data of tuple-old is, it just records the location of the tuple-old in the data page.
However, the actual data of tuple -old cannot be obtained according to CTID in logical replication. Therefore, logical replication requires the wal record to have the specific data of tuple-old.
Above we can see in the wal records the old data, but according to the configuration values according to different tables (NOTHING, DEFAULT, FULL, INDEX) abundance of data records are not the same.
DEFAULT： Record the primary key if it exists.
FULL：Record the whole tuple-old
INDEX：Record the index column.
3.1 Transaction Information
Logical level adds database OID, database default tablespace OID,2PC GID, and data source information to the wal generated by transaction commit.
3.2 Truncate Information
At the logical level, the truncate table operator will be recorded.
In this blog, I noted the differences between wal’s logical level and replicate level, and summarized the following differences:
How to handle tuple-new if there be a FPI in wal record
How to handle tuple-new for UPDATE record
Differences in the markup for the tuple-old
Other less important differences
In addition, some additional information has been added to the development version to send long transaction data to the subscriber in advance.
Movead.Li is kernel development of Highgo Software. Since joining into Highgo Software in 2016, Movead takes the most time on researching the code of Postgres and is good at ‘Write Ahead Log’ and ‘Database Backup And Recovery’. Base on the experience Movead has two open-source software on the Postgres database. One is Walminer which can analyze history wal file to SQL. The other one is pg_lightool which can do a single table or block recovery base on base backup and walfiles or walfiles only.
Now he has joined the HighGo community team and hopes to make more contributions to the community in the future.