First contact with the pg_filedump

Enterprise PostgreSQL Solutions

During my most recent expedition of going through PostgreSQL source code, I ran into some mysterious variables in the page structure of some indexes. I will explain later why i am calling these mysterious but first have a look at these variables…

typedef struct HashPageOpaqueData
{
...
uint16 hasho_page_id; /* for identification of hash indexes */
} HashPageOpaqueData;

or

typedef struct SpGistPageOpaqueData
{
...
uint16 spgist_page_id; /* for identification of SP-GiST indexes */
} SpGistPageOpaqueData;

You may find that in Postgres’s kernel code, they just give them a constant, but they don’t really use them anywhere (well as far as i have looked). That was my reason for calling these variables mysterious but finally found a clue in the code comment of the macro definition SPGIST_PAGE_ID.

The page ID is for the convenience of pg_filedump and similar utilities, which otherwise would have a hard time telling pages of different index types apart. 

So here comes my reason for writing this blog, this is about the pg_filedump utility, it will provide with an introduction to the utility and demo of basic functionality it provides.

Understanding and using pg_filedump

pg_filedump is a utility to format PostgreSQL heap/index/control files into a human-readable form. 

So without further ado, let’s dig into it by first operating it according to the introduction in the README to intuitively understand its functions:

Create test data:

CREATE TABLE test(id int, name text);

INSERT INTO test values(001, 'Aerith'), (002, 'Cloud'), (003, 'Tifa');

View file path:

SELECT pg_relation_filepath(‘test’);
pg_relation_filepath
———————-
base/14236/164333
(1 row)

Important: remember to execute checkpoint to ensure that the file contents are written to disk.

Using pg_filedump viewing files:

% pg_filedump $PGDATA/base/14236/164333

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: $PGDATA/base/14236/164333
* Options used: None
*******************************************************************

Block    0 ********************************************************
<Header> -----
Block Offset: 0x00000000         Offsets: Lower      36 (0x0024)
Block: Size 8192 Version    4           Upper    8072 (0x1f88)
LSN: logid      0 recoff 0x618242d8     Special  8192 (0x2000)
Items:    3                     Free Space: 8036
Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 ()
Length (including item array): 36

<Data> -----
Item   1 -- Length:   35 Offset: 8152 (0x1fd8) Flags: NORMAL
Item   2 -- Length:   34 Offset: 8112 (0x1fb0) Flags: NORMAL
Item   3 -- Length:   33 Offset: 8072 (0x1f88) Flags: NORMAL


*** End of File Encountered. Last Block Read: 0 ***

You can see the header information of the page and you can see that there are 3 items in the data section, but unfortunately, the actual content of the data can not be read directly. So we add the – D option:

-D Decode tuples using given comma separated list of types

% pg_filedump -D int,text $PGDATA/base/14236/164333

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: $PGDATA/base/14236/164333
* Options used: -D int,text
*******************************************************************

Block    0 ********************************************************
<Header> -----
Block Offset: 0x00000000         Offsets: Lower      36 (0x0024)
Block: Size 8192 Version    4           Upper    8072 (0x1f88)
LSN: logid      0 recoff 0x618242d8     Special  8192 (0x2000)
Items:    3                     Free Space: 8036
Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 ()
Length (including item array): 36

<Data> -----
Item   1 -- Length:   35 Offset: 8152 (0x1fd8) Flags: NORMAL
COPY: 1 Aerith
Item   2 -- Length:   34 Offset: 8112 (0x1fb0) Flags: NORMAL
COPY: 2 Cloud
Item   3 -- Length:   33 Offset: 8072 (0x1f88) Flags: NORMAL
COPY: 3 Tifa


*** End of File Encountered. Last Block Read: 0 ***

Now we can see the actual content, but if a tuple is deleted, will it be decoded here? To verify it, we try to execute an update statement:

UPDATE test SET name = 'Yuffie' WHERE id = 1;

Don’t forget checkpoint

Then we look at the results again:

% ./pg_filedump -D int,text $PGDATA/base/14236/164333

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: $PGDATA/base/14236/164333
* Options used: -D int,text
*******************************************************************

Block    0 ********************************************************
<Header> -----
Block Offset: 0x00000000         Offsets: Lower      40 (0x0028)
Block: Size 8192 Version    4           Upper    8032 (0x1f60)
LSN: logid      0 recoff 0x61824530     Special  8192 (0x2000)
Items:    4                     Free Space: 7992
Checksum: 0x0000 Prune XID: 0x0001b3a7 Flags: 0x0000 ()
Length (including item array): 40

<Data> -----
Item   1 -- Length:   35 Offset: 8152 (0x1fd8) Flags: NORMAL
COPY: 1 Aerith
Item   2 -- Length:   34 Offset: 8112 (0x1fb0) Flags: NORMAL
COPY: 2 Cloud
Item   3 -- Length:   33 Offset: 8072 (0x1f88) Flags: NORMAL
COPY: 3 Tifa
Item   4 -- Length:   35 Offset: 8032 (0x1f60) Flags: NORMAL
COPY: 1 Yuffie


*** End of File Encountered. Last Block Read: 0 ***

Hmmm, although the new tuple is updated here, the old tuple that should have been deleted also exists. So we added the -o option

-o Do not dump old values.

% ./pg_filedump -D int,text -o $PADATA/base/14236/164333

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: $PGDATA/base/14236/164333
* Options used: -D int,text -o
*******************************************************************

Block    0 ********************************************************
<Header> -----
Block Offset: 0x00000000         Offsets: Lower      40 (0x0028)
Block: Size 8192 Version    4           Upper    8032 (0x1f60)
LSN: logid      0 recoff 0x61824530     Special  8192 (0x2000)
Items:    4                     Free Space: 7992
Checksum: 0x0000 Prune XID: 0x0001b3a7 Flags: 0x0000 ()
Length (including item array): 40

<Data> -----
Item   1 -- Length:   35 Offset: 8152 (0x1fd8) Flags: NORMAL
tuple was removed by transaction #111527
Item   2 -- Length:   34 Offset: 8112 (0x1fb0) Flags: NORMAL
COPY: 2 Cloud
Item   3 -- Length:   33 Offset: 8072 (0x1f88) Flags: NORMAL
COPY: 3 Tifa
Item   4 -- Length:   35 Offset: 8032 (0x1f60) Flags: NORMAL
COPY: 1 Yuffie


*** End of File Encountered. Last Block Read: 0 ***

Great. Now the deleted tuple has been identified and the transaction number has been intimately prompted.

Let’s also take a look at the index page:

create index ON test using btree (id);

\d test
               Table "public.test"
Column | Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id     | integer |           |          |
name   | text    |           |          |
Indexes:
   "test_id_idx" btree (id)
   
select pg_relation_filepath('test_id_idx');
pg_relation_filepath
----------------------
base/14236/164337
(1 row)

checkpoint;
% ./pg_filedump -D int,text -o $PGDATA/base/14236/164337

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: $PGDATA/base/14236/164337
* Options used: -D int,text -o
*******************************************************************

Block    0 ********************************************************
<Header> -----
Block Offset: 0x00000000         Offsets: Lower      72 (0x0048)
Block: Size 8192 Version    4           Upper    8176 (0x1ff0)
LSN: logid      0 recoff 0x61845ee0     Special  8176 (0x1ff0)
Items:   12                     Free Space: 8104
Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 ()
Length (including item array): 72

BTree Meta Data: Magic (0x00053162)   Version (4)
                  Root:     Block (1) Level (0)
                  FastRoot: Block (1) Level (0)

<Special Section> -----
BTree Index Section:
Flags: 0x0008 (META)
Blocks: Previous (0) Next (0) Level (0) CycleId (0)


Block    1 ********************************************************
<Header> -----
Block Offset: 0x00002000         Offsets: Lower      36 (0x0024)
Block: Size 8192 Version    4           Upper    8128 (0x1fc0)
LSN: logid      0 recoff 0x61845e50     Special  8176 (0x1ff0)
Items:    3                     Free Space: 8092
Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 ()
Length (including item array): 36

<Data> -----
Item   1 -- Length:   16 Offset: 8160 (0x1fe0) Flags: NORMAL
tuple was removed by transaction #1048577
Item   2 -- Length:   16 Offset: 8144 (0x1fd0) Flags: NORMAL
tuple was removed by transaction #1048578
Item   3 -- Length:   16 Offset: 8128 (0x1fc0) Flags: NORMAL
tuple was removed by transaction #1048579

<Special Section> -----
BTree Index Section:
Flags: 0x0003 (LEAF|ROOT)
Blocks: Previous (0) Next (0) Level (0) CycleId (0)


*** End of File Encountered. Last Block Read: 1 ***

Cool, It shows the information in the index file in detail, and also shows the special section on the index page.

summary

Of course, the functions of pg_filedump are very rich. Here I just show the most basic usage. It can help us with a lot of things, such as:

In the learning stage, with the help of it, we can more easily grasp the content of the page, as well as the differences between different types of pages.

We can also use it to repair data, as mentioned in this slides — https://pgday.ru/presentation/146/59649714ee40a.pdf

More cool things, have a fun with it!