The mapping of oid and relfilenode in PG

Enterprise PostgreSQL Solutions

Comments are off

The mapping of oid and relfilenode in PG

A table in PostgreSQL has a relfilenode value, which specifies the file name of the table on disk (except foreign table and partition table). In general, this value can be found in the relfilenode field of the pg_class table, but there are some specific tables whose query result in the relfilenode field is 0. This blog will explore the kernel processing of these special tables relfilenode.

Relfilenode of ordinary table

When a normal table is created in PostgreSQL, the relfilenode value of the table is stored in pg_class system table. As it can be seen in this example that when a table is created, it’s OID and relfilenode value are both 16808. You can also find a file on the disk with the same name as the Relfilenode node value, in-fact the data inserted for the table is actually stored in the same file.

postgres=# create table t2(i int);
postgres=# select oid,relname,relfilenode from pg_class where relname = 't2';
  oid  | relname | relfilenode 
 16808 | t2      |       16808
(1 row)

postgres=# \q
movead@movead-PC:/h2/pgpgpg/bin$ ll ../data/base/12835/16808 
-rw-------+ 1 movead movead 0 12月 31 17:11 ../data/base/12835/16808

After we perform operations such as truncate, vacuum full, etc. on a table, the data in this table will be rewritten and the value of relfilenode for this table will be changed. The following test shows that after truncate, the relfilenode of the t2 table has changed from 16808 to 16811.

postgres=# truncate t2;
postgres=# select oid,relname,relfilenode from pg_class where relname = 't2';
  oid  | relname | relfilenode 
 16808 | t2      |       16811
(1 row)

postgres=# checkpoint;
postgres=# \q
movead@movead-PC:/h2/pgpgpg/bin$ ll ../data/base/12835/16808
ls: 无法访问'../data/base/12835/16808': 没有那个文件或目录
movead@movead-PC:/h2/pgpgpg/bin$ ll ../data/base/12835/16811
-rw-------+ 1 movead movead 0 12月 31 17:16 ../data/base/12835/16811

Relfilenode of Nail tables

postgres=# select oid, relname, relfilenode,reltablespace
from pg_class
where relfilenode = 0 and relkind = 'r'
order by reltablespace;
 oid  |        relname        | relfilenode | reltablespace 
 1247 | pg_type               |           0 |             0
 1255 | pg_proc               |           0 |             0
 1249 | pg_attribute          |           0 |             0
 1259 | pg_class              |           0 |             0
 3592 | pg_shseclabel         |           0 |          1664
 1262 | pg_database           |           0 |          1664
 2964 | pg_db_role_setting    |           0 |          1664
 1213 | pg_tablespace         |           0 |          1664
 1261 | pg_auth_members       |           0 |          1664
 1214 | pg_shdepend           |           0 |          1664
 2396 | pg_shdescription      |           0 |          1664
 1260 | pg_authid             |           0 |          1664
 6000 | pg_replication_origin |           0 |          1664
 6100 | pg_subscription       |           0 |          1664
(14 rows)


It can be seen from the above query that the relfilenode of these tables queried from the pg_class catalog is 0. Among them, pg_type, pg_proc, pg_attribute, and pg_class are non-shared tables, and they are called Nail tables in the kernel. The remaining tables are shared tables in the pg_global tablespace.

The purpose of relfilenode field in the pg_class table is to tell the user about the file name of a given table stored on the disk. For example, when we query the t2 table, the system must get the relfilenode from the pg_class catalog, then find the file on the disk, open it and scan it. But if we want to query the file name of the pg_class table on disk, where should we find its relfilenode? A set of functional interfaces is provided in PostgreSQL to convert oid and relfilenode.

postgres=# select pg_relation_filenode(1259);
(1 row)

postgres=# select pg_filenode_relation(0,16475);
(1 row)

postgres=# select pg_filenode_relation(0,16475)::oid;
(1 row)


By call pg_relation_filenode(), oid can be converted to relfilenode, and with pg_filenode_relation(), relfilenode can be converted to oid.

Since the corresponding relationship between oid and relfilenode for shared and nail tables are not stored in the pg_class table, how does PostgreSQL save this mapping relationship?


Nail table Relfilenode storage mechanism

After research, it is found that there are files in the data directory, as shown below.

movead@movead-PC:/h2/pgpgpg/data/base/12835$ ll 
-rw-------+ 1 movead movead 512 1231 15:10

movead@movead-PC:/h2/pgpgpg/data/global$ ll 
-rw-------+ 1 movead movead 512 1231 15:10

The mapping relationship between oid and relfilenode for shared tables is stored in file in the global directory, and the mapping relationship between oid and relfilenode for nail tables in the database with OID 12835 is stored in the 12835 directory.

The struct of the file is look like as blow:

typedef struct RelMapping
    Oid         mapoid;         /* OID of a catalog */
    Oid         mapfilenode;    /* its filenode number */
} RelMapping;

typedef struct RelMapFile
    int32       magic;          /* always RELMAPPER_FILEMAGIC */
    int32       num_mappings;   /* number of valid RelMapping entries */
    RelMapping  mappings[MAX_MAPPINGS];
    pg_crc32c   crc;            /* CRC of all above */
    int32       pad;            /* to make the struct size be 512 exactly */
} RelMapFile;


This blog mainly explains the two different manifestations of oid and relfilenode mapping in PostgreSQL. Just remember that using pg_relation_filenode() will always get the correct result, and query from the pg_class system table may get the wrong result.