1. Overview
pg_basebackup is a powerful tool for creating physical backups of PostgreSQL database clusters. Unlike pg_dump, which generates logical backups, pg_basebackup captures the entire cluster state. These backups are crucial for point-in-time recovery or for setting up a standby server.
2. Backup Compression
Efforts to enhance backup performance have led to innovations like parallel processing and the integration of various compression algorithms. Starting with PostgreSQL 15, the --compress option was extended so that users can specify where compression takes place. For example:
pg_basebackup -h localhost -D bak1 -Ft --compress=server-gzip:9
Here, the PostgreSQL server handles compression before transferring data to the client. This setup is ideal when network bandwidth is a limiting factor, but the server has ample processing capacity.
Alternatively, you can shift the compression load to the pg_basebackup client using:
pg_basebackup -h localhost -D bak2 -Ft --compress=client-gzip:9
This approach minimizes CPU consumption on the server side but demands more network bandwidth, since the data crosses the network uncompressed.
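The trade-off between the two modes can be captured in a small wrapper. The following is only a minimal sketch and not part of pg_basebackup: the function name `compress_spec` and its `network` / `server_cpu` arguments are hypothetical, and all it does is build the value passed to `--compress`.

```shell
#!/bin/sh
# Hypothetical helper (illustration only, not part of PostgreSQL):
# choose where gzip compression runs based on the scarce resource.
#   $1 = bottleneck: "network"    -> compress on the server
#                    "server_cpu" -> compress on the client
#   $2 = gzip level, 1-9 (defaults to 9, as in the commands above)
compress_spec() {
    bottleneck=$1
    level=${2:-9}
    case "$bottleneck" in
        network)    echo "server-gzip:$level" ;;  # less data on the wire
        server_cpu) echo "client-gzip:$level" ;;  # spare the server's CPU
        *)
            echo "usage: compress_spec network|server_cpu [level]" >&2
            return 1
            ;;
    esac
}
```

With this in place, a backup over a slow link might be started as `pg_basebackup -h localhost -D bak1 -Ft --compress="$(compress_spec network)"`.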
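To try this out, I ran a speed test by creating a table and inserting 100 million records. The wall-clock times showed no significant difference, because I ran the server and client on the same machine and the network was therefore not a bottleneck. The client's user CPU time, however, shows where the work happened: it is negligible with server-side compression and nearly equal to the total run time with client-side compression.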
Commands Used:
psql -d postgres
postgres=# CREATE TABLE t(key int, value text);
CREATE TABLE
postgres=# insert into t values(generate_series(1, 100000000), 'hello world');
INSERT 0 100000000
### Compression on PG Server Side:
time pg_basebackup -h localhost -D bak1 -Ft --compress=server-gzip:9
real 8m44.789s
user 0m0.069s
sys 0m0.481s
### Compression on pg_basebackup Side:
time pg_basebackup -h localhost -D bak2 -Ft --compress=client-gzip:9
real 8m40.868s
user 8m40.040s
sys 0m0.672s
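To compare the two runs numerically, the `real` values reported by `time` can be converted to seconds. This is a small sketch using awk; the function name `to_seconds` is made up for illustration, and it assumes the `<minutes>m<seconds>s` format shown in the results above.

```shell
#!/bin/sh
# Convert a `time` value such as "8m44.789s" to seconds.
# Assumes the "<minutes>m<seconds>s" format; hours are not handled.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

server=$(to_seconds 8m44.789s)   # wall-clock time, server-side compression
client=$(to_seconds 8m40.868s)   # wall-clock time, client-side compression

# Relative difference between the two wall-clock times, in percent.
awk -v s="$server" -v c="$client" \
    'BEGIN { printf "server=%ss client=%ss diff=%.2f%%\n", s, c, (s - c) / s * 100 }'
```

For the measurements above the two wall-clock times differ by less than 1%, which supports the observation that placement of the compression step made no real difference on a single machine.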
3. Summary
The flexibility of pg_basebackup, with its array of parameters and compression options, allows you to fine-tune backups to meet your specific needs. Thorough testing and experimentation will help identify the optimal configuration for your daily backup operations.

A software developer specializing in C/C++ programming, with experience in hardware, firmware, software, database, network, and system architecture. Currently working at HighGo Software Inc. as a senior PostgreSQL architect.