1. Overview
In the previous blog, we discussed how to correctly set up a streaming replication cluster between one master and one slave in Postgres version 12. In this blog, we will simulate a failover scenario on the master database, which causes the replica (or slave) database cluster to be promoted as the new master and continue operation. We will also simulate a failback scenario to reuse the old master cluster after the failover, with the help of pg_rewind.
Normally it is quite easy to fail back to the old master after the slave gets promoted, but if data is written to the old master after the slave's promotion, the two clusters will be out of sync, and we will have to use the pg_rewind tool to synchronize the two data directories and bring the old master to match the state of the new master. Please note that pg_rewind will remove transactions from the old master in order to match the new master, so certain precautions are needed when using this tool.
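One thing worth verifying before relying on pg_rewind is that the old master either was initialized with data checksums enabled or has wal_log_hints = on in postgresql.conf; pg_rewind refuses to run against a target cluster without one of the two. A quick check, using the same db-master directory and port 5432 as in this setup, could look like this:
$ psql -d clusterdb -U cary -c "SHOW wal_log_hints;" -p 5432
$ pg_controldata db-master | grep checksum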
Here’s a brief overview of the actions we are going to perform:
- simulate a failover by promoting the slave cluster, so it becomes the new master
- simulate data insertion to the master cluster, also referred to as the old master after the promotion
- shut down the old master cluster and set it up as a standby server
- run pg_rewind on the old master to synchronize transaction states with the new master
- bring up the old master as a standby server to synchronize with the new master
This blog assumes you already have streaming replication set up between one master and one slave from the previous blog. If you have not checked out the previous blog, titled “Streaming Replication Setup in PG12 – How to Do it Right”, it is recommended to give that a read first.
The procedures illustrated in this blog are based on Postgres version 12 built from source, running on Ubuntu 18.04.
2. Simulate a Failover Case
We will simply promote the slave database cluster to simulate a failover.
$ pg_ctl promote -D db-slave
2019-10-30 11:10:16.951 PDT [16643] LOG: received promote request
2019-10-30 11:10:16.951 PDT [16651] FATAL: terminating walreceiver process due to administrator command
2019-10-30 11:10:16.966 PDT [16643] LOG: redo done at 0/3003B60
2019-10-30 11:10:16.991 PDT [16643] LOG: selected new timeline ID: 2
2019-10-30 11:10:17.030 PDT [16643] LOG: archive recovery complete
2019-10-30 11:10:17.051 PDT [16642] LOG: database system is ready to accept connections
As seen above, after the slave gets promoted, it switches to a new timeline for future data operations. At this point the master and the slave are no longer streaming WAL files from each other, and we essentially have two independent database clusters running. To keep things clear, we will refer to them as the old master and the new master in the following sections.
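To double-check the promotion beyond the log output, you can confirm that the promoted cluster is no longer in recovery and that its control file now reports the new timeline; for example, with the db-slave directory and port 5433 used in this setup:
$ psql -d clusterdb -U cary -c "SELECT pg_is_in_recovery();" -p 5433
$ pg_controldata db-slave | grep -i timeline
pg_is_in_recovery() should return f on the new master, and the latest checkpoint's TimeLineID should now be 2.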
3. Insert Some Data to the Old Master
We will create a data out-of-sync case by inserting some more data into the old master cluster.
$ psql -d clusterdb -U cary -c "INSERT INTO test_table(x) SELECT y FROM generate_series(1, 100) a(y);" -p 5432
INSERT 0 100
Check that the old master and the new master are now clearly out of sync:
## new master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5433
count
-------
200
(1 row)
## old master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5432
count
-------
300
(1 row)
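If you prefer to compare the two clusters at the WAL level rather than by row counts, you can also look at the current write position on each side; since each cluster is now generating its own WAL, the reported LSNs will differ:
$ psql -d clusterdb -U cary -c "SELECT pg_current_wal_lsn();" -p 5433
$ psql -d clusterdb -U cary -c "SELECT pg_current_wal_lsn();" -p 5432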
4. Configure the Old Master as Standby Server to Sync with New Master
Now we would like to attempt a failback by making the old master a standby server that synchronizes with the new master.
Let’s shut down the old master cluster.
$ pg_ctl -D db-master stop
waiting for server to shut down.... done
server stopped
Let’s update postgresql.conf in the old master:
############# db-master/postgresql.conf #############
recovery_target_timeline = 'latest'
archive_cleanup_command = 'pg_archivecleanup /home/caryh/streaming-replication/archivedir %r'
restore_command = 'cp /home/caryh/streaming-replication/archivedir/%f %p'
primary_slot_name = 'main'
primary_conninfo = 'user=cary passfile=''/home/caryh/.pgpass'' host=127.0.0.1 port=5433 sslmode=prefer sslcompression=0 gssencmode=disable target_session_attrs=any'
The primary_conninfo setting tells the old master to stream WAL files from the new master located at 127.0.0.1:5433.
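Since primary_conninfo refers to a passfile, make sure /home/caryh/.pgpass contains an entry matching the new master and is readable only by the owner. A hypothetical entry (the password below is just a placeholder) would look like this:
############# /home/caryh/.pgpass #############
127.0.0.1:5433:replication:cary:secret_password
$ chmod 0600 /home/caryh/.pgpass
The database field is set to replication because the walreceiver makes a replication connection; a wildcard * also works.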
Also, do not forget to touch the standby.signal file to tell the cluster to run in standby mode:
touch db-master/standby.signal
In the old master, we specified primary_slot_name = 'main'. Let’s create the matching replication slot on the new master.
$ psql -d clusterdb -U cary -c "select * from pg_create_physical_replication_slot('main');" -p 5433
slot_name | lsn
-----------+-----
main |
(1 row)
$ psql -d clusterdb -U cary -c "select * from pg_replication_slots;" -p 5433 -x
-[ RECORD 1 ]-------+---------
slot_name | main
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | f
active_pid |
xmin |
catalog_xmin |
restart_lsn |
confirmed_flush_lsn |
The new master now has a matching replication slot called main, and it is not active at the moment. We are now ready to start the old master as a standby server.
5. Start the Old Master as Standby Server
Now, we are ready to start the old master as a standby:
$ pg_ctl -D db-master start
2019-10-30 11:30:04.071 PDT [1610] HINT: Future log output will go to log destination "syslog".
2019-10-30 11:30:04.075 PDT [1611] LOG: database system was shut down at 2019-10-30 11:29:13 PDT
2019-10-30 11:30:04.079 PDT [1611] LOG: restored log file "00000002.history" from archive
2019-10-30 11:30:04.082 PDT [1611] LOG: entering standby mode
2019-10-30 11:30:04.084 PDT [1611] LOG: restored log file "00000002.history" from archive
2019-10-30 11:30:04.095 PDT [1611] FATAL: requested timeline 2 is not a child of this server's history
2019-10-30 11:30:04.095 PDT [1611] DETAIL: Latest checkpoint is at 0/4000028 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 0/3003B98.
2019-10-30 11:30:04.096 PDT [1610] LOG: startup process (PID 1611) exited with exit code 1
2019-10-30 11:30:04.096 PDT [1610] LOG: aborting startup due to startup process failure
2019-10-30 11:30:04.098 PDT [1610] LOG: database system is shut down
As you can see above, the old master refuses to start because there is a timeline difference between the old master and the new master. This is caused by the additional data insertions that happened on the old master after the promotion event in section 3. This is where pg_rewind comes in handy, to synchronize the two clusters.
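Before we do that in the next section, note that pg_rewind also requires the target cluster (our old master) to have been shut down cleanly; that is already the case here because we stopped it with pg_ctl stop. If in doubt, you can verify the cluster state from the control file, which should report "shut down":
$ pg_controldata db-master | grep "Database cluster state"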
6. Use pg_rewind to Synchronize the Two Clusters
Now, let’s synchronize the two database clusters with pg_rewind.
$ pg_rewind --target-pgdata=db-master --source-server="port=5433 user=cary dbname=clusterdb" --progress
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/3003E58 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/2000060 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 53 MB (total source directory size is 78 MB)
54363/54363 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
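As a side note, if you would like to preview the changes before anything is written, pg_rewind also accepts a dry-run flag (-n / --dry-run) that does everything except modify the target directory:
$ pg_rewind --dry-run --target-pgdata=db-master --source-server="port=5433 user=cary dbname=clusterdb" --progress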
After pg_rewind finishes, we will have to edit the configuration of the old master once more, because the tool copies most of the configuration settings from the new master to the old master as part of the synchronization. Let’s examine both db-master/postgresql.auto.conf and db-master/postgresql.conf and make sure of the following settings again:
############# db-master/postgresql.conf #############
primary_slot_name = 'main'
recovery_target_timeline = 'latest'
port = 5432
############# db-master/postgresql.auto.conf #############
primary_conninfo = 'user=cary passfile=''/home/caryh/.pgpass'' host=127.0.0.1 port=5433 sslmode=disable sslcompression=0 gssencmode=disable target_session_attrs=any'
And also, don’t forget to touch the standby.signal file again:
touch db-master/standby.signal
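As a quick sanity check before restarting, you can confirm that the values above survived the rewind and that the signal file is in place (the grep patterns here are just one way to do it):
$ grep -E "^(port|primary_slot_name|recovery_target_timeline)" db-master/postgresql.conf
$ grep "^primary_conninfo" db-master/postgresql.auto.conf
$ ls db-master/standby.signal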
Now, we should be ready to start the old master again.
7. Start the Old Master Again as Standby Server
$ pg_ctl -D db-master start
2019-10-30 12:27:28.140 PDT [5095] LOG: restored log file "000000010000000000000002" from archive
2019-10-30 12:27:28.167 PDT [5095] LOG: redo starts at 0/2000028
2019-10-30 12:27:28.182 PDT [5095] LOG: consistent recovery state reached at 0/3027258
2019-10-30 12:27:28.183 PDT [5095] LOG: invalid record length at 0/3027258: wanted 24, got 0
2019-10-30 12:27:28.183 PDT [5095] LOG: redo done at 0/3027230
2019-10-30 12:27:28.183 PDT [5095] LOG: last completed transaction was at log time 2019-10-30 12:20:34.056723-07
2019-10-30 12:27:28.226 PDT [5094] LOG: database system is ready to accept connections
The old master can now start as a streaming replication standby of the new master, and we can observe that after using pg_rewind the additional data that was inserted into the old master in section 3 has been removed: the table has been rewound from 300 entries back to 200 entries to match the new master.
## new master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5433
count
-------
200
(1 row)
## old master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5432
count
-------
200
(1 row)
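To confirm that the old master is indeed streaming from the new master again, you can also query pg_stat_replication and the replication slot we created earlier on the new master; the walreceiver connection should show up there and the main slot should now be active:
$ psql -d clusterdb -U cary -c "SELECT * FROM pg_stat_replication;" -p 5433 -x
$ psql -d clusterdb -U cary -c "SELECT slot_name, active FROM pg_replication_slots;" -p 5433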
8. Summary
In this blog, we have simulated a failover case and observed the effect of promoting a standby slave server while more data insertions were happening on the original master server. We have demonstrated how to use the pg_rewind tool to synchronize the master and the slave after the slave promotion. Though this results in some data removal on the original master, in the end we are able to resolve the timeline conflict with pg_rewind and complete the database failover and failback scenario.

Cary is a Senior Software Developer in HighGo Software Canada with 8 years of industrial experience developing innovative software solutions in C/C++ in the field of smart grid & metering prior to joining HighGo. He holds a bachelor's degree in Electrical Engineering from the University of British Columbia (UBC) in Vancouver, earned in 2012, and has extensive hands-on experience in technologies such as: Advanced Networking, Network & Data security, Smart Metering Innovations, deployment management with Docker, Software Engineering Lifecycle, scalability, authentication, cryptography, PostgreSQL & non-relational database, web services, firewalls, embedded systems, RTOS, ARM, PKI, Cisco equipment, functional and Architecture Design.