In the previous blog, we discussed how to correctly set up a streaming replication cluster between one master and one slave on PostgreSQL 12. In this blog, we will simulate a failover scenario on the master database, which causes the replica (or slave) database cluster to be promoted as the new master and continue operation. We will also simulate a failback scenario to reuse the old master cluster after the failover, with the help of pg_rewind.
Normally it is quite easy to fail back to the old master after the slave gets promoted, but if data is written to the old master after the slave promotion, the two clusters will be out of sync, and we will have to use the pg_rewind tool to synchronize the two data directories and bring the old master to match the state of the new master. Please note that pg_rewind will remove transactions from the old master in order to match up with the new one, so certain precautions are needed when using this tool.
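One precaution worth checking up front: pg_rewind can only run against a target cluster that was either initialized with data checksums (initdb --data-checksums) or started with wal_log_hints = on. As a quick sanity check (using the same connection settings as the rest of this blog), at least one of the following should report "on":

$ psql -d clusterdb -U cary -p 5432 -c "SHOW data_checksums;"
$ psql -d clusterdb -U cary -p 5432 -c "SHOW wal_log_hints;"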
Here’s a brief overview of the actions we are going to perform:
- simulate failover by promoting the slave cluster, so it becomes a new master
- simulate data insertion to the master cluster, also referred to as the old master after promotion
- shut down the old master cluster and set it up as a standby server
- run pg_rewind on the old master to synchronize transaction states with the new master
- bring up the old master as a standby server to synchronize with the new master
This blog assumes you already have streaming replication set up between one master and one slave from the previous blog. If you have not checked out the previous blog titled “Streaming Replication Setup in PG12 – How to Do it Right”, it is recommended to give that a read first.
The procedures illustrated in this blog are based on PostgreSQL 12 built from source, running on Ubuntu 18.04.
2. Simulate a Failover Case
We will simply promote the slave database cluster to simulate a failover.
$ pg_ctl promote -D db-slave
2019-10-30 11:10:16.951 PDT  LOG: received promote request
2019-10-30 11:10:16.951 PDT  FATAL: terminating walreceiver process due to administrator command
2019-10-30 11:10:16.966 PDT  LOG: redo done at 0/3003B60
2019-10-30 11:10:16.991 PDT  LOG: selected new timeline ID: 2
2019-10-30 11:10:17.030 PDT  LOG: archive recovery complete
2019-10-30 11:10:17.051 PDT  LOG: database system is ready to accept connections
As seen above, after the slave gets promoted, it switches to a new timeline for future data operations. At this point the master and the slave are no longer streaming WAL files from each other, and we essentially have two independent database clusters running. We will call them old master and new master in the following sections so it is clear.
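If you would like to confirm the timeline split yourself, one quick check (assuming the same ports used throughout this blog) is to compare the timeline IDs reported by pg_control_checkpoint() on both clusters. The promoted cluster should report timeline 2, while the old master is still on timeline 1:

$ psql -d clusterdb -U cary -p 5433 -c "SELECT timeline_id FROM pg_control_checkpoint();"
$ psql -d clusterdb -U cary -p 5432 -c "SELECT timeline_id FROM pg_control_checkpoint();"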
3. Insert Some Data to the Old Master
We would like to create an out-of-sync case by inserting some more data into the old master cluster.
$ psql -d clusterdb -U cary -c "INSERT INTO test_table(x) SELECT y FROM generate_series(1, 100) a(y);" -p 5432 INSERT 0 100
Check that the old master and the new master are now clearly out of sync:
## new master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5433
 count
-------
   200
(1 row)

## old master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5432
 count
-------
   300
(1 row)
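Besides comparing row counts, you can also observe the divergence at the WAL level. Since both clusters are currently acting as independent primaries, each reports its own write position (a quick sketch, assuming the same ports as before):

$ psql -d clusterdb -U cary -p 5432 -c "SELECT pg_current_wal_lsn();"
$ psql -d clusterdb -U cary -p 5433 -c "SELECT pg_current_wal_lsn();"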
4. Configure the Old Master as Standby Server to Sync with New Master
Now we would like to attempt a failback by making the old master a standby server that synchronizes with the new master.
Let’s shut down the old master cluster.
$ pg_ctl -D db-master stop
waiting for server to shut down.... done
server stopped
Let’s update postgresql.conf in the old master cluster:
############# db-master/postgresql.conf #############
recovery_target_timeline = 'latest'
archive_cleanup_command = 'pg_archivecleanup /home/caryh/streaming-replication/archivedir %r'
restore_command = 'cp /home/caryh/streaming-replication/archivedir/%f %p'
primary_slot_name = 'main'
primary_conninfo = 'user=cary passfile=''/home/caryh/.pgpass'' host=127.0.0.1 port=5433 sslmode=prefer sslcompression=0 gssencmode=disable target_session_attrs=any'
primary_conninfo tells the old master to stream WAL files from the new master located at 127.0.0.1:5433.
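As a side note, the passfile referenced in primary_conninfo is assumed to contain an entry for the new master in libpq's standard hostname:port:database:username:password format, with file permissions set to 0600. The password below is just a placeholder:

127.0.0.1:5433:*:cary:secret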
Also, do not forget to touch the standby.signal file to tell the cluster to run in standby mode:
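$ touch db-master/standby.signal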
We specified primary_slot_name in the old master's configuration so that it connects to a replication slot named main. Let’s create the matching replication slot on the new master:
$ psql -d clusterdb -U cary -c "select * from pg_create_physical_replication_slot('main');" -p 5433 slot_name | lsn -----------+----- main | (1 row) $ psql -d clusterdb -U cary -c "select * from pg_replication_slots;" -p 5433 -x -[ RECORD 1 ]-------+--------- slot_name | main plugin | slot_type | physical datoid | database | temporary | f active | f active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn |
The new master now has a matching replication slot called main, and it is not active at the moment.
5. Start the Old Master as Standby Server
Now, we are ready to start the old master as a standby:
$ pg_ctl -D db-master start
2019-10-30 11:30:04.071 PDT  HINT: Future log output will go to log destination "syslog".
2019-10-30 11:30:04.075 PDT  LOG: database system was shut down at 2019-10-30 11:29:13 PDT
2019-10-30 11:30:04.079 PDT  LOG: restored log file "00000002.history" from archive
2019-10-30 11:30:04.082 PDT  LOG: entering standby mode
2019-10-30 11:30:04.084 PDT  LOG: restored log file "00000002.history" from archive
2019-10-30 11:30:04.095 PDT  FATAL: requested timeline 2 is not a child of this server's history
2019-10-30 11:30:04.095 PDT  DETAIL: Latest checkpoint is at 0/4000028 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 0/3003B98.
2019-10-30 11:30:04.096 PDT  LOG: startup process (PID 1611) exited with exit code 1
2019-10-30 11:30:04.096 PDT  LOG: aborting startup due to startup process failure
2019-10-30 11:30:04.098 PDT  LOG: database system is shut down
As you can see above, the old master refuses to start because there is a timeline difference between the old master and the new master. This is caused by the additional data inserted into the old master in step 3, after the promotion event.
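For reference, the fork point can also be seen in the timeline history file that the new master wrote at promotion time and archived to the WAL archive from the previous blog. Its single entry records the parent timeline, the LSN where the fork happened, and the reason, so its contents should look something like this:

$ cat /home/caryh/streaming-replication/archivedir/00000002.history
1	0/3003B98	no recovery target specified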
This is where pg_rewind comes in handy: it can synchronize the two clusters.
6. Use pg_rewind to Synchronize the Two Clusters
Now, let’s synchronize the two database clusters with pg_rewind.
$ pg_rewind --target-pgdata=db-master --source-server="port=5433 user=cary dbname=clusterdb" --progress
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/3003E58 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/2000060 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 53 MB (total source directory size is 78 MB)
54363/54363 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
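As a side note, if you would like to preview what pg_rewind would do without actually modifying the target data directory, it also accepts a --dry-run (or -n) flag, which performs everything up to the point of making changes:

$ pg_rewind --dry-run --target-pgdata=db-master --source-server="port=5433 user=cary dbname=clusterdb" --progress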
After pg_rewind has finished, we will have to edit the configuration of the old master once more, because the tool copies most of the configuration settings from the new master to the old master as part of the synchronization. Let’s examine both db-master/postgresql.conf and db-master/postgresql.auto.conf and make sure of the following settings again.
############# db-master/postgresql.conf #############
primary_slot_name = 'main'
recovery_target_timeline = 'latest'
port = 5432

############# db-master/postgresql.auto.conf #############
primary_conninfo = 'user=cary passfile=''/home/caryh/.pgpass'' host=127.0.0.1 port=5433 sslmode=disable sslcompression=0 gssencmode=disable target_session_attrs=any'
Also, since pg_rewind synchronizes the old master’s data directory with that of the new master (a running primary), don’t forget to create the standby.signal file once more:
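$ touch db-master/standby.signal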
Now, we should be ready to start the old master again.
7. Start the Old Master Again as Standby Server
$ pg_ctl -D db-master start
2019-10-30 12:27:28.140 PDT  LOG: restored log file "000000010000000000000002" from archive
2019-10-30 12:27:28.167 PDT  LOG: redo starts at 0/2000028
2019-10-30 12:27:28.182 PDT  LOG: consistent recovery state reached at 0/3027258
2019-10-30 12:27:28.183 PDT  LOG: invalid record length at 0/3027258: wanted 24, got 0
2019-10-30 12:27:28.183 PDT  LOG: redo done at 0/3027230
2019-10-30 12:27:28.183 PDT  LOG: last completed transaction was at log time 2019-10-30 12:20:34.056723-07
2019-10-30 12:27:28.226 PDT  LOG: database system is ready to accept connections
The old master can now start streaming replication from the new master, and we can observe that after using pg_rewind the additional data inserted into the old master in step 3 is now removed; the table has been rewound from 300 entries to 200 entries to match up with the new master:
## new master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5433
 count
-------
   200
(1 row)

## old master ##
$ psql -d clusterdb -U cary -c "SELECT count(*) from test_table" -p 5432
 count
-------
   200
(1 row)
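To further confirm that the old master is actually streaming from the new master again, rather than just happening to hold the same row count, you can query pg_stat_replication on the new master; the walreceiver connection from the old master should appear there (a quick check, using the same ports as before):

$ psql -d clusterdb -U cary -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;" -p 5433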
In this blog, we simulated a failover case and observed the effect of promoting a standby slave server while more data insertions were happening on the original master server. We demonstrated how to use the pg_rewind tool to synchronize both the master and the slave after the slave promotion. Though it results in some data deletion on the original master, in the end we were able to resolve the timeline conflict with pg_rewind and complete the database failover and failback scenario.
Cary is a Senior Software Developer in HighGo Software Canada with 8 years of industrial experience developing innovative software solutions in C/C++ in the field of smart grid & metering prior to joining HighGo. He holds a bachelor's degree in Electrical Engineering from the University of British Columbia (UBC) in Vancouver, earned in 2012, and has extensive hands-on experience in technologies such as: Advanced Networking, Network & Data security, Smart Metering Innovations, deployment management with Docker, Software Engineering Lifecycle, scalability, authentication, cryptography, PostgreSQL & non-relational database, web services, firewalls, embedded systems, RTOS, ARM, PKI, Cisco equipment, functional and Architecture Design.