The MERGE statement is one of the long awaited features and it’s coming in the upcoming Major version of PostgreSQL 15. Although the PostgreSQL 15 release is quite a distance away, the MERGE statement patch has been committed to the development branch and it should become available as early as beta release around May, 2022.
The MERGE statement is mostly used for adding, updating or removing rows from the target tables, based on the information and condition from a source, in a single transaction and using a single statement.
The most notable use is to synchronize two tables by updating, deleting or inserting new rows when there are differences found based on some join conditions.
What problem does it solve?
The MERGE statement is used to insert, update or delete rows in a specified table based on one or more source tables. It uses a condition to determine whether to update, delete or insert into the target table.
Let’s consider a small scale setup where customers and their order information is maintained. In this setup, perhaps one would like to perform some analysis of how many orders have been placed by a particular customer or how long has it been since the last placed order etc.
One would have to perform separate INSERT and UPDATE operations to sync up this information from a daily orders table to a consolidated customer’s information tables.
Syntax
[WITH with_query [, ...]] MERGE INTO [schema.]table [ [ AS ] alias ] USING { [schema.]table | [sub-query] } [ [ AS ] alias ] ON join_condition when_clause [...]; when_clause: { WHEN MATCHED [ AND condition ] THEN { merge_update | DELETE | DO NOTHING } | WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } merge_update: UPDATE SET { column_name = { expression | DEFAULT } | ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] ) } [, ...] merge_insert: INSERT [( column_name [, ...] )] [ OVERRIDING { SYSTEM | USER } VALUE ] { VALUES ( { expression | DEFAULT } [, ...] ) | DEFAULT VALUES }
Example
CREATE TABLE daily_orders ( order_id INT, customer_id INT, order_center VARCHAR, order_time timestamp ); CREATE TABLE customer_history ( customer_id INT, last_order_id INT, order_center VARCHAR, order_count INT, last_order timestamp ); INSERT INTO daily_orders (order_id, customer_id, order_center, order_time) VALUES (101, 30030, 'WX', now()); INSERT INTO daily_orders (order_id, customer_id, order_center, order_time) VALUES (102, 20045, 'CX', now()); INSERT INTO daily_orders (order_id, customer_id, order_center, order_time) VALUES (103, 20090, 'JX', now()); INSERT INTO daily_orders (order_id, customer_id, order_center, order_time) VALUES (104, 20010, 'AX', now()); INSERT INTO customer_history (customer_id, last_order_id, order_center, order_count, last_order) VALUES (40100, 097, 'MK', 10, '2019-09-15 08:13:00'); INSERT INTO customer_history (customer_id, last_order_id, order_center, order_count, last_order) VALUES (20453, 098, 'NU', 17, '2020-07-12 01:05:00'); INSERT INTO customer_history (customer_id, last_order_id, order_center, order_count, last_order) VALUES (20090, 099, 'JX', 03, '2020-09-11 03:23:00'); INSERT INTO customer_history (customer_id, last_order_id, order_center, order_count, last_order) VALUES (21495, 100, 'DS', 11, '2022-05-21 04:12:00'); INSERT INTO customer_history (customer_id, last_order_id, order_center, order_count, last_order) VALUES (20010, 103, 'AX', 05, '2021-01-17 19:53:00'); select * from customer_history; postgres=# select * from customer_history; customer_id | last_order_id | order_center | order_count | last_order -------------+---------------+--------------+-------------+-------------------- 40100 | 97 | MK | 10 | 2019-09-15 08:13:00 20453 | 98 | NU | 17 | 2020-07-12 01:05:00 20090 | 99 | JX | 3 | 2020-09-11 03:23:00 21495 | 100 | DS | 11 | 2022-05-21 04:12:00 20010 | 103 | AX | 5 | 2021-01-17 19:53:00 (5 rows) MERGE INTO customer_history c USING daily_orders d ON (c.customer_id = d.customer_id) WHEN MATCHED THEN UPDATE SET -- Existing customer, update the order count and the timestamp of order. order_count = c.order_count + 1, last_order_id = d.order_id WHEN NOT MATCHED THEN -- New entry, record it. INSERT (customer_id, last_order_id, order_center, order_count, last_order) VALUES (customer_id, d.order_id, d.order_center, 1, d.order_time); select * from customer_history; postgres=# select * from customer_history; customer_id | last_order_id | order_center | order_count | last_order -------------+---------------+--------------+-------------+-------------------- 40100 | 97 | MK | 10 | 2019-09-15 08:13:00 20453 | 98 | NU | 17 | 2020-07-12 01:05:00 21495 | 100 | DS | 11 | 2022-05-21 04:12:00 30030 | 101 | WX | 1 | 2022-04-30 18:20:33 20045 | 102 | CX | 1 | 2022-04-30 18:20:33 20090 | 103 | JX | 4 | 2020-09-11 03:23:00 20010 | 104 | AX | 6 | 2021-01-17 19:53:00 (7 rows)
As one can see that two rows have been added to the target table customer_history with customer_id 30030 and 20045. While two other rows with customer_id 20090 and 20010 have been updated with new information.
The MERGE statement feature has been committed and is part of the development branch. Its documentation is available at https://www.postgresql.org/docs/devel/sql-merge.html
Asif Rehman is a Senior Software Engineer at HighGo Software. He Joined EnterpriseDB, an Enterprise PostgreSQL’s company in 2005 and started his career in open source development particularly in PostgreSQL. Asif’s contributions range from developing in-house features relating to oracle compatibility, to developing tools around PostgreSQL. He Joined HighGo Software in the month of Sep 2018.
Recent Comments