Enterprise PostgreSQL Solutions

The smgr interface in PostgreSQL

1. Overview In my previous blog posts, I explained how the amazing buffer tag works in PostgreSQL and how to set up a shared storage using the Lustre network file system. In this blog, I will explain the storage interface provided by PostgreSQL and an idea to experience the storage interface, namely, smgr, the storage…
Read more

Serverless Database Research and Problems Everyone Wants to Solve

1.0 Introduction There are several solutions out there that can solve distributed database issues (such as Citus) and solutions out there that can solve high availability and database clustering issues (such as patroni). Yes, they do solve distributed and database cluster issues but at the same time make database maintenance and debugging more complicated. Very…
Read more

Large Object in PostgreSQL

1. Overview Sometimes, you may need to manage large objects, i.e. CLOB, BLOB and BFILE, using PostgreSQL. There are two ways to deal with large objects in PostgreSQL: one is to use existing data type, i.e. bytea for binary large object, and text for character-based large object; another is to use pg_largeobject; This blog will…
Read more

Global Unique Index ATTACH Support and Its Potential Deficiency

1.0 Introduction In my previous post here, I introduced the global unique index feature that my colleague, David, and I work together and explained how global unique index guarantees cross-partition uniqueness during CREATE. In this blog, I will explain how we implement cross-partition uniqueness with ATTACH and a potential deficiency in this approach. 2.0 Global…
Read more

Global Index, benchmark with pgbench

1. Overview Followed my previous blog, Global Index, a different approach, we posted our initial Global Unique Index POC to Postgres community for open discussion about this approach. Now, this blog is trying to explain how the benchmark was performed using pgbench on this initial Global Unique Index POC. 2. Build global index Before running…
Read more

Cross-partition Uniqueness Guarantee with Global Unique Index

1.0 Introduction My colleague, David, recently published a post “Global Index, a different approach” that describes the work that we are doing to implement global unique index in an approach that does not change current PostgreSQL’s partitioning framework, while allowing cross-partition uniqueness constraint. To implement this, we must first know how PostgreSQL currently ensures uniqueness…
Read more

Global Index, a different approach

1. Overview A few years ago, there was a proposal about adding the global index support to PostgreSQL for partitioned table. Following that proposal, there were many discussions and also an initial version POC to demonstrate the possibility, the technical challenges and the potential benefits, etc. However, the global index feature is still not available…
Read more

How Unique Index Works in PG

1.0 Introduction Recently I have been involved in creating a solution that guarantees cross-partition uniqueness within a partitioned table consisting multiple child tables. Some people refer to this feature as a global index and there has been some discussion about it in the email thread here. Though the idea is good, the approach sparks a…
Read more

How to Set Up NFS and Run PG on it

Introduction Network File System (NFS) is a distributed file system protocol that allows a user on a client node to access files residing on a server node over network much like local storage is accessed. Today in this blog, I will share how to set up both NFSv4 server and client on CentOS7 and run…
Read more

How to setup Lustre file system and run Postgres on it

1. Overview Similar to PostgreSQL, Lustre file system is also an open source project which started about 20 years ago. According to Wikipedia, Lustre file system is a type of parallel distributed file system, and is designed for large-scale cluster computing with native Remote Direct Memory Access (RDMA) support. Lustre file systems are scalable and…
Read more

Maintain a custom PG to PG Connection With libpq’s COPY protocol

1. Introduction Recently in my development work, a custom connection is required to be maintained between a PG backend on primary and another PG backends on standby nodes to communicate custom data in addition to the existing walsender/walreceiver connection that streams WAL data. Of course, I could just create a new standalone backend and maintain…
Read more

One idea of accessing Primary’s buffer blocks

1. Overview PostgreSQL is a great open source project for many reasons. One of the reasons I like it is because of the design of buffer blocks addressing. In this blog, I am going to explain a possible way to share a Primary’s buffer blocks with a Standby. If you want to know more about…
Read more

Understand PG’s MVCC Visibility Basic Check Rules

1. Introduction PostgreSQL’s MultiVersion Concurrency Control (MVCC) is an “advanced technique for improving database performance in a multi-user environment” according to Vadim Mikheev. This technique requires multiple “versions” of the same data tuple exist in the system governed by snapshots taken during different time periods. In other words, under such technique, it is PG’s responsibility…
Read more

How to do backup and restore for PostgreSQL using pg_rman

1. Overview PostgreSQL is a very popular open-source relational database management system, and it is widely used in many different production environments. To maintain the production environment always functioning, you need to a lot tools, and one of the tools must to have been backup and restore. This blog is going to introduce one backup…
Read more

Atomic Commit with 2 Phase Commit in FDW Distributed Setup

1. Introduction PostgreSQL’s 2 phase commit (2PC) feature allows a database to store the details of a transaction on disk without committing it. This is done by issuing PREPARE TRANSACTION [name] command at the end of a transaction block. When the user is ready to commit, he/she can issue COMMIT PREPARED [name] where [name] should…
Read more

Upcoming feature in PostgreSQL 15 – MERGE statement

The MERGE statement is one of the long awaited features and it’s coming in the upcoming Major version of PostgreSQL 15. Although the PostgreSQL 15 release is quite a distance away, the MERGE statement patch has been committed to the development branch and it should become available as early as beta release around May, 2022.…
Read more

global deadlock in a distributed database cluster

1. Overview Nowadays, supporting distributed transactions is a typical requirement for many use cases, however, the global deadlock detection is one of the key challenging issues if you plan to use PostgreSQL to setup a distributed database solution. There are many discussions about global deadlock, but this blog will provide you a step-by-step procedure about…
Read more

parallel commit in postgres fdw

1. Overview PostgreSQL is one of the greatest open source databases, not only because of the extensibility and SQL compliance but also the evolution of new features. For example, in postgres_fdw, there is a new feature parallel commit has been added into the main branch and will be released in PG15. This blog is for…
Read more

Distributed Database With PostgreSQL – Atomic Commit Problems

1. Introduction If you are into distributed database research, especially one that is setup using Foreign Data Wrapper (FDW) + partitioned foreign tables, you probably have heard that there are many potential issues associated with this setup. Atomic commit, atomic visibility and global deadlock detection are one of the most popular issues that one can…
Read more

A snippet to acquire a Lightweight lock

1. Overview Recently, I was working on an internal issue related with buffer manager in PostgreSQL, and I saw a typical use of the Lightweight lock in buffer manager like below. Basically, when the buffer manger needs to access a buffer block using buffer tag, it will have to acquire a lightweight lock in either…
Read more