Enterprise PostgreSQL Solutions

Backup Label in PostgreSQL

1. Overview When I was working on some backup and recovery related features for a project based on Postgres, I noticed that there is file called backup_label. By quickly google search, you can find some very nice blogs or books which discussed this topic, such as, The Internals of PostgreSQL, one of my favourite books.…
Read more

PostgreSQL 14 Continuous archiving and Point In Time Recovery

1. Introduction Recently I have been practicing the internals of PostgreSQL continuous archiving and the point in time recovery features that are required for my current development work. Today I would like to demonstrate these important features with the recently released PostgreSQL 14 on Ubuntu 18.04. 2. Write Ahead Log? Before going into the details…
Read more

A quick test for postgres_fdw batch insertion

1. Overview In my previous blog, I briefly walked through how the bulk/batch insertion was done for postgres_fdw in PG14. In this blog, I am going to run some basic tests to compare the performance for before and after the batch insertion was introduced in postgres_fdw, so that we can have a general idea about…
Read more

HG-PGSQL 1.8 and HG-PGSQL 2.4 Released (Sep 3, 2021)!

HighGo CA has released a routine update for its supported database systems on all supported platforms. HighGo Software Canada (a subsidiary of HighGo Software Inc) has released an update for HighGo Postgres Sever (HG-PGSQL 1.8) and HighGo Postgres Sever (HG-PGSQL 2.4).  The HG-PGSQL 1.8 update contains all the changes included in PostgreSQL community release version…
Read more

Implement Foreign Scan With FDW Interface API

1. Introduction Recently I have been tasked to familiarize myself with the Foreign Data Wrapper (FDW) interface API to build a new FDW capable of doing vertical / columnar sharding, meaning that the FDW is capable of collecting column information from multiple sources and combine them together as a result query. I will document and…
Read more

Tuning the Pgpool-II Load balancer

Load balancing increases the system throughput from the application standpoint by distributing the read load from application to multiple replicated database nodes. Some time ago, I wrote a blog discussing effectiveness of Pgpool-II load balancer in different scenarios. Pgpool-II provides one of the best user-friendly load balancers for PostgreSQL that does not require any modification…
Read more

How batch insertion was done in postgres_fdw

1. Overview postgres_fdw has been existing in PostgreSQL for many years, and it was one of the most popular interfaces used in many extensions. such as the PostgreSQL foreign data wrappers wiki page, and PostgreSQL Extension Network/PGXN. However, I had not touched this postgres_fdw in deep until I got a task about combing multiple columns…
Read more

Horizontal Scalability Options in PostgreSQL

So for one reason or another you have decided that you need horizontal scalability and you need to achieve it with PostgreSQL as your database. So what options do you have? The PostgreSQL database supports vertical scalability and can run on bigger and faster machines to increase the performance. But when it comes to horizontal…
Read more

Understanding the Security Around PostgreSQL

1. What Is Security? The word “Security” is a very broad concept and could refer to completely different procedures and methodology to achieve. Knowing what security means to your application is very important, so you could execute proper security practices and procedures to ensure the safety of your company’s assets. Data compromises could often lead…
Read more

The Amazing Buffer Tag in PostgreSQL

1. Overview I was working on the PostgreSQL storage related features recently, and I found PostgreSQL has designed an amazing storage addressing mechanism, i.e. Buffer Tag. In this blog, I want to share with you my understanding about the Buffer Tag and some potential usage of it. 2. Buffer Tag I was always curious to…
Read more

First contact with the pg_filedump

During my most recent expedition of going through PostgreSQL source code, I ran into some mysterious variables in the page structure of some indexes. I will explain later why i am calling these mysterious but first have a look at these variables… typedef struct HashPageOpaqueData{ … uint16 hasho_page_id; /* for identification of hash indexes */}…
Read more

Using GDB To Trace Into a Parallel Worker Spawned By Postmaster During a Large Query

1. Introduction I am working on a new PostgreSQL feature that redefines the way a tuple’s visibility status is determined. The feature is working very nicely until I start doing a large SELECT query, which triggers PostgreSQL to spawn multiple parallel workers to process the request. When this happens, the feature I am working on…
Read more

Some Interesting statistics about PG-14 contributions

I have spent a few days trolling through the features added to PostgreSQL 14 and in this blog, I want to share some statistics about PG-14 contributions, hopefully, you will find these statistics interesting. Please note that this data is based on my research on PG 14 contributions by going through the GIT LOG, commitfest…
Read more

Parallel execution of postgres_fdw scan’s in PG-14 (Important step forward for horizontal scaling)

Back in August 2019, I wrote about the Horizontal scaling / Sharding in PostgreSQL and where it is going https://www.highgo.ca/2019/08/08/horizontal-scalability-with-sharding-in-postgresql-where-it-is-going-part-3-of-3/. One of the key feature that i talked about in this blog is parallel foreign scan i.e. asynchronous append of FDW nodes. Finally the first patch of this feature was committed to PG-14, while they…
Read more

HG-PGSQL 1.7 and HG-PGSQL 2.3 Released

HighGo CA has released an update for its supported database systems on all supported platforms. HighGo Software Canada (a subsidiary of HighGo Software Inc) has released an update for HighGo Postgres Sever (HG-PGSQL 1.7) and HighGo Postgres Sever (HG-PGSQL 2.3).  The HG-PGSQL 1.7 update contains all the changes included in PostgreSQL community release version 12.7…
Read more

Getting started with pg_bulkload

  pg_bulkload is a high-speed data loading tool for PostgreSQL, compared with the copy command. The biggest advantage is speed. In the direct mode of pg_bulkload, it will skip shared buffer and WAL buffer, writing files directly. It also includes the data recovery function to recover in case of import failure. GitHub: https://github.com/ossc-db/pg_bulkload pg_bulkload mainly…
Read more

Hash Index Internals

Indexes are key database server features that enhance its performance to retrieve data faster than a complete table scan (unless the index is vastly bloated). They generally work by maintaining a smaller set of data in a more “searchable” structure as compared to the complete table data. This data organization comes at the cost of…
Read more

Calling Stored Procedure from JAVA – Part 2

Back in Nov 2021, I wrote about calling Stored procedure from Java and demonstrated how we can call a simple stored procedure from Java or a one that contains a IN parameter. I have done some more playing around with calling Stored procedure from Java so this blog is about calling a stored procedure with…
Read more

How does Postgres handle external parameters?

There are many advantages to using bind variables in SQL statements, especially in conditional statements. Using bind variables can save the trouble to parse and optimize the query in every execution. There are also disadvantages such as bind variable will make optimizer lost its prediction accuracy and cause performance decrease. In PostgreSQL are represented by…
Read more

How to run Hierarchical Queries with PostgreSQL

A hierarchical query is built upon parent-child relationship, the relationship exist in the same table or view. The relationship dictates that each child can have one parent while a parent can have many children. Hierarchical query is a SQL query that handles data of hierarchical model i.e. an organisation structure where every employee has one…
Read more