I will start with providing some explanation for the blog title before i dive into the blog itself. The title of this blog is that “Pgpool II 4.1 taking the bull by its horn”, the “bull” in this case is Pgpool II performance and “taking it by its horn” means that 4.1 is performance release and it aims to add significant performance improvement to Pgpool II.
The question that often comes up is the performance difference when working with PostgreSQL directly vs using middleware like Pgpool II in the configuration. It is intuitive that when we add a extra hop for connection pooling or load balancing or HA, it is going to give performance degradation in some cases when comparing with accessing PostgreSQL directly. The performance difference should be very minimal otherwise it beats the purpose. I am careful in saying some cases because you are using the load balancing feature of Pgpool with multi node configuration with streaming replication, the writes are going to the master and reads are getting distributed over stand-by nodes to get read performance/scalability. The particular case that we are going to discuss in this blog is more detail is doing large binary inserts with Pgpool and comparing it with doing the same operation directly with PG.
Pgpool II 4.1 is primary a performance release with major focus on increasing the Pgpool II performance and also reducing the gap between direct PG vs using Pgpool ll in the middle. The blog will provide insight into some of the performance features that will be delivered as part of the Pgpool II 4.1 release with more emphasis on the performance feature that speeds up the large binary insert.
So lets take a quick look at some of the performance features that will be part of Pgpool II 4.1 release, Pgpool II 4.1 GA will be released towards the end of the year somewhere around October/November time, I don’t have exact dates yet. Although 4.1 is primarily a performance release but it has other interesting things like getting compatible with postgres 12 parser and updates to documentation. The documentation has been a weak area for Pgpool II, 4.1 comes with some enhancements for Pgpool II documentation but not enough IMO.
Here are the list of performance features that will be part of Pgpool II 4.1 release :
Shared relation cache
Shared relation cache is very interesting performance feature, it is showing tremendous performance results with pgbench benchmarking. Tatsuo ishii has written a nice blog about shared relation cache in which he is also showing the benchmarking results with and without shared relation cache.
In a nutshell shared relation cache replaces the relation cache (relcache) and provided shared cache which is shared across all Pgpool II child nodes. Pgpool II needs to access a bunch of system catalog tables for the first time when he executes a simple query, with shared relational cache the information will be shared across all child nodes. So when a new node gets added it doesn’t need to query the system tables again.
There are other performance benefits for shared relation cache, please see the above blog for more details on the feature and benchmarking results.
Enhance Pgpool II performance for extended query
There has been some work to improve the performance of extended query protocol in Pgpool II releases prior to 4.1, the Pgpool II team has done number of commits for Pgpool II 4.1 in order to improve the performance of extended query. Here is brief explanation of the commit done in 4.1 to speed-up the extended query performance in Pgpool.
The first commit is about enhancing performance of CopyData message handling, the commit message below is self explanatory.
“When COPY XX FROM STDIN gets executed (typical client is pg_dump), each copy row data is sent from Pgpool-II to frontend using CopyData message. Previously, one CopyData message was followed by a flush, which costed a lot. Instead, now flush is done in subsequent Command Complete, Notice message or Error message. A quick test reveals that this change brings x2.5 speed up.”
The second commit is about enhance performance while sending message to frontend, the commit message below is self explanatory.
“SimpleForwardToFrontend(), which is responsible for sending message to frontend, does write buffering only if it is either ‘D’ (DataRow) or ‘d’ (CopyData). Other message types were immediately written to socket. But actually this was not necessary. So if the messages are not critical (“Command Complete”, “Ready For query”, + * “Error response” and “Notice message” messages), just write to buffer. With this 10-17% performance enhance was observed.”
The third commit is about Eliminate select(2) system calls when they are not necessary, the commit message below is self explanatory.
“The idea is checking select(2) timeout parameter set in a static variable in pool_read() and pool_read2(). If it’s -1, it means no select timeout will be set in pool_check_fd(), which implies we can avoid to call pool_check_fd(). Also I moved pool_check_fd() and friends to pool_stream.c from a modularity point of view. This gives slight performance improvement according to Jesper Pedersen.”
Statement level load balancing
I like this feature the most in Pgpool II 4.1 release, the current load balancing feature of Pgpool is connection or session based load balancing. This means that when a client session connects to a node (assuming we have cluster of primary of secondary nodes using streaming replication), the read queries for that session are sent to that node. When another client session connects, it gets connected to another node and queries are sent to that node for load balancing. The write queries are always going to the primary.
What i have described above is session level load balancing, what we have added to Pgpool II 4.1 is statement level load balancing. This feature is described in detail in the following blog post. In a nutshell the statement level load balancing is essentially load balancing at the query level i.e. whenever a new query is processed by Pgpool as appose to session level. The load balancing node is determined when a new query is issued. The new parameter for this is “statement_level_load_balance”. If this is set to on, the feature is enabled (the parameter can be changed by reloading the Pgpool.conf).
Speeding up the large binary inserts
This is another interesting user scenario of performing large binary inserts where we were seeing a big gap between direct PG and using Pgpool II in the mix. We did a simple test to see the difference between direct PG and Pgpool when the user is inserting large number binary data in a table. The sample.sql file used in the test below performs a large insert in a binary column, the test is first executed with PG and Pgpool II without the performance patch in Pgpool II 4.1, the second execution is direct with PG without Pgpool II, this is to show the difference between direct PG and Pgpool II. The third execution is with Pgpool II with the performance patch in Pgpool II 4.1.
The results speak for themselves, without the 4.1 performance patch the difference between Pgpool II and direct PG is as huge as more then 1100 %. After applying the performance patch, direct PG is only fractionally different then Pgpool II. This is a significant performance boost in Pgpool II 4.1 for this particular and important scenario.
I won’t go into too much details of how this is done in Pgpool II 4.1, there are couple of changes that resulted in achieving this performance gain. The main one is getting Pgpool II to use another process for managing the logs and second change is to introduce multiple parsers in Pgpool. The standard parser is used when Pgpool II needs to know all the information, however when we are operating in a master/slave configuration we don’t need any extra information for write queries since they are always sent to the primary node.
— large binary insert test with Pgpool without performance patch
postgres=# \i sample.sql
INSERT 0 1
Time: 3098.821 ms (00:03.099)
— large binary insert test with direct PG
postgres=# \i sample.sql
INSERT 0 1
Time: 246.685 ms
— large binary insert test with Pgpool with performance patch
postgres=# \i sample.sql
INSERT 0 1
Time: 289.275 ms
In this blog, I have described some of the major performance features that will part of Pgpool II 4.1 release. There are other performance features in 4.1 that aren’t mentioned in this blog to not make this blog very big. We will cover the remaining performance features in another blog. As mentioned above, Pgpool II 4.1 is primarily a performance release and it aims to reduce the gap between direct PG and pgpool II and also increase Pgpool II performance in general. I am sure we will have more performance work to do in future releases of Pgpool II but I believe that last few major releases of Pgpool II have done much in the area of performance, stability, resilience and HA.
Pgpool II might not be a solution for all your database needs but it is surely making strides to do really well in the are of load balancing and HA. We had a talk on Pgpool II and its capabilities in Postgres conference in Beijing in July 2019, I stressed one point in that talk the I want to repeat here. We hear complain about Pgpool II stability and issues with watchdog (Pgpool II component for HA) etc and in most cases we find that people are using very old version of Pgpool II which had issues in these areas. The Pgpool II team has added lot of new features and refactoring in last few releases and now Pgpool II is very stable, resilient and gives high performance. So I urge everyone to use the latest stable version of Pgpool II. I am not saying that all the issues are resolved, I also feel that we are weak on documentation or we need to find a solution where we can give users light weight versions of Pgpool II for example if someone just wants load balancing and not the other features etc. However I do strongly believe that it has come a long way in last few years and people should be trying the latest version of Pgpool II.