Introduction
Following my previous blog here, which outlined the procedure to deploy a metric-server cron job to monitor already deployed PostgreSQL primary and standby nodes on Kubernetes, this blog shows how to deploy a Pgpool node that load-balances write requests to the primary node and read requests to one or more standby nodes.
In this blog we will focus on deploying Pgpool itself, not the Postgres primary and standby nodes, which are assumed to be already deployed. We will cover those in the next blog.
The following is the architecture of the deployment:
We will write a deployment manifest and a ConfigMap YAML file to deploy Pgpool, and define a Kubernetes service called pgpool-service of type NodePort so that a psql client can access Pgpool from outside the cluster through the exposed node port.
Prerequisites
You will need to install minikube, which acts as a local Kubernetes cluster to run the deployment, and kubectl to actually apply the manifests.
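With both installed, you can bring up the local cluster and confirm that kubectl can reach it:
# start a local single-node cluster and verify connectivity
$ minikube start
$ kubectl get nodes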
Pgpool Deployment Manifest
pgpool-deploy.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgpool
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgpool
  template:
    metadata:
      labels:
        app: pgpool
    spec:
      containers:
      - name: pgpool
        image: bitnami/pgpool:v4.4.1
        imagePullPolicy: Always
        #image: pgpool/pgpool
        env:
        - name: POSTGRES_USERNAME
          valueFrom:
            secretKeyRef:
              name: mypostgres-postgres-secret
              key: username
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mypostgres-postgres-secret
              key: password
        - name: PGPOOL_PASSWORD_ENCRYPTION_METHOD
          value: "scram-sha-256"
        - name: PGPOOL_ENABLE_POOL_PASSWD
          value: "true"
        - name: PGPOOL_SKIP_PASSWORD_ENCRYPTION
          value: "false"
        # The following settings are not required when not using the Pgpool-II PCP command.
        # To enable the following settings, you must define a secret that stores the PCP user's
        # username and password.
        - name: PGPOOL_PCP_USER
          valueFrom:
            secretKeyRef:
              name: pgpool-pcp-secret
              key: username
        - name: PGPOOL_PCP_PASSWORD
          valueFrom:
            secretKeyRef:
              name: pgpool-pcp-secret
              key: password
        volumeMounts:
        - name: pgpool-config
          mountPath: /config
      volumes:
      - name: pgpool-config
        configMap:
          name: pgpool-config
---
apiVersion: v1
kind: Service
metadata:
  name: pgpool-service
spec:
  type: NodePort
  selector:
    app: pgpool
  ports:
  - name: pgpool-port
    port: 9999
    targetPort: 9999
    nodePort: 30000
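Before applying anything, you can optionally run a client-side dry run as a quick sanity check on the manifest syntax:
# validate the manifest without sending it to the cluster
$ kubectl apply --dry-run=client -f pgpool-deploy.yml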
In pgpool-deploy.yml, we define the PostgreSQL and Pgpool PCP credentials as secretKeyRef references so their values are not visible in the .yml file. The referenced secrets must be created prior to deployment:
$ kubectl create secret generic mypostgres-postgres-secret --from-literal=username=cloud_admin --from-literal=password=cloud_admin
$ kubectl create secret generic pgpool-pcp-secret --from-literal=username=postgres --from-literal=password=postgres
When pgpool-deploy.yml is applied, Kubernetes fetches the secret values from the secrets created above and assigns them to the associated environment variables of the pgpool container: POSTGRES_USERNAME, POSTGRES_PASSWORD, PGPOOL_PCP_USER, and PGPOOL_PCP_PASSWORD.
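To double-check the secrets before deploying, you can inspect them directly (note that Kubernetes secrets are only base64-encoded, not encrypted):
# list the two secrets referenced by the manifest
$ kubectl get secret mypostgres-postgres-secret pgpool-pcp-secret
# decode one stored value to confirm it matches what was set
$ kubectl get secret mypostgres-postgres-secret -o jsonpath='{.data.username}' | base64 -d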
Pgpool ConfigMap Manifest
pgpool-configmap.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgpool-config
  labels:
    name: pgpool-config
data:
  pgpool.conf: |-
    listen_addresses = '*'
    port = 9999
    socket_dir = '/var/run/pgpool'
    pcp_listen_addresses = '*'
    pcp_port = 9898
    pcp_socket_dir = '/var/run/pgpool'
    backend_hostname0 = 'primary-service'
    backend_port0 = 5432
    backend_weight0 = 1
    backend_flag0 = 'ALWAYS_PRIMARY|DISALLOW_TO_FAILOVER'
    backend_hostname1 = 'standby-service'
    backend_port1 = 5432
    backend_weight1 = 1
    backend_flag1 = 'DISALLOW_TO_FAILOVER'
    sr_check_period = 0
    sr_check_user = 'postgres'
    health_check_period = 0
    health_check_user = 'postgres'
    backend_clustering_mode = 'streaming_replication'
    num_init_children = 32
    max_pool = 4
    child_life_time = 300
    child_max_connections = 0
    connection_life_time = 0
    client_idle_limit = 0
    connection_cache = on
    load_balance_mode = on
    ssl = on
    enable_pool_hba = off
    failover_on_backend_error = off
    log_min_messages = warning
    statement_level_load_balance = on
    log_statement = on
    log_per_node_statement = on
This ConfigMap manifest is populated into a pgpool.conf file that Pgpool loads at startup. Note that we predefined the primary node as primary-service and the standby node as standby-service. These correspond to the two Kubernetes services of type ClusterIP that bring traffic to the postgres-primary and postgres-standby pods. Also note that we enabled statement_level_load_balance, which makes Pgpool pick the load-balancing node per statement: write traffic always goes to the primary node, while read-only queries are load-balanced across the available backend nodes.
Since postgres-standby is deployed as a Kubernetes Deployment, it can be scaled up or down (deploying more or fewer replica pods). When this happens, there is no need to update Pgpool's ConfigMap to include new standby nodes or exclude old ones: because Pgpool addresses the standbys through standby-service, Kubernetes automatically distributes traffic among all available standby pods, as shown below.
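For example, assuming the standby pods belong to the postgres-replica-deployment Deployment seen in the pod listing below, a scale-up is a single command and requires no Pgpool change:
# add a third standby; standby-service picks up the new pod automatically
$ kubectl scale deployment postgres-replica-deployment --replicas=3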
Deploy Pgpool
When we are ready, we can create the secrets and apply the manifests:
$ kubectl create secret generic mypostgres-postgres-secret --from-literal=username=cloud_admin --from-literal=password=cloud_admin
$ kubectl create secret generic pgpool-pcp-secret --from-literal=username=postgres --from-literal=password=postgres
$ kubectl apply -f pgpool-configmap.yml
$ kubectl apply -f pgpool-deploy.yml
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
postgres-primary 1/1 Running 0 1m
postgres-replica-deployment-69f995884b-72v6g 1/1 Running 0 9s
postgres-replica-deployment-69f995884b-kkkn5 1/1 Running 0 9s
pgpool-59b77dbc76-6nr45 1/1 Running 0 1m
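To find the address a client should use, check the service and, since we run on minikube, grab the node IP that exposes nodePort 30000:
# confirm the NodePort service is up, then get the node IP to connect to
$ kubectl get svc pgpool-service
$ minikube ip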
Observe Pgpool Load Balancing
Since we set log_statement = on and log_per_node_statement = on, we can use psql to run some queries through Pgpool and observe how the incoming queries are distributed across the backend nodes.
$ psql -d postgres -p 30000 -h [$clusterip] -U postgres
postgres=# show pool_nodes;
node_id | hostname | port | status | pg_status | lb_weight | role | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+--------------------------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | primary-service | 5432 | up | unknown | 0.500000 | primary | unknown | 13 | false | 0 | | | 2023-06-16 18:31:43
1 | standby-service | 5432 | up | unknown | 0.500000 | standby | unknown | 11 | true | 0 | | | 2023-06-16 18:31:43
(2 rows)
postgres=# select count(*) from t1;
postgres=# select count(*) from t1;
postgres=# select count(*) from t1;
...
Tail the Pgpool log to observe the load balancing between node id 0 and 1, indicated by the DB node id: value in each log line:
$ kubectl logs pgpool-59b77dbc76-6nr45 -f
2023-06-16 18:31:51.804: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: select count(*) from t1;
2023-06-16 18:32:15.520: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:15.521: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: select count(*) from t1;
2023-06-16 18:32:24.776: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:24.776: psql pid 71: LOG: DB node id: 1 backend pid: 34 statement: select count(*) from t1;
2023-06-16 18:32:25.451: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:25.451: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: select count(*) from t1;
2023-06-16 18:32:26.036: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:26.036: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: select count(*) from t1;
2023-06-16 18:32:26.566: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:26.566: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: select count(*) from t1;
2023-06-16 18:32:27.431: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:27.432: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: select count(*) from t1;
2023-06-16 18:32:27.960: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:27.960: psql pid 71: LOG: DB node id: 1 backend pid: 34 statement: select count(*) from t1;
2023-06-16 18:32:28.511: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:28.511: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: select count(*) from t1;
2023-06-16 18:32:29.007: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:29.007: psql pid 71: LOG: DB node id: 1 backend pid: 34 statement: select count(*) from t1;
2023-06-16 18:32:35.295: psql pid 71: LOG: statement: insert into t1 values(generate_series(1,10000), '555');
2023-06-16 18:32:35.296: psql pid 71: LOG: DB node id: 0 backend pid: 40 statement: insert into t1 values(generate_series(1,10000), '555');
2023-06-16 18:32:36.397: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:36.398: psql pid 71: LOG: DB node id: 1 backend pid: 34 statement: select count(*) from t1;
2023-06-16 18:32:37.252: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:37.252: psql pid 71: LOG: DB node id: 1 backend pid: 34 statement: select count(*) from t1;
2023-06-16 18:32:38.052: psql pid 71: LOG: statement: select count(*) from t1;
2023-06-16 18:32:38.052: psql pid 71: LOG: DB node id: 1 backend pid: 34 statement: select count(*) from t1;
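As an aside, since the manifest also configured the PCP interface (pcp_port = 9898 plus the pgpool-pcp-secret credentials), you can query Pgpool's view of the backends through PCP as well. A minimal sketch, assuming the Pgpool-II client tools are installed on your workstation:
# forward the PCP port locally, then ask Pgpool how many backends it manages
$ kubectl port-forward deploy/pgpool 9898:9898 &
$ pcp_node_count -h 127.0.0.1 -p 9898 -U postgres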
Cary is a Senior Software Developer at HighGo Software Canada with 8 years of industrial experience developing innovative software solutions in C/C++ in the field of smart grid & metering prior to joining HighGo. He holds a bachelor's degree in Electrical Engineering from the University of British Columbia (UBC) in Vancouver (2012) and has extensive hands-on experience in technologies such as: Advanced Networking, Network & Data Security, Smart Metering Innovations, deployment management with Docker, Software Engineering Lifecycle, scalability, authentication, cryptography, PostgreSQL & non-relational databases, web services, firewalls, embedded systems, RTOS, ARM, PKI, Cisco equipment, and functional and architecture design.