Monitoring PostgreSQL with Nagios and Checkmk

Enterprise PostgreSQL Solutions

Comments are off

Monitoring PostgreSQL with Nagios and Checkmk

A key aspect of maintaining a high performance database is continuous monitoring of system resources, as any issues anywhere on the system can potentially lead to downtime. There are options aplenty for monitoring, so reinventing the wheel doesn’t make much sense. For that purpose, I’ve selected a couple of options that are simple to deploy, customizable, and can provide us with all the relevant information about the health of a PostgreSQL system.

This blog will focus on setting up Nagios for PostgreSQL monitoring before proceeding with enhancing that with Checkmk.

The Approach To Monitoring

As we speak about monitoring, a key part of it is the planning phase. There are key decisions to be made in context of the system to be monitored, available resources, and the response times.

Certain resource consumptions might be ephemeral, whereas some may not be that. Consider a query that massively involves the CPUs for a short duration versus a table that has grown over time and now results in filling in all the free space on disk. Similarly, some problems may need a DBA or system administrator to intervene whereas in some cases additional hardware resources may be required; e.g. running out of disk space.

Therefore, plan for resource monitoring whilst keep these indicators in perspective:

  • Warning thresholds,
  • Critical notification thresholds,
  • Resource requirements,
  • Response times.

Let’s look into some of the simple to deploy monitoring options available to us on Linux. This blog focuses on setting up these on a RHEL/CentOS 7 system.

Nagios

The pronunciation of the word Nagios can be disagreed upon, the word in actuality is an acronym for “Notices Any Glitch In Our System”. It can monitor applications, services, host resources and hardware as well. The open source project was actually renamed to Nagios Core, however, it is still commonly referred to as Nagios. The source is available on git with the GPL version 2.0 license.

Checkmk

Checkmk has multiple editions, however, we’ll only be focusing on Checkmk Raw Edition; CRE, as it is free and 100% opensource. Checkmk enhances Nagios by providing a much better interface and many plugins that can help monitor systems and services in more detail without any significant overhead.

Setting Up Nagios

The setup and configuration will be in context of RHEL/CentOS 7. For simplicity, we are going to host nagios and setup monitoring on the same system. If you are running this in a docker environment, remember to run it with enhanced privileges to allow systemd services to run. 

Install and Configure Nagios

The following command may help. It redirects docker’s port 80 to host port 8080:

docker run -t -d -u 0:0 --cap-add=NET_ADMIN -p 8080:80 --privileged=true centos:7 /usr/sbin/init

Following are the installation steps:

  1. Set selinux in permissive mode through its configuration file under /etc/selinux or via the “setenforce 0” command, if available.
  2. Install EPEL yum repository
  3. Install Nagios and its plugins
setenforce 0
yum install -y epel-release
yum install -y nagios nagios-plugins-all nagios-plugins-nrpe nrpe

With these packages along with their dependencies all installed, it’s time to configure Nagios.

  • Set password for nagiosadmin user
  • Enable services
  • Start services
htpasswd -c /etc/nagios/passwd nagiosadmin
systemctl enable httpd nagios nrpe
systemctl start httpd
systemctl start nagios
systemctl start  nrpe

After these steps, Nagios is up and running and can be access via a browser: http://localhost/nagios/ or on your host if you are running a docker: http://localhost:8080/nagios/

If you click on the “Hosts” link in the left-hand-side menu, you can see that it is already monitoring localhost. With Nagios functional now, let’s configure it to monitor PostgreSQL.

Install PostgreSQL

I’m running these commands with the root user. However, for a non-root user with appropriate privileges, prefix these commands with “sudo”.

# Install the repository RPM:
yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

# Install PostgreSQL version 13:
yum install -y postgresql13-server

# Initialize the database, enable automatic start, and start the service:
/usr/pgsql-13/bin/postgresql-13-setup initdb
systemctl enable postgresql-13
systemctl start postgresql-13

The server is up and running now. You can check it via “systemctl status postgresql-13”.

Configure Nagios for PostgreSQL Monitoring

Add the following line in file /etc/nagios/nagios.cfg: “cfg_dir=/etc/nagios/conf.d”

Let’s create the folder for containing our configuration file. Any files ending with “.cfg” in this folder will be loaded by Nagios.

mkdir -p /etc/nagios/conf.d

Let’s create the PostgreSQL monitoring configuration file: /etc/nagios/conf.d/postgresql.cfg

define hostgroup {
    hostgroup_name pg-servers
    alias PostgreSQL Servers Status
}

define host {
    use			generic-host
    host_name		pg-1
    alias 			PostgreSQL Server 1
    address		127.0.0.1
    hostgroups		pg-servers
    max_check_attempts	10
    check_command	check-host-alive
}

define service {
    use				generic-service
    hostgroup_name		pg-servers
    service_description		Current Load
    check_command		check_nrpe_1arg!check_load
}

define service {
    use				generic-service
    hostgroup_name		pg-servers
    service_description		PostgreSQL Status
    check_command		check_pgsql
}

define command {
    command_name check_nrpe_1arg
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

define command {
    command_name check_pgsql
    command_line $USER1$/check_pgsql -H '$HOSTADDRESS$' '$ARG1$'
}

Time to restart/reload the Nagios services now: “systemctl reload nagios”

And voila, it’s all done! Ahem, not really. We haven’t really done the PostgreSQL configuration to allow the “nagios” user to access the PostgreSQL databases.

Configuring PostgreSQL for Nagios

In this particular, it’s a very simple two step process to allow the “nagios” user to access the database. Let’s create the database user “nagios” and give it permissions to access databases.

Creating the database user: “/usr/pgsql-13/bin/createuser nagios”

Adding the entry in “pg_hba.conf”:
“host    all             nagios          127.0.0.1/32            trust”

Note that this entry must precede any other configuration lines that define IPv4 connections for the “nagios” user.

Restart the PostgreSQL 13 service: “systemctl restart postgresql-13”. Now, it’s all done. You may monitor the status of the service via the Nagios web interface.

Nagios Interface for Monitoring

Setting Up Checkmk

Whilst Nagios provides a very basic interface, it can be significantly improved by setting up Checkmk. Checkmk provides detailed documentation, however, there were a couple of steps I had issues with, so I’m sharing those here for convenience.

You can pick up the latest Checkmk Raw Edition (CRE) installation RPM package for Redhat/CentOS 7:

https://checkmk.com/download?edition=cre&version=stable&dist=redhat&os=el7

You can install the RPM package via yum command:

yum install -y https://download.checkmk.com/checkmk/2.0.0/check-mk-raw-2.0.0-el7-38.x86_64.rpm

With the package installed and “omd” command now available, let’s create the monitoring site:

omd create postgresql_mon
omd start postgresql_mon

Next, we need to setup the agent. This was where I ran into issues. You can get a list of agents via your Checkmk URL followed by “/check_mk/agents/”:
“http://localhost:8080/postgresql_mon/check_mk/agents/”

You may pick up the noarch.rpm package and install it. Next, let’s PostgreSQL plugins for Checkmk by browsing to:
“http://localhost:8080/postgresql_mon/check_mk/agents/plugins/”

You must download the required plugin under and give execute permissions: “/usr/lib/check_mk_agent/plugins”.

With all this done, you may now proceed with the setting up the host through the Checkmk browser interface. For more details, please refer the documentation available here: https://docs.checkmk.com/

Checkmk Interface for Monitoring

Final Remarks

This blog is by no means a complete setup for monitoring PostgreSQL, however, it give you a basic setup to get started with. 

Checkmk enhances nagios web interface and also gives a significant boost to the default PostgreSQL monitoring through its plugins, however, you may also want to explore “check_postgres” tool. It is a perl script specifically designed to monitor PostgreSQL via Nagios, MRTG or custom monitoring scripts. check_postgres is provided by Burado. It has tons of monitoring options that be checked at: https://bucardo.org/check_postgres/check_postgres.pl.html