High Availability PostgreSQL with Patroni and HAProxy

Published: 13 December 2023 - 12 min. read

Arvid Larson

Facing the challenge of maintaining a resilient PostgreSQL database in the ever-changing landscape of digital demands? If the term “Patroni” resonates with your tech aspirations, you’re on the verge of unlocking a solution that elevates PostgreSQL into a fortress of high availability.

Certainly, a challenge awaits, but fear not – you’re embarking on a transformative journey to constructing an unwavering PostgreSQL cluster. Envision a future where database disruptions are mere blips, and your PostgreSQL setup stands as the epitome of reliability.

Get ready to fortify your PostgreSQL environment into an unshakeable bastion!

Prerequisites

Before you jump into implementing high availability for PostgreSQL, ensure that you have the following in place:

  • Five (or more) Linux servers – This tutorial uses Debian 12 servers, each with a non-root user with sudo/administrator privileges as follows:
Hostname      IP Address      Used as
postgres01    192.168.5.20    PostgreSQL Server
postgres02    192.168.5.21    PostgreSQL Server
postgres03    192.168.5.22    PostgreSQL Server
etcd          192.168.5.15    Cluster Data Store
haproxy       192.168.5.16    Load Balancer

Installing PostgreSQL Server and Patroni

With all the prerequisites met, picture this moment as laying the foundation for a robust, fail-safe database environment. The goal is to create a high-availability PostgreSQL deployment via PostgreSQL 15. But first, you must install the required packages (PostgreSQL Server and Patroni) on all your PostgreSQL servers.

Patroni is a Python-based application for creating high-availability PostgreSQL deployments in your data centers, from bare metal to Kubernetes. Patroni is available in the official PostgreSQL APT repository and supports PostgreSQL 9.5 through 16.

To install PostgreSQL Server and Patroni, carry out the following:

💡 NOTE: Complete the following operations on PostgreSQL servers. In this case, postgres01, postgres02, and postgres03.

1. Open a terminal and run the curl command below, which has no output but adds the GPG key for the PostgreSQL repository to /usr/share/keyrings/pgdg.gpg.


💡 This tutorial uses a root account for command execution to ensure demonstration compatibility. But remember that employing a non-root account with sudo privileges is highly advisable. With a non-root account, you must prefix your commands with sudo for enhanced security and best practices.

curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /usr/share/keyrings/pgdg.gpg
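If you'd like to sanity-check the import, you can list the key that now lives in the keyring. The exact output format varies by GnuPG version, but the PostgreSQL signing key ID ending in ACCC4CF8 should appear:

```shell
# Inspect the dearmored keyring created by the previous command
gpg --show-keys /usr/share/keyrings/pgdg.gpg
```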

2. Next, execute the following command, which does not provide output but adds the PostgreSQL repository to the list of package sources in the /etc/apt/sources.list.d/pgdg.list file.

sh -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/pgdg.gpg] https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

3. Once added, run the below apt update command to refresh your package index and retrieve newer package information.

apt update
Adding PostgreSQL repository and updating package index

4. Once updated, execute the following command to install these packages:

  • postgresql-15 – The PostgreSQL database management system version 15.
  • patroni – An open-source solution for high availability in PostgreSQL, a template for PostgreSQL HA clusters using Python and etcd.
  • python3-etcd – A Python client library for interacting with etcd, a distributed key-value store. This library allows Python applications to communicate with and manage etcd clusters.
  • python3-psycopg2 – A PostgreSQL adapter for Python 3, connecting Python applications and PostgreSQL databases.
apt install postgresql-15 postgresql-server-dev-15 patroni python3-etcd python3-psycopg2

Input Y to proceed with the installation when prompted.

Installing packages including PostgreSQL 15 and Patroni

5. With the packages installed, run each command below, which has no output to the terminal but performs the following:

  • Stop the postgresql and patroni services. On Debian/Ubuntu, the postgresql and patroni services are automatically started after installation.
  • Create a symlink for PostgreSQL binary files to the /usr/sbin directory. Doing so ensures patroni can execute PostgreSQL binary files for creating and managing PostgreSQL.
systemctl stop postgresql patroni
ln -s /usr/lib/postgresql/15/bin/* /usr/sbin/

6. Lastly, execute the following commands to verify the binary paths for patroni and psql and to check the installed version of patroni.

which patroni
which psql
patroni --version

Below are the paths to the patroni (/usr/bin/patroni) and psql (/usr/sbin/psql) binary files; the Patroni version installed is 3.2.1.

Verifying binary paths for Patroni and psql

Setting up and Configuring the etcd Server

Now that you have PostgreSQL Server and Patroni installed, you need a backbone that solidifies the coordination among your PostgreSQL servers for seamless high availability. You’ll set up and configure etcd, a key-value data store.

This key-value data store is the silent architect behind the scenes, ensuring data related to your PostgreSQL cluster deployment is securely stored and efficiently managed.

💡 NOTE: Be sure to install etcd on a separate server. In this example, etcd is installed on the etcd server.

To install and configure etcd, follow these steps:

1. On your etcd server, execute the command below to update the repository index and get the latest package information.

apt update
Updating the Debian repository on the etcd server

2. Next, run the below command to install etcd on your server.

apt install etcd-server
Installing etcd

3. Once you’ve installed etcd, open the default configuration /etc/default/etcd using your preferred editor, and insert the following configuration.

This configuration sets up a single-node etcd cluster, so ensure you replace the IP address 192.168.5.15 with your server's internal IP address.

ETCD_LISTEN_PEER_URLS="http://192.168.5.15:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.168.5.15:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.5.15:2380"
ETCD_INITIAL_CLUSTER="default=http://192.168.5.15:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.5.15:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

Save the changes, and exit the editor.

Configuring an etcd single cluster mode

4. Now, run the systemctl command below to restart etcd and apply your modifications.

This command has no output, but you’ll verify the changes in the following step.

systemctl restart etcd

5. Once the etcd restarts, verify the etcd service is running and enabled.

systemctl is-enabled etcd
systemctl status etcd

If the etcd service is running, you'll see the output active (running). The output enabled means etcd starts automatically at boot.

Checking the etcd service status

6. Lastly, run the etcdctl command below to verify the list of available servers on the etcd cluster.

etcdctl member list

In this case, etcd runs as a single-node cluster on a local IP address http://192.168.5.15:2379/.

Checking cluster members on etcd
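As an extra sanity check, you can write a throwaway key and read it back. This sketch assumes the etcdctl shipped with Debian 12 defaults to the v3 API (true for etcd 3.4 and later); the key name ping is arbitrary:

```shell
# Write a temporary key, read it back, then clean it up
etcdctl put ping pong
etcdctl get ping --print-value-only
etcdctl del ping
```

If the second command prints pong, the key-value store is accepting reads and writes and is ready to back the Patroni cluster.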

Bootstrapping PostgreSQL Cluster via Patroni

With the etcd server now firmly in place, you stand at the threshold of the next crucial phase. By initiating the bootstrapping process using Patroni, you elevate your PostgreSQL configuration into a robust and fault-tolerant cluster.

💡 Be sure to back up your database first if you’re deploying a PostgreSQL cluster on an existing PostgreSQL server.

To bootstrap your PostgreSQL cluster via Patroni, perform the following on each PostgreSQL server:

1. Open Patroni’s default configuration (/etc/patroni/config.yml) in your text editor, and add the following configuration.

Ensure you replace the name option’s value with the hostname of your PostgreSQL server (i.e., postgres01), but do not close the editor yet.

This configuration sets up your PostgreSQL cluster named postgres.

# Scope of PostgreSQL
scope: postgres
# Namespace for the PostgreSQL database
namespace: /db/
# Name of the PostgreSQL instance
name: postgres01

2. Next, add the configuration below to set up the Patroni REST API to run at 192.168.5.20:8008.

You must ensure each of the PostgreSQL servers within the cluster can connect via API. Thus, change the IP address 192.168.5.20 with the respective IP address of each PostgreSQL server.

# Patroni REST API Configuration
restapi:
    # The IP address and port on which the REST API should listen
    listen: 192.168.5.20:8008

    # The IP address and port to which clients should connect
    connect_address: 192.168.5.20:8008

3. Add the configuration below to enable integration with etcd. In this case, the etcd server is running on IP address 192.168.5.15.

# Patroni Etcd Configuration
etcd3:
    # The host address and port of the Etcd server
    host: 192.168.5.15:2379

4. Now, add the configuration below to bootstrap the PostgreSQL server via initdb.

This configuration sets up the default rules and settings for client authentication (pg_hba.conf) and a new user admin with the password admin.

Be sure to input the IP addresses of the PostgreSQL cluster to the pg_hba section and change the default admin password in the users section.

# Patroni Bootstrap Configuration
bootstrap:
    # Configuration parameters for distributed configuration store (DCS)
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            # Use pg_rewind during bootstrap
            use_pg_rewind: true

    # Initdb configuration
    initdb:
        - auth: scram-sha-256
        - encoding: UTF8
        - data-checksums

    # pg_hba.conf entries for replication and general access
    # pg_hba.conf entries for replication and general access
    pg_hba:
        - host replication replicator 127.0.0.1/32 scram-sha-256
        - host replication replicator 192.168.5.20/32 scram-sha-256
        - host replication replicator 192.168.5.21/32 scram-sha-256
        - host replication replicator 192.168.5.22/32 scram-sha-256
        - host all all 0.0.0.0/0 scram-sha-256

    # Default user admin with password admin, created during bootstrap
    users:
        admin:
            password: admin
            options:
                - createrole
                - createdb

5. After configuring how PostgreSQL starts, insert the following configuration to set up how PostgreSQL runs on each server.

On the postgres01 server, PostgreSQL will run on IP address 192.168.5.20 with the data directory /var/lib/patroni.

In addition, this configuration creates a new user called replicator for replication operations, and the postgres user as the superuser/administrator with the password secretpassword.

Ensure you change the IP address and the default passwords (rep-pass and secretpassword).

# Patroni PostgreSQL Configuration
postgresql:
    # PostgreSQL server listening address and port
    listen: 192.168.5.20:5432
    # Connect address for PostgreSQL clients
    connect_address: 192.168.5.20:5432
    # Data directory for PostgreSQL
    data_dir: /var/lib/patroni
    # Path to the pgpass file
    pgpass: /tmp/pgpass

    # Authentication configuration
    authentication:
        # Replication of user credentials
        replication:
            username: replicator
            password: rep-pass
        # Superuser credentials
        superuser:
            username: postgres
            password: secretpassword

    # Additional PostgreSQL parameters
    parameters:
        # Directory for Unix socket
        unix_socket_directories: '.'
        # Password encryption method
        password_encryption: 'scram-sha-256'

6. Insert the following configuration to set up tags that determine your PostgreSQL server’s behavior in the cluster. Then, save the changes and close the file.

# Patroni Tags Configuration
tags:
    # When true, prevents this node from being promoted on failover
    nofailover: false
    # When true, excludes this node from load-balancer traffic
    noloadbalance: false
    # When true, lets new replicas clone from this node instead of the leader
    clonefrom: false
    # When true, excludes this node from synchronous replication
    nosync: false

7. With the configurations saved, run the commands below to collectively prepare the Patroni data directory, ensuring it’s appropriately owned and secured for PostgreSQL’s use.

These commands do not provide output, but this step is crucial in setting up a PostgreSQL database cluster with Patroni for high availability.

# Create the directory for Patroni data
mkdir -p /var/lib/patroni

# Set ownership of the Patroni data directory to the PostgreSQL user
chown -R postgres:postgres /var/lib/patroni

# Set permissions on the Patroni data directory for security
chmod 700 /var/lib/patroni

8. Next, execute the systemctl commands below to start and verify the patroni service.

systemctl start patroni
systemctl status patroni

On the postgres01 server, the PostgreSQL server runs, and the cluster initializes. Also, the postgres01 server is selected as the cluster leader.

Checking the Patroni service and verifying PostgreSQL cluster initialization

On the postgres02 server, the PostgreSQL server starts via Patroni and joins the PostgreSQL cluster via REST API.

If all goes well, you’ll see the following message:

Checking the Patroni service on the postgres02 server

On the postgres03 server, the output is similar to the postgres02 server’s.

Checking the Patroni service on the postgres03 server

9. With your PostgreSQL cluster initialized, execute the following patronictl command to display a list of PostgreSQL instances managed by Patroni.

patronictl -c /etc/patroni/config.yml list

In the following output, you can see your PostgreSQL cluster (postgres) runs.

Notice that your cluster runs with three members: postgres01 as the cluster Leader, with postgres02 and postgres03 as Replicas in the streaming state.

Checking the list of servers in the PostgreSQL Cluster
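You can also query Patroni's REST API directly, which is what the load balancer will rely on later for health checks. Patroni's /primary endpoint returns HTTP 200 only on the current leader and 503 elsewhere (swap in your own server IP addresses):

```shell
# Print the HTTP status of Patroni's /primary endpoint for each node;
# expect 200 on the leader and 503 on the replicas
for host in 192.168.5.20 192.168.5.21 192.168.5.22; do
    printf '%s: ' "$host"
    curl -s -o /dev/null -w '%{http_code}\n' "http://$host:8008/primary"
done
```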

10. Lastly, run the systemctl command below to disable the automatic start of the postgresql service during system boot.

This command has no output if successful but is crucial since Patroni controls the new PostgreSQL server.

sudo systemctl disable --now postgresql
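If you want to confirm the unit will no longer come up at boot, a quick check (the exact wording of the output may vary across systemd versions) is:

```shell
# Should report "disabled" now that Patroni manages PostgreSQL
systemctl is-enabled postgresql
```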

Installing and Configuring HAProxy as a Load Balancer

With your PostgreSQL cluster deployed, how can you make it accessible to clients and enable failover when a problem occurs? The solution is to put HAProxy as a load balancer in front of your PostgreSQL cluster.

HAProxy is your linchpin that enables your PostgreSQL cluster to handle varying workloads, distribute requests intelligently, and maintain high availability.

💡 NOTE: Install HAProxy on a separate server. In this case, HAProxy is installed on the haproxy server with an IP address of 192.168.5.16.

To install and configure HAProxy as a load balancer for the PostgreSQL cluster, proceed with these steps:

1. Open the /etc/hosts file using your preferred text editor, insert your PostgreSQL servers’ IP addresses and hostnames, save the changes, and close the file.

192.168.5.20    postgres01
192.168.5.21    postgres02
192.168.5.22    postgres03

2. Next, run the command below to update your package index.

apt update
Updating the repository on the HAProxy server

3. Once updated, execute the below command to install the haproxy package to your system.

apt install haproxy
Installing HAProxy via APT

4. Now, run the following command to back up the default HAProxy configuration to /etc/haproxy/haproxy.cfg.orig.

This command does not produce output but is a precautionary measure before making any modifications.

mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig

5. Next, create a new file called /etc/haproxy/haproxy.cfg using your preferred editor, and insert the following configuration. Ensure you replace each PostgreSQL server IP address with yours, save the file, and close the editor.

This HAProxy configuration sets up HAProxy as a load balancer for your PostgreSQL cluster with two proxies, as follows:

  • stats – This block runs on port 8080 and monitors HAProxy server performance and backends.
  • postgres – This block is the load balancer configuration for the PostgreSQL cluster.
# Global configuration settings
global
    # Maximum connections globally
    maxconn 100      
    # Logging settings      
    log 127.0.0.1 local2

# Default settings
defaults
    # Global log configuration
    log global
    # Set mode to TCP
    mode tcp
    # Number of retries
    retries 2
    # Client timeout
    timeout client 30m
    # Connect timeout
    timeout connect 4s
    # Server timeout
    timeout server 30m
    # Check timeout
    timeout check 5s

# Stats configuration
listen stats
    # Set mode to HTTP
    mode http
    # Bind to port 8080
    bind *:8080
    # Enable stats
    stats enable
    # Stats URI
    stats uri /

# PostgreSQL configuration
listen postgres
    # Bind to port 5432
    bind *:5432
    # Enable HTTP check
    option httpchk
    # Expect status 200
    http-check expect status 200
    # Server settings
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    # Define PostgreSQL servers
    server postgres01 192.168.5.20:5432 maxconn 100 check port 8008
    server postgres02 192.168.5.21:5432 maxconn 100 check port 8008
    server postgres03 192.168.5.22:5432 maxconn 100 check port 8008
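Before restarting the service, it's worth validating the new file, since HAProxy refuses to start on a syntax error. The -c flag checks the configuration without actually running the proxy:

```shell
# Validate the configuration file; recent HAProxy versions print
# "Configuration file is valid" on success
haproxy -c -f /etc/haproxy/haproxy.cfg
```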

6. With HAProxy configured, run the systemctl commands below to restart and verify (status) the haproxy service.

systemctl restart haproxy
systemctl status haproxy
Checking the HAProxy service status

7. Lastly, open your preferred web browser and visit the HAProxy IP address with port 8080 (i.e., http://192.168.5.16:8080/). In the output below, you can see the following:

  • The stats proxy for monitoring HAProxy status.
  • The postgres proxy is the load balancer for the PostgreSQL cluster.
    Notice the postgres02 and postgres03 servers are marked as DOWN. That is expected: HAProxy's health check queries Patroni's REST API on port 8008, which returns HTTP 200 only on the current primary, so the replicas stay out of the pool.
Checking HAProxy performance and information

Testing the PostgreSQL Cluster’s Failover

Emerging from setting up HAProxy as your trusty load balancer, it’s now time to put your PostgreSQL cluster to the test. This crucial step will reveal the resilience of your high-availability setup. You must ensure that your PostgreSQL cluster remains robust and responsive even in the face of potential failures.

To test your PostgreSQL cluster’s failover, you’ll connect to the cluster from your client machine and verify the failover operations with the following steps:

1. Log in to your client machine, open a terminal, and run the psql command below to connect to the PostgreSQL via HAProxy load balancer.

psql -h 192.168.5.16 -p 5432 -U postgres

Input your PostgreSQL password when prompted. You can find the necessary password information in the /etc/patroni/config.yml file.

Connecting to PostgreSQL cluster via HAProxy load balancer

2. Once connected, run the following query to find which PostgreSQL server you’re connected to, and quit the current PostgreSQL session.

SELECT inet_server_addr() AS hostname;
quit

If your PostgreSQL installation is successful, you’ll be connected to the postgres01 server.

Checking the IP address of the PostgreSQL server
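If you prefer to script this check rather than type it interactively, psql's -t (tuples only), -A (unaligned output), and -c (run a single command) flags make a one-liner. The password below is the placeholder from the Patroni configuration; substitute whatever you set there:

```shell
# Non-interactive variant: print only the address of the backend
# the load balancer routed this connection to
PGPASSWORD=secretpassword psql -h 192.168.5.16 -p 5432 -U postgres -tAc 'SELECT inet_server_addr();'
```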

3. Now, switch to the postgres01 server, run the following commands to stop the patroni service, and list the status of the PostgreSQL clusters.

This step lets you test the PostgreSQL failover.

systemctl stop patroni
patronictl -c /etc/patroni/config.yml list

You can see the state of the postgres01 server changed to stopped, and the postgres03 server is promoted as the new cluster leader.

Stopping the Patroni service to test the failover

4. Return to the HAProxy monitoring stats, and you’ll see the postgres01 server is DOWN, while postgres03 is now UP.

Verifying the failover works in the HAProxy monitoring stats report

Alternatively, run the following patronictl command to verify the PostgreSQL cluster status.

patronictl -c /etc/patroni/config.yml list

As you can see below, the postgres01 server is no longer in the cluster.

Checking the PostgreSQL Cluster members via patronictl

5. Switch back to the client machine, and run the below psql command to connect to the PostgreSQL server via HAProxy.

psql -h 192.168.5.16 -p 5432 -U postgres
Connecting to the PostgreSQL server via HAProxy

6. Once connected, run the following query to verify the current PostgreSQL server you’re connected to.

SELECT inet_server_addr() AS hostname;
quit

If the failover is successful, you’ll be connected to one of the servers running, which in this case is postgres03.

Checking the PostgreSQL server IP address
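To complete the exercise, you can reintroduce the failed node. When Patroni starts again on postgres01, it should detect that postgres03 now holds the leader lock and rejoin the cluster as a replica (using pg_rewind if the timelines diverged, as enabled by use_pg_rewind in the bootstrap configuration):

```shell
# On postgres01: start Patroni again, then confirm the node
# rejoins the cluster as a replica rather than reclaiming leadership
systemctl start patroni
patronictl -c /etc/patroni/config.yml list
```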

Conclusion

Embarking on this journey, you immersed yourself in the complexities of ensuring high availability for your PostgreSQL database, employing the powerful combination of Patroni and HAProxy. Navigating seamlessly through the setup stages of PostgreSQL and Patroni, you adeptly handled the nuances of configuring the etcd server.

Your orchestration skills came to the forefront as you built a resilient PostgreSQL cluster with Patroni and fine-tuned the art of load balancing using HAProxy. The culmination of this high-stakes adventure was the thorough testing of your PostgreSQL cluster’s failover capabilities.

Consider expanding your expertise as you reflect on your achievements in establishing a robust and fault-tolerant PostgreSQL environment. Why not explore implementing Patroni with Kubernetes for a more dynamic environment? Or delve into the intricacies of setting up PostgreSQL High Availability across multiple data centers?
