Setting Up a Patroni Cluster with VMware Postgres
VMware Postgres includes the Patroni High Availability (HA) solution. This topic describes how to set up an HA three-node cluster using Patroni and etcd. If a host becomes unavailable, this configuration provides failover to a standby host.
Patroni is a tool that can be used to create, manage, maintain, and monitor highly available clusters using Postgres streaming replication. Patroni handles the Postgres database initialization as well as planned switchovers or unplanned failovers.
This topic shows how to set up a three-node CentOS 7 HA cluster using Patroni and etcd. In this three-node cluster a Patroni instance on each node initializes and manages its own Postgres instance and monitors its health. Patroni uses health data to decide if any actions are required to keep the Patroni cluster healthy as a whole.
Only a single Patroni instance can hold the “leader lock” (the primary role) at any time. This is where etcd comes in. etcd is a Distributed Consensus Store (DCS). It provides the distributed decision-making mechanism that determines which Patroni instance should be the leader and which instances should be standbys.
In our example, we will be creating a three-node etcd cluster. (Apache ZooKeeper can also be used with Patroni but will not be covered here.) The number of etcd nodes does not depend on the number of Patroni nodes in the cluster. However, you should have more than one etcd node to prevent a single point of failure, and it is best to have at least three etcd nodes to ensure a majority during leader voting.
About Postgres Replication
A Patroni HA cluster relies on Postgres replication to maintain the standby Postgres databases.
Postgres replication streams write-ahead logs (WAL) from a primary Postgres server to one or more standby Postgres servers. One Postgres server is designated the primary and one or more others are standbys.
Any database changes (DML operations) made in the primary server are first written to an in-memory WAL buffer. Changes are then immediately written to a WAL segment file when the transaction commits or aborts. The WAL segment files are in the primary cluster’s pg_wal directory.
After the WAL file is created, the primary cluster’s walsender process sends the file to the standbys’ walreceiver processes. Each standby’s walreceiver process writes the WAL data into WAL segment files in the standby’s pg_wal directory and informs the startup process. The startup process then replays the WAL data, much like a recovery process, and the standby is then in sync with the primary.
The following ps output shows the walsender processes on the primary host, and the walreceiver and startup processes on a standby host.
Primary:
postgres 12391 12347 0 17:07 ? 00:00:00 postgres: patroni_test: walsender replicator 10.193.102.204(29906) streaming 0/4132360
postgres 14013 12347 0 19:05 ? 00:00:00 postgres: patroni_test: walsender replicator 10.193.102.205(33855) streaming 0/4132360
Standby:
postgres 11916 11914 0 17:07 ? 00:00:00 postgres: patroni_test: startup recovering 0000000A0000000000000004
postgres 11928 11914 0 17:07 ? 00:00:09 postgres: patroni_test: walreceiver streaming 0/4132360
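You can also observe replication from inside Postgres. The following queries are an optional check; pg_stat_replication is populated on the primary and pg_stat_wal_receiver on a standby (the exact column set varies slightly across Postgres versions):
postgres=# SELECT client_addr, state, sent_lsn, replay_lsn FROM pg_stat_replication;   -- on the primary
postgres=# SELECT status, received_lsn, sender_host FROM pg_stat_wal_receiver;         -- on a standby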
Installing Prerequisite Software
Install VMware Postgres and the other required packages on all three CentOS 7 hosts. Log in as root, or use sudo to execute these commands.
Download the VMware Postgres RPM from VMware Tanzu Network and install it on all three nodes.
# yum install vmware-postgres-11.7-3.el7.x86_64.rpm
Install the etcd and Patroni prerequisites on all three nodes.
# yum install -y python3 python3-devel gcc
# yum install -y etcd
# yum install -y python-psycopg2
# cat << EOF > requirements.txt
urllib3>=1.19.1,!=1.21
boto
PyYAML
requests
six >= 1.7
kazoo>=1.3.1
python-etcd>=0.4.3,<0.5
python-consul>=0.7.0
click>=4.1
prettytable>=0.7
tzlocal
python-dateutil
psutil>=2.0.0
cdiff
kubernetes>=2.0.0,<=7.0.0,!=4.0.*,!=5.0.*
EOF
# pip3 install -r requirements.txt
Change the permissions on the /tmp directory.
# chmod 775 /tmp
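As an optional sanity check, you can confirm that the installed tools are available before continuing; the version strings you see depend on your installation, and if any command fails, revisit the corresponding install step:
# patroni --version
# etcd --version
# python3 -c "import psycopg2; print(psycopg2.__version__)"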
Configuring etcd
The method we use for setting up etcd is called static bootstrapping. This method requires knowledge of all cluster members’ IP addresses and that the addresses are static. The other bootstrapping method etcd offers is discovery bootstrapping. Discovery bootstrapping is only relevant when the etcd nodes run in a dynamic environment such as Kubernetes, which has its own special network setup. Discovery bootstrapping is not covered in this topic.
The hosts in this document have IP addresses 10.193.102.203, 10.193.102.204, and 10.193.102.205. Substitute the actual IP addresses of your hosts in these instructions.
Follow these steps to set up the etcd cluster.
Log in to the first node and become the postgres user.
$ sudo su - postgres
$ cd $HOME
Create the etcd.yml file for the first node. This file is used to start the etcd cluster.
$ vi /var/lib/pgsql/etcd.yml
Insert the following contents and save the file.
name: '<hostname1>'
listen-peer-urls: 'http://10.193.102.203:2380'
listen-client-urls: 'http://10.193.102.203:2379,http://127.0.0.1:2379'
initial-advertise-peer-urls: 'http://10.193.102.203:2380'
advertise-client-urls: 'http://10.193.102.203:2379'
initial-cluster: '<hostname1>=http://10.193.102.203:2380,<hostname2>=http://10.193.102.204:2380,<hostname3>=http://10.193.102.205:2380'
initial-cluster-state: 'new'
initial-cluster-token: 'etcd-cluster-1'
Where:
<hostname1>, <hostname2>, and <hostname3> are the hostnames of the three nodes.
10.193.102.203 is the IP address of host 1.
10.193.102.204 is the IP address of host 2.
10.193.102.205 is the IP address of host 3.
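Before starting etcd, you can optionally confirm that nothing else is listening on ports 2379 and 2380. This quick check assumes the ss utility (part of the standard CentOS 7 iproute package) is available; no output means the ports are free:
$ ss -tln | grep -E ':(2379|2380)'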
Start up etcd.
$ etcd --config-file /var/lib/pgsql/etcd.yml > etcd_logfile 2>&1 &
The background process should now be running. If it terminates, check the etcd_logfile for any errors. There could be an issue with the configuration file, or the 2379 or 2380 ports may already be in use.
Log in to the second node (10.193.102.204) and become the postgres user.
$ sudo su - postgres
$ cd $HOME
Create the etcd.yml configuration file for the second node.
$ vi /var/lib/pgsql/etcd.yml
Insert the following contents and save the file.
name: '<hostname2>'
listen-peer-urls: 'http://10.193.102.204:2380'
listen-client-urls: 'http://10.193.102.204:2379,http://127.0.0.1:2379'
initial-advertise-peer-urls: 'http://10.193.102.204:2380'
advertise-client-urls: 'http://10.193.102.204:2379'
initial-cluster: '<hostname1>=http://10.193.102.203:2380,<hostname2>=http://10.193.102.204:2380,<hostname3>=http://10.193.102.205:2380'
initial-cluster-state: 'new'
initial-cluster-token: 'etcd-cluster-1'
Where:
<hostname1>, <hostname2>, and <hostname3> are the hostnames of the three nodes.
10.193.102.203 is the IP address of host 1.
10.193.102.204 is the IP address of host 2.
10.193.102.205 is the IP address of host 3.
Start etcd on the second node.
$ etcd --config-file /var/lib/pgsql/etcd.yml > etcd_logfile 2>&1 &
Log in to the third node (10.193.102.205) and become the postgres user.
$ sudo su - postgres
$ cd $HOME
Create the etcd.yml configuration file for the third node.
$ vi /var/lib/pgsql/etcd.yml
Insert the following contents and save the file.
name: '<hostname3>'
listen-peer-urls: 'http://10.193.102.205:2380'
listen-client-urls: 'http://10.193.102.205:2379,http://127.0.0.1:2379'
initial-advertise-peer-urls: 'http://10.193.102.205:2380'
advertise-client-urls: 'http://10.193.102.205:2379'
initial-cluster: '<hostname1>=http://10.193.102.203:2380,<hostname2>=http://10.193.102.204:2380,<hostname3>=http://10.193.102.205:2380'
initial-cluster-state: 'new'
initial-cluster-token: 'etcd-cluster-1'
Where:
<hostname1>, <hostname2>, and <hostname3> are the hostnames of the three nodes.
10.193.102.203 is the IP address of host 1.
10.193.102.204 is the IP address of host 2.
10.193.102.205 is the IP address of host 3.
Start etcd on the third node.
$ etcd --config-file /var/lib/pgsql/etcd.yml > etcd_logfile 2>&1 &
Monitoring and Managing etcd
Use the etcdctl command-line utility to manage the etcd cluster. View the command help by entering the command with no arguments. More functionality is available if you set the ETCDCTL_API environment variable to 3.
$ export ETCDCTL_API=3
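With the v3 API you can, for example, check endpoint health across all three nodes. This is an optional check, and the output format depends on your etcd version:
$ etcdctl --endpoints=http://10.193.102.203:2379,http://10.193.102.204:2379,http://10.193.102.205:2379 endpoint health
Note that the cluster-health command shown in the next step belongs to the v2 API; if ETCDCTL_API=3 is exported in your current shell, run it as ETCDCTL_API=2 etcdctl cluster-health.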
Verify that the etcd Setup is Correct
$ etcdctl cluster-health
member 3172be2b0510829 is healthy: got healthy result from http://10.193.102.203:2379
member bf2f62dbbe5028b6 is healthy: got healthy result from http://10.193.102.204:2379
member e03eb2acfb7726f9 is healthy: got healthy result from http://10.193.102.205:2379
cluster is healthy
View and Manage etcd Cluster Members
With the etcdctl member command you can list the members of the etcd cluster and add, remove, or update members. With no arguments, the command displays a help message.
$ etcdctl member
NAME:
etcdctl member - member add, remove and list subcommands
USAGE:
etcdctl member command [command options] [arguments...]
COMMANDS:
list enumerate existing cluster members
add add a new member to the etcd cluster
remove remove an existing member from the etcd cluster
update update an existing member in the etcd cluster
OPTIONS:
--help, -h show help
This command lists the members and identifies the current Leader.
$ etcdctl member list
3172be2b0510829: name=postgres1 peerURLs=http://10.193.102.203:2380 clientURLs=http://10.193.102.203:2379 isLeader=false
bf2f62dbbe5028b6: name=postgres2 peerURLs=http://10.193.102.204:2380 clientURLs=http://10.193.102.204:2379 isLeader=true
e03eb2acfb7726f9: name=postgres3 peerURLs=http://10.193.102.205:2380 clientURLs=http://10.193.102.205:2379 isLeader=false
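If you ever need to replace an etcd node, a sketch of the v2-API workflow looks like the following; the member ID, the new node's hostname, and the IP address 10.193.102.206 are placeholders for illustration only:
$ etcdctl member remove <member-id>
$ etcdctl member add <hostname4> http://10.193.102.206:2380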
Configuring Patroni
Each host also has its own Patroni YAML file. This example provides the minimal settings for getting up and running with Patroni. If you want to be adventurous and try out different settings, you can check out all the options in the Patroni documentation.
Log in to the first node and become the postgres user.
$ sudo su - postgres
$ cd $HOME
Create the postgresql0.yml configuration file.
$ vi postgresql0.yml
Insert the following contents and save the file.
scope: patroni_test
name: postgresql0

restapi:
  listen: '10.193.102.203:8008'
  connect_address: '10.193.102.203:8008'

etcd:
  hosts: '10.193.102.203:2379,10.193.102.204:2379,10.193.102.205:2379'

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        hot_standby: 'on'
        wal_keep_segments: 20
        max_wal_senders: 8
        max_replication_slots: 8
  initdb:
  - encoding: UTF8
  - data-checksums
  pg_hba:
  - host replication replicator 10.193.102.203/32 md5
  - host replication replicator 10.193.102.204/32 md5
  - host replication replicator 10.193.102.205/32 md5
  - host all all 0.0.0.0/0 md5
  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: '10.193.102.203:5432'
  connect_address: '10.193.102.203:5432'
  data_dir: /var/lib/pgsql/data
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: rep-pass
    superuser:
      username: postgres
      password: postgres
    rewind:
      username: rewind_user
      password: rewind_password
  parameters:
    unix_socket_directories: .

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Where:
postgresql0 is the name of this Patroni cluster member.
10.193.102.203 is the IP address of host 1.
10.193.102.204 is the IP address of host 2.
10.193.102.205 is the IP address of host 3.
Start Patroni.
$ patroni postgresql0.yml > patroni_member_1.log 2>&1 &
This command starts a Patroni process that initializes the Postgres database on the host. If the process terminates, check the patroni_member_1.log log file for errors. A successful startup message is similar to this:
2020-03-17 23:32:58.648 UTC [10491] LOG:  database system is ready to accept connections
10.193.102.222:5432 - accepting connections
10.193.102.222:5432 - accepting connections
2020-03-17 23:32:59,608 INFO: establishing a new patroni connection to the postgres cluster
2020-03-17 23:32:59,631 INFO: running post_bootstrap
2020-03-17 23:32:59,651 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2020-03-17 23:32:59,665 INFO: initialized a new cluster
2020-03-17 23:33:09,645 INFO: Lock owner: postgresql0; I am postgresql0
2020-03-17 23:33:09,668 INFO: Lock owner: postgresql0; I am postgresql0
2020-03-17 23:33:09,684 INFO: no action. i am the leader with the lock
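As an optional check, you can also query Patroni's REST API on the port configured in the restapi section of the YAML file; the exact JSON fields returned depend on the Patroni version:
$ curl -s http://10.193.102.203:8008/patroni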
Log in to the new database. The password for the postgres user is postgres. (It was set in the postgresql section of the YAML file you created.)
$ psql -h segment1 -U postgres
Password for user postgres:
psql (11.6 (Pivotal Postgres 11.6.1))
Type "help" for help.
postgres=#
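To confirm which role this node currently holds, you can run a quick query from the psql prompt; pg_is_in_recovery() returns f on the leader and t on a standby:
postgres=# SELECT pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)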
Repeat steps 1 through 4 for the second and third nodes, with the following differences:
On the second node, the name of the YAML file is postgresql1.yml. In the file, set the value of the name attribute to postgresql1 and use the second node’s IP address in the listen and connect_address values for restapi and postgresql.
name: postgresql1
restapi:
  listen: 10.193.102.204:8008
  connect_address: 10.193.102.204:8008
postgresql:
  listen: 10.193.102.204:5432
  connect_address: 10.193.102.204:5432
Start Patroni with this command.
$ patroni postgresql1.yml > patroni_member_2.log 2>&1 &
On the third node, the name of the YAML file is postgresql2.yml. In the file, set the value of the name attribute to postgresql2 and use the third node's IP address in the listen and connect_address values for restapi and postgresql.
name: postgresql2
restapi:
  listen: 10.193.102.205:8008
  connect_address: 10.193.102.205:8008
postgresql:
  listen: 10.193.102.205:5432
  connect_address: 10.193.102.205:5432
Start Patroni with this command.
$ patroni postgresql2.yml > patroni_member_3.log 2>&1 &
With Patroni started on all three nodes, use the patronictl command-line utility to view the cluster state.
$ patronictl -c postgresql0.yml list
+--------------+-------------+----------------+--------+---------+----+-----------+
|   Cluster    |    Member   |      Host      |  Role  |  State  | TL | Lag in MB |
+--------------+-------------+----------------+--------+---------+----+-----------+
| patroni_test | postgresql0 | 10.193.102.203 | Leader | running | 12 |         0 |
| patroni_test | postgresql1 | 10.193.102.204 |        | running | 12 |         0 |
| patroni_test | postgresql2 | 10.193.102.205 |        | running | 12 |         0 |
+--------------+-------------+----------------+--------+---------+----+-----------+
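If you are curious how Patroni tracks this state, you can list the keys it maintains in etcd under its default /service namespace using the v2 etcdctl API; the exact key names vary with the Patroni version:
$ ETCDCTL_API=2 etcdctl ls /service/patroni_test
The listing typically includes entries such as the leader key and a members directory for the cluster scope defined in the YAML files (patroni_test in this example).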
Next Steps
The Patroni cluster is now set up.
To check that WAL replication is working, create a table on the leader and insert data into it, then verify that the table appears on each of the standby hosts (see the example after this list).
Manually switch the leader to another node using this command.
$ patronictl -c postgresql0.yml switchover
Simulate an unplanned failover by rebooting the host that currently holds the leader role.
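As a minimal replication check (the table name and data below are arbitrary), create and populate a table on the leader, then query it on each standby:
postgres=# CREATE TABLE ha_smoke_test (id int, note text);   -- on the leader
postgres=# INSERT INTO ha_smoke_test VALUES (1, 'replicated');
postgres=# SELECT * FROM ha_smoke_test;                      -- on each standby, after a moment
After a switchover or a simulated failover, run patronictl -c postgresql0.yml list again to confirm that another member now holds the Leader role.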