Patroni is a fork of the now-unmaintained
Governor from Compose.
It is open source (GitHub repo) and
is documented here. Among other
things, Patroni supports bootstrapping, synchronous standbys, automatic failover
and a maximum replication lag limit on failover.
Patroni needs a distributed configuration manager for coordinating its
actions, and supports ZooKeeper,
etcd and Consul.
Patroni itself is written in Python and available on PyPI.
Let’s see how to set up an etcd-based Patroni to manage a primary-standby
PostgreSQL cluster. In practice, you’d have a 3-node or 5-node etcd cluster,
and the primary and standby on separate servers. However, we’re only interested
in getting a working Patroni setup so we can explore further, so we’ll set
everything up on a single server.
The commands below are for Debian 9; you should be able to use the same or
similar commands on other distros too.
Setup etcd
etcd is available from its GitHub releases
page. Let’s download it and run it:
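The exact release version and URL below are assumptions; pick the latest build from the releases page and adjust accordingly:

```
# Download and unpack a pre-built etcd release (v3.2.9 is just an example).
curl -L -O https://github.com/coreos/etcd/releases/download/v3.2.9/etcd-v3.2.9-linux-amd64.tar.gz
tar xzf etcd-v3.2.9-linux-amd64.tar.gz
cd etcd-v3.2.9-linux-amd64

# Run a single-node etcd with default settings; it accepts client
# connections on 127.0.0.1:2379.
./etcd
```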
This will start and run etcd in the foreground. (Use ^C to exit.) etcd is being
run as a single-node cluster, which is useless from a redundancy perspective but
is good enough for experimentation and development use.
Install Patroni
Patroni and its dependencies are installed via pip:
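Something along these lines should work (whether you need sudo depends on how Python and pip are set up on your machine):

```
# Install Patroni along with its etcd support.
sudo pip install patroni[etcd]
```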
The command is to be typed verbatim, including the square brackets.
If you don’t have pip, you can install it with apt-get:
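For example, on Debian 9:

```
# Install pip from the distro repositories.
sudo apt-get install -y python-pip
```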
On RHEL/CentOS-based distros, you’ll need to “sudo yum install epel-release python-pip gcc”
before you can pip install.
Installing Postgres
Obviously, the Postgres binaries must be present on the system. You can install
the Postgres that your distro bundles, or follow the instructions on the
Postgres download page. Patroni does not
appear to be too dependent on the Postgres version.
For this example, we’ll use the 9.6 version included in Debian 9:
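(The package name below is the one Debian 9 ships; it may differ on other distros.)

```
# Install the PostgreSQL 9.6 server packaged with Debian 9.
sudo apt-get install -y postgresql-9.6
```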
Whichever method and version you choose, make a note of the path to the
Postgres binaries; we’ll need it later. In this case, the path is:
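```
/usr/lib/postgresql/9.6/bin
```

(This is where Debian’s postgresql-9.6 package places the server binaries.)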
Patroni Configuration Files
Patroni is invoked with the path to a configuration file. We’ll have two
configuration files, one for the primary and one for the standby. We’ll run both
on the same server, on different ports (5432 and 5433).
We’ll call the nodes “pg-1” and “pg-2”. They each have their own configuration
files. Here is the first one, “pg-1.yml”:
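(The file below is a minimal sketch; the scope name, the passwords and the REST API port 8008 are placeholders to adjust for your setup, and unix_socket_directories is set to /tmp so that the psql commands used later in this post can connect over that socket.)

```
scope: stampede              # name of the Patroni-managed cluster
name: pg-1                   # name of this node, unique within the cluster

restapi:
  listen: 0.0.0.0:8008
  connect_address: 127.0.0.1:8008

etcd:
  host: 127.0.0.1:2379

bootstrap:
  dcs:
    postgresql:
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_wal_senders: 5
        max_replication_slots: 5
  initdb:
    - encoding: UTF8
    - data-checksums
  pg_hba:
    - host all dba 0.0.0.0/0 md5
    - host replication repl 0.0.0.0/0 md5
  users:
    dba:
      password: secret
      options:
        - createrole
        - createdb
    repl:
      password: secret
      options:
        - replication

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 127.0.0.1:5432
  data_dir: /data/pg-1
  config_dir: /data/pg-1
  bin_dir: /usr/lib/postgresql/9.6/bin
  authentication:
    superuser:
      username: dba
      password: secret
    replication:
      username: repl
      password: secret
  parameters:
    unix_socket_directories: /tmp
```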
Here’s an explanation of the fields:
scope: This is the name of the Patroni-managed cluster, which can be used
in patronictl to refer to your cluster. All nodes should have the same value
for this key.
name: Name of the node, unique within the cluster.
restapi: Patroni has a REST API, which is started at this address
(“listen”). The “connect_address” is the address with which other nodes can
connect to this API, so the IP here should be the IP by which this node can be
reached from other nodes (typically over a private VLAN).
etcd: The configuration to connect to the etcd cluster. For a 3-node
etcd cluster, use “hosts: ip1:port1, ip2:port2, ip3:port3”.
bootstrap: These values are used when a Patroni cluster is created. The
values under “postgresql.parameters” are actual postgresql.conf configuration
parameters. Some of the values, like wal_level and max_wal_senders, are
required for streaming replication to work.
initdb: When bootstrapping the first node of a cluster and the PostgreSQL
data directory does not exist, these parameters will be used to invoke
initdb.
pg_hba: The entries that Patroni will add to the pg_hba.conf file of the
database it creates. See the “users” section below.
users: Patroni creates the list of users specified here. These users (who
should have been given access in pg_hba above) are then used in the
postgresql.authentication section below to let Patroni login to the Postgres
server. Here, the users “dba” (for admin access by Patroni) and “repl” (for
replication access from standbys) are created.
postgresql: These parameters contain a bunch of information about the
PostgreSQL server (“pg-1”) managed by this Patroni node. The IP in
“connect_address” should be the IP from which other servers can reach this
server (typically over a private VLAN). We’re using 127.0.0.1 here since
everything is on the same node. The bin_dir is what we picked up earlier, and
we’re setting data and config dirs to “/data/pg-1”. For now, this directory
does not exist (the parent “/data” directory should, though). The “authentication” parameters should
refer to the replication and admin users that we created above in the “users”
section. Finally, the “parameters” section again contains postgresql.conf
configuration parameters that Patroni will pass to pg_ctl to start the
database.
And here is the second file, “pg-2.yml”:
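(Again a sketch: only the keys that differ from pg-1.yml are shown here; everything else stays the same. The 8009 REST API port is a placeholder.)

```
name: pg-2

restapi:
  listen: 0.0.0.0:8009
  connect_address: 127.0.0.1:8009

postgresql:
  listen: 0.0.0.0:5433
  connect_address: 127.0.0.1:5433
  data_dir: /data/pg-2
  config_dir: /data/pg-2
  parameters:
    unix_socket_directories: /tmp
    port: 5433
```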
It looks, and is, mostly similar to the first file. There are only a few
changes:
name: This has to be different, and this one we’re naming as “pg-2”.
restapi: The ports are different, since we’re running two Patroni instances
on the same node. This is typically not the case.
postgresql: The ports have been changed from 5432 to 5433. The data
directory is also now “/data/pg-2”. In the “parameters” section also, the port
is specified as 5433.
These configuration files support many more options; you can read about them
here. We’re only using
a minimal set to bring up a working cluster.
Starting the Cluster
Note that we don’t have any actual databases yet. We’re going to let Patroni
invoke initdb for us, using the parameters from the configuration files.
We already have etcd running. Let’s also make sure any default PostgreSQL
service is stopped first:
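(The service name below is the one Debian uses; it may differ elsewhere.)

```
# Stop the distro-packaged PostgreSQL service.
sudo systemctl stop postgresql
```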
Now we can start the first node, by calling the “patroni” script with the path
to the first configuration file:
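Assuming pg-1.yml is in the current directory and the user running Patroni can write to /data:

```
# Runs in the foreground and logs to stdout; stop it with ^C.
patroni pg-1.yml
```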
As you can see, it has invoked initdb to create a new database at /data/pg-1,
and started the server. You can connect to the server using psql -h /tmp -U dba
postgres at this point.
Now let’s bring up the standby, by running “patroni” with the second
configuration file, in another terminal:
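As before, assuming the file is in the current directory:

```
# Second terminal: start the standby’s Patroni with its own configuration.
patroni pg-2.yml
```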
Patroni has taken a backup using pg_basebackup, set up streaming replication
and brought up the PostgreSQL server. To verify, you can try connecting to the
standby with psql -h /tmp -p 5433 -U dba postgres. We’d specified “hot_standby”
as “on”, so read-only connections to the standby are possible.
Using Patronictl
Patroni includes a command called “patronictl” which can be used to control the
cluster. Let’s check the status of the cluster:
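One way is to point patronictl at one of the Patroni configuration files (it picks up the etcd address and the cluster scope from there); it then prints a table listing each member, its role and its state:

```
patronictl -c pg-1.yml list
```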
Manual Failover
The patronictl command can be used to do a manual failover. Since pg-1 is the
current master, we want to promote pg-2. Let’s do that with:
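A sketch of the invocation; run interactively, patronictl asks which member to promote and for confirmation before acting:

```
patronictl -c pg-1.yml failover
```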
Looks like the failover happened. Let’s list the nodes again to confirm:
The node pg-2 has now indeed become the master. The Patroni output for pg-2
should contain log lines like:
Automatic Failover
When the current leader, pg-2, goes off the network, becomes unreachable, or
crashes, the Patroni at pg-1 should take over and promote
pg-1. Let’s see if that happens by killing the pg-2 Patroni process with a
^C:
Over in pg-1’s terminal, we can see it promoting itself:
and the node status confirms it:
Onward!
The aim of this post was to get you up and running with a Patroni cluster on
which you can start experimenting. Patroni can do much more than what we just
did with it.
You can read more about Patroni in the docs,
or go through the source here. The
original Compose article here
also has some information about how the solution works.
Monitoring PostgreSQL With OpsDash
With our own product,
OpsDash, you can quickly start monitoring your
PostgreSQL servers, and get instant insight into key performance and health
metrics including replication stats.
Here’s a default, pre-configured dashboard for PostgreSQL.
OpsDash strives to save you the tedious work of setting up a useful dashboard.
The metrics you see here were carefully chosen to represent the most relevant
health and performance indicators for a typical PostgreSQL instance.
OpsDash understands the streaming replication
feature of PostgreSQL and displays per-slave replication status on the master:
The replication lag (as a length of time) is also visible on a slave’s dashboard.
In the graph below (part of the dashboard of a slave), we can see that the slave
could not catch up to the master’s changes for a while.
Sign up for a free 14-day trial of OpsDash SaaS
today!