Postgres 10 Logical Replication
See also https://blog.2ndquadrant.com/logical-replication-postgresql-10/ although I'm not sure how long that will hang around, the internet being what it is.
These are my notes on getting logical replication going in postgres between a "master" and a "slave" node. Both systems should have port 5432 (the default postgres port) open to each other. On CentOS 7, this required the following commands as root :
# firewall-cmd --zone=public --add-port=5432/tcp --permanent # firewall-cmd --reload
And do this to be sure port 5432 is listed as being open :
# firewall-cmd --list-ports
There may be other firewall issues to address depending on your setup.
PART 1 : THINGS TO DO ON BOTH NODES
Install PG 10 or greater, so as root :
# yum install postgresql10-server postgresql10-contrib postgresql10-devel
Strictly speaking you don't need postgresql10-devel, but I use it for other stuff so I grab it.
Then init the database and make postgres run by default :
# /usr/pgsql-10/bin/postgresql-10-setup initdb # systemctl start postgresql-10 # systemctl enable postgresql-10
Users using postgres 10 would be wise to add the PG10 binary directory to the start of their path, so add this to $HOME/.bashrc :
PATH=/usr/pgsql-10/bin:$PATH; export PATH
You should then be able to become the linux user 'postgres' :
# su - postgres
And run 'psql', set the superuser password and exit the database :
postgres=# ALTER USER postgres WITH PASSWORD 'bigBadTruck'; postgres=# \q
Then as 'postgres' edit the file $HOME/10/data/postgresql.conf and change the line :
#wal_level = replica
To :
wal_level = logical
Still as postgres, edit the pg_hba.conf file. It will be named something like 10/data/pg_hba.conf. If the IP addresses of the two machines are 100.37.24.123 and 100.37.24.122, I set it up to look like this on both machines :
# TYPE DATABASE USER ADDRESS METHOD # "local" is for Unix domain socket connections only local all all md5 local all all peer # IPv4 local connections: host all all 127.0.0.1/32 ident host all all 127.0.0.1/32 md5 host all all localhost md5 host all all localhost ident host bladerunner deckard 100.37.24.123/32 md5 host bladerunner deckard 100.37.24.122/32 md5 host bladerunner postgres 100.37.24.123/32 md5 host bladerunner postgres 100.37.24.122/32 md5 # IPv6 local connections: host all all ::1/128 ident # Allow replication connections from localhost, by a user with the # replication privilege. local replication all peer hostnossl replication all 100.37.24.122/32 md5 hostnossl replication all 100.37.24.123/32 md5 host replication all 127.0.0.1/32 ident host replication all ::1/128 ident
'deckard' and 'bladerunner' are a user and a database we'll set up in a bit.
That is almost certainly overkill as it allows full communication both ways, but that may be good for testing here. It is probably worth looking into using 'host' rather than 'hostnossl', too, which I *think* needs encryption.
And then restart the database with :
/usr/pgsql-10/bin/pg_ctl restart -D /var/lib/pgsql/10/data
There will likely be more database restarts required, it may be worth putting that restart command in a script in the home directory for user postgres.
You should then be able to revert to being logged in as whoever you are (the default for me being 'noien') and edit the file $HOME/.pgpass to look like :
$ cat $HOME/.pgpass localhost:*:*:postgres:bigBadTruck
Set protections on that file appropriately :
$ chmod 600 $HOME/.pgpass
You should then be able, as yourself, to access the database with :
$ psql -Upostgres
Check that you can, and once at the database prompt, create a user with replication :
CREATE USER deckard WITH PASSWORD 'origami'; CREATE DATABASE bladerunner; GRANT ALL PRIVILEGES ON DATABASE bladerunner TO deckard; ALTER USER deckard WITH REPLICATION;
Exit the database prompt and add this user to $HOME/.pgpass so that the file looks like :
$ cat $HOME/.pgpass localhost:*:*:postgres:bigBadTruck localhost:*:*:deckard:origami
You should then be able to get to the database prompt as user deckard using database bladerunner :
$ psql -Udeckard bladerunner
At that prompt, create a simple table :
DROP TABLE IF EXISTS expireTimes; CREATE TABLE expireTimes ( retirement VARCHAR(255) );
You need to do this on both master and slave nodes, replication will not make the table for you on the slave node.
Then check connectivity between the two machines. On both machines, add the users for the database on the other machine to the $HOME/.pgpass file, like so :
$ cat $HOME/.pgpass localhost:*:*:postgres:bigBadTruck localhost:*:*:deckard:origami the.other.machine.name:5432:bladerunner:deckard:origami the.other.machine.name:5432:*:postgres:bigBadTruck
You should then, on both machines, be able to access the other machine both as the postgres user and the deckard user, using the bladerunner database, like so :
psql -h the.other.machine.name -U postgres bladerunner psql -h the.other.machine.name -U deckard bladerunner
Making sure those commands work at the linux prompt ensures that the connectivity between the databases on the two machines is OK.
So, you now have postgres 10 running on two systems. Both of them have a user named deckard, with a database named bladerunner, and in that database is an empty table named expireTimes. Connectivity between the two machines is OK.
PART 2 : SETUP ON THE MASTER NODE
In order to get some data flowing through the expireTimes table on the master node, write a script called "doInsert.sh" that looks like this :
#!/bin/bash t=`date +"%Y/%m/%d %H:%M:%S %B %A %d"` echo INSERT INTO expireTimes VALUES \(\'$t\'\)\; | psql -Udeckard bladerunner exit 0
Then set executable status on that file :
$ chmod +x doInsert.sh
And similarly write a script called "doDelete.sh" that looks like :
#!/bin/bash sleep 30; t=`date --date="1 hour ago" +"%Y/%m/%d %H:%M:%S %B %A %d"` echo DELETE FROM expireTimes WHERE retirement \< \'$t\'\; | psql -Udeckard bladerunner exit 0
And again set executable status on the file. The first script will insert the current time (in a somewhat screwball format) into the database, the second script will delete times that are older than an hour.
Run them both in cron, running the insert one every minute and the delete one every ten minutes, like so :
* * * * * /home/noien/replication/setup/doInsert.sh &> /home/noien/replication/setup/doInsert.log */10 * * * * /home/noien/replication/setup/doDelete.sh &> /home/noien/replication/setup/doDelete.log
Check that the times are going into the table :
$ psql -Udeckard bladerunner
psql (10.3)
Type "help" for help.
bladerunner=> select retirement from expireTimes order by retirement;
             retirement              
-------------------------------------
 2018/04/13 14:51:01 April Friday 13
 2018/04/13 14:52:01 April Friday 13
 2018/04/13 14:53:01 April Friday 13
 2018/04/13 14:54:01 April Friday 13
 2018/04/13 14:55:01 April Friday 13
After a while, there should always be between 60 and 70 entries in the table :
bladerunner=> select count(*) from expireTimes;
 count 
-------
    60
So, this is the little test table that we are going to replicate on the slave node. We have to "publish" this table on the master node.a Still in the database :
bladerunner=> CREATE PUBLICATION harrison FOR TABLE expireTimes;
Interestingly, after publishing the delete script will not be able to delete entries from the table. Attempts to delete will result in an error :
ERROR: cannot delete from table "expiretimes" because it does not have a replica identity and publishes deletes HINT: To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.
So to follow that hint, do the following :
bladerunner=> ALTER TABLE expireTimes REPLICA IDENTITY FULL;
After which deletes can happen again. I'm not sure that FULL is the optimal setting for all cases, more reading may be required.
We should now be able to see the publication on the master node :
bladerunner=> \dRp+
                Publication harrison
  Owner  | All tables | Inserts | Updates | Deletes 
---------+------------+---------+---------+---------
 deckard | f          | t       | t       | t
Tables:
    "public.expiretimes"
PART 3 : SETUP ON THE SLAVE NODE
Subscriptions must be done by the database super user, postgres. So log in to the slave machine and go to the bladerunner database as postgres with :
$ psql -Upostgres bladerunner
Then create a subscription that connects to the publication we have on the master machine :
bladerunner=# CREATE SUBSCRIPTION ford CONNECTION 'host=the.master.machine.name dbname=bladerunner user=deckard password=origami' PUBLICATION harrison;
Note that you do need to specify the password, the .pgpass file does not affect the CREATE SUBSCRIPTION command. I'm not sure what would happen if the password on the master node were to change.
You should then be able to see the subscription :
bladerunner=# \dRs+
                                                          List of subscriptions
 Name |  Owner   | Enabled | Publication | Synchronous commit |                                 Conninfo                                  
------+----------+---------+-------------+--------------------+---------------------------------------------------------------------------
 ford | postgres | t       | {harrison}  | off                | host=the.master.machine.name dbname=bladerunner user=deckard password=origami
(1 row)
And monitor it in more depth with the pg_stat_subscription table :
bladerunner=# SELECT * FROM pg_stat_subscription; subid | subname | pid | relid | received_lsn | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time -------+---------+-------+-------+--------------+-------------------------------+-------------------------------+----------------+------------------------------- 16413 | ford | 22102 | | 0/199F168 | 2018-04-13 16:57:05.488469-06 | 2018-04-13 16:56:43.719121-06 | 0/199F168 | 2018-04-13 16:57:05.488469-06
Similarly on the master machine you should be able to access the database (this time there's no need to be the postgres user) with :
$ psql -Udeckard bladerunner
And then monitor the other end by looking at the pg_stat_replication table :
bladerunner=> select pid, usesysid, usename, application_name, client_addr, client_hostname, client_port, backend_start from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start -------+----------+---------+------------------+---------------+--------------------------+-------------+------------------------------- 31018 | 16384 | deckard | ford | 100.37.24.123 | the.slave.machine.name | 32826 | 2018-04-13 16:38:52.349168-06
And the two tables on the two machines should then be mirror images of each other, with updates on the master popping up on the slave (also sometimes called the 'standby') machine.
NOTE : If a new table is added to the publication on the master, the slave subscription will not see them by default. In that case you would have to do this on the slave node :
bladerunner=# ALTER SUBSCRIPTION ford REFRESH PUBLICATION;
PART 4 : TRIGGERS
If we want to keep track of what was inserted when into our new, replicated table, we might reasonably think that the following trigger setup would do it :
-- Make a table to hold what was inserted into the expireTimes table, and when it was inserted DROP TABLE insertTimes; CREATE TABLE insertTimes( insertTime TIMESTAMP, insertValue VARCHAR(255)); -- Make a function that will update the insertTimes table CREATE OR REPLACE FUNCTION insert_trigger_fc() returns TRIGGER as $insert_trigger$ BEGIN INSERT INTO insertTimes VALUES (LOCALTIMESTAMP, NEW.RETIREMENT); DELETE FROM insertTimes WHERE insertTime < LOCALTIMESTAMP - INTERVAL '1 HOUR'; RETURN NEW; END $insert_trigger$ LANGUAGE plpgsql; -- Make a trigger that will call the function when expireTimes is updated DROP TRIGGER IF EXISTS insert_trigger ON expireTimes; CREATE TRIGGER insert_trigger AFTER INSERT OR UPDATE ON expireTimes FOR EACH ROW EXECUTE PROCEDURE insert_trigger_fc();
And that does work on the master node :
bladerunner=> SELECT * FROM  insertTimes;
         inserttime         |             insertvalue             
----------------------------+-------------------------------------
 2018-04-16 11:25:01.16447  | 2018/04/16 11:25:01 April Monday 16
 2018-04-16 11:26:01.849943 | 2018/04/16 11:26:01 April Monday 16
However on the slave node the trigger will not fire by default. You have to either :
ALTER TABLE expireTimes ENABLE REPLICA TRIGGER insert_trigger;
or (and I prefer this, because if I have a trigger I pretty much want it to fire) :
ALTER TABLE expireTimes ENABLE ALWAYS TRIGGER insert_trigger;
Niles Oien April 2018.

