=== Postgres 10 Logical Replication ===

See also 
https://blog.2ndquadrant.com/logical-replication-postgresql-10/
although I'm not sure how long that will hang around, the internet being what it is.

These are my notes on getting logical replication going in postgres
between a "master" and a "slave" node. Both systems should have port 5432
(the default postgres port) open to each other. On CentOS 7, this
required the following commands as root :

{{{
# firewall-cmd --zone=public --add-port=5432/tcp --permanent
# firewall-cmd --reload
}}}

And do this to be sure port 5432 is listed as being open :
{{{
# firewall-cmd --list-ports
}}}

There may be other firewall issues to address depending on your setup.
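
If you want to script that check, here is a minimal sketch. The function name port_listed is made up for these notes; it only looks at text in the space-separated port/protocol format that firewall-cmd --list-ports prints, and it does not handle port ranges such as 5000-6000/tcp.

```shell
# port_listed PORTS PORT: succeed if PORT/tcp appears in a space-separated
# list such as the one printed by `firewall-cmd --list-ports`.
port_listed() {
    case " $1 " in
        *" $2/tcp "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

As root it could be used like this : port_listed "$(firewall-cmd --list-ports)" 5432 || echo "5432 is not open"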

=== PART 1 : THINGS TO DO ON BOTH NODES ===

Install PG 10 or greater, so as root :

{{{
# yum install postgresql10-server postgresql10-contrib postgresql10-devel
}}}

Strictly speaking you don't need postgresql10-devel, but I use it for other stuff so
I grab it.

Then init the database and make postgres run by default :

{{{
# /usr/pgsql-10/bin/postgresql-10-setup initdb
# systemctl start postgresql-10
# systemctl enable postgresql-10
}}}

Users using postgres 10 would be wise to add the PG10 binary directory
to the start of their path, so add this to $HOME/.bashrc :

{{{
PATH=/usr/pgsql-10/bin:$PATH; export PATH
}}}

You should then be able to become the linux user 'postgres' :

{{{
# su - postgres
}}}

And run 'psql', set the superuser password and exit the database :

{{{
postgres=# ALTER USER postgres WITH PASSWORD 'bigBadTruck';
postgres=# \q
}}}

Then as 'postgres' edit the file $HOME/10/data/postgresql.conf and change the line :
{{{
#wal_level = replica
}}}
To :
{{{
wal_level = logical
}}}
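
If you would rather not edit the file by hand, something like the following should do the same job. This is only a sketch: the function name set_wal_level_logical is my own invention, and it replaces the whole wal_level line, so any trailing comment on that line is lost.

```shell
# set_wal_level_logical FILE: rewrite the wal_level line (commented out or
# not) in a postgresql.conf-style file to "wal_level = logical".
set_wal_level_logical() {
    tmp="$1.tmp.$$"
    # Write the edited copy to a temp file, then copy it back over the
    # original so the original's owner and permissions are kept.
    sed 's/^[[:space:]]*#\{0,1\}[[:space:]]*wal_level[[:space:]]*=.*$/wal_level = logical/' "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

As postgres that would be run as : set_wal_level_logical $HOME/10/data/postgresql.conf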

Still as postgres, edit the pg_hba.conf file. It will be named something like
10/data/pg_hba.conf. If the IP addresses of the two machines are 100.37.24.123
and 100.37.24.122, I set it up to look like this on both machines :

{{{
# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     md5
local   all             all                                     peer
# IPv4 local connections:
host    all             all             127.0.0.1/32            ident
host    all             all             127.0.0.1/32            md5
host    all             all             localhost               md5
host    all             all             localhost               ident
host    bladerunner     deckard         100.37.24.123/32         md5
host    bladerunner     deckard         100.37.24.122/32         md5
host    bladerunner     postgres        100.37.24.123/32         md5
host    bladerunner     postgres        100.37.24.122/32         md5
# IPv6 local connections:
host    all             all             ::1/128                 ident
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     peer
hostnossl    replication     all             100.37.24.122/32         md5
hostnossl    replication     all             100.37.24.123/32         md5
host    replication     all             127.0.0.1/32            ident
host    replication     all             ::1/128                 ident
}}}

'deckard' and 'bladerunner' are a user and a database we'll set up in a bit.

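That is almost certainly overkill as it allows full communication
both ways, but that may be good for testing here. Note that postgres uses
the first line that matches a connection, so where two lines above differ only
in METHOD, the second one is never reached. Also note that 'hostnossl' matches
only connections that do *not* use SSL. 'host' matches both SSL and non-SSL
connections, and 'hostssl' is the one that requires encryption, so that is
probably the one worth looking into.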

And then restart the database with :
{{{
/usr/pgsql-10/bin/pg_ctl restart -D /var/lib/pgsql/10/data
}}}

There will likely be more database restarts required, so it may be worth putting that restart
command in a script in the home directory for user postgres.
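
A possible shape for such a script is sketched below. The function name pg_control and the PG_CTL variable are made up here; the paths are the ones used in these notes and can be overridden from the environment. I have allowed 'reload' and 'status' as well as 'restart', on the assumption that they will be handy (a reload is enough for pg_hba.conf changes, whereas changing wal_level needs a restart).

```shell
# Paths from the notes above; override via the environment if yours differ.
PG_CTL=${PG_CTL:-/usr/pgsql-10/bin/pg_ctl}
PGDATA=${PGDATA:-/var/lib/pgsql/10/data}

# pg_control ACTION: run pg_ctl with one of a few allowed actions.
pg_control() {
    case "$1" in
        restart|reload|status) "$PG_CTL" "$1" -D "$PGDATA" ;;
        *) echo "usage: pg_control restart|reload|status" >&2; return 2 ;;
    esac
}
```

Put that in a file, add a last line reading pg_control "$1", and run it as postgres with the action as the argument.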

You should then be able to revert to being logged in as whoever you are (the default
for me being 'noien') and edit the file $HOME/.pgpass to look like :

{{{
$ cat $HOME/.pgpass
localhost:*:*:postgres:bigBadTruck
}}}

Set protections on that file appropriately :
{{{
$ chmod 600 $HOME/.pgpass
}}}
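
The .pgpass file gets edited several times in these notes, so here is a sketch of a helper that adds a line only if it is not already there and sets the protections in the same step. The name add_pgpass_entry is hypothetical. It does no escaping, so it assumes none of the fields contain a colon or a backslash (those would need a backslash in front of them in a real .pgpass file).

```shell
# add_pgpass_entry FILE HOST PORT DB USER PASSWORD
# Append host:port:db:user:password to FILE unless that exact line is
# already there, and restrict the file to its owner.
add_pgpass_entry() {
    entry="$2:$3:$4:$5:$6"
    touch "$1" && chmod 600 "$1" || return 1
    # -x matches whole lines only, -F treats the entry as a fixed string
    # so the '*' wildcards are not interpreted as a pattern.
    grep -qxF -e "$entry" "$1" || printf '%s\n' "$entry" >> "$1"
}
```

For example : add_pgpass_entry $HOME/.pgpass localhost '*' '*' postgres bigBadTruck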

You should then be able, as yourself, to access the database with :
{{{
$ psql -Upostgres
}}}
Check that you can, and once at the database prompt, create a user with replication :

{{{
CREATE USER deckard WITH PASSWORD 'origami';
CREATE DATABASE bladerunner;
GRANT ALL PRIVILEGES ON DATABASE bladerunner TO deckard;
ALTER USER deckard WITH REPLICATION;
}}}

Exit the database prompt and add this user to $HOME/.pgpass so that the file looks like :

{{{
$ cat $HOME/.pgpass
localhost:*:*:postgres:bigBadTruck
localhost:*:*:deckard:origami
}}}

You should then be able to get to the database prompt as user deckard using database bladerunner :
{{{ 
$ psql -Udeckard bladerunner 
}}}

At that prompt, create a simple table :

{{{
DROP TABLE IF EXISTS expireTimes;
CREATE TABLE expireTimes ( retirement VARCHAR(255) );
}}}

You need to do this on both master and slave nodes; replication will not make the
table for you on the slave node.

Then check connectivity between the two machines. On both machines, add the
users for the database on the other machine to the $HOME/.pgpass file, like so :

{{{
$ cat $HOME/.pgpass
localhost:*:*:postgres:bigBadTruck
localhost:*:*:deckard:origami
the.other.machine.name:5432:bladerunner:deckard:origami
the.other.machine.name:5432:*:postgres:bigBadTruck
}}}

You should then, on both machines, be able to access the other machine both as the
postgres user and the deckard user, using the bladerunner database, like so : 

{{{
psql -h the.other.machine.name -U postgres bladerunner
psql -h the.other.machine.name -U deckard bladerunner
}}}

Making sure those commands work at the linux prompt ensures that the connectivity 
between the databases on the two machines is OK.
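
The same check can be done non-interactively, which is handy when repeating it from both machines. This is a sketch only: the function names can_connect and check_peer and the PSQL variable are made up, and it assumes the .pgpass entries above are in place (the -w flag makes psql fail rather than prompt if they are not).

```shell
PSQL=${PSQL:-psql}

# can_connect HOST USER DB: succeed if a trivial query runs on HOST.
can_connect() {
    "$PSQL" -w -h "$1" -U "$2" -d "$3" -Atc 'SELECT 1' >/dev/null 2>&1
}

# check_peer HOST: try both users from these notes against bladerunner.
check_peer() {
    for user in postgres deckard; do
        if can_connect "$1" "$user" bladerunner; then
            echo "$user@$1: ok"
        else
            echo "$user@$1: FAILED"
        fi
    done
}
```

Run it as : check_peer the.other.machine.name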

So, you now have postgres 10 running on two systems. Both of them have a user named
deckard, with a database named bladerunner, and in that database is an empty table named
expireTimes. Connectivity between the two machines is OK.



=== PART 2 : SETUP ON THE MASTER NODE ===

In order to get some data flowing through the expireTimes table on the master node, write
a script called "doInsert.sh" that looks like this :

{{{
#!/bin/bash
t=$(date +"%Y/%m/%d %H:%M:%S %B %A %d")
echo "INSERT INTO expireTimes VALUES ('$t');" | psql -Udeckard bladerunner
exit 0
}}}

Then set executable status on that file :
{{{ 
$ chmod +x doInsert.sh 
}}}

And similarly write a script called "doDelete.sh" that looks like :

{{{
#!/bin/bash
sleep 30;
t=$(date --date="1 hour ago" +"%Y/%m/%d %H:%M:%S %B %A %d")
echo "DELETE FROM expireTimes WHERE retirement < '$t';" | psql -Udeckard bladerunner
exit 0
}}}

And again set executable status on the file. The first script will insert the current
time (in a somewhat screwball format) into the database, the second script will delete 
times that are older than an hour.
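
Both scripts build an SQL statement by pasting a string between single quotes. That is fine for the output of date, but as a sketch of a slightly more careful way to do it, the helpers below double any single quote inside the value, which is how SQL string literals escape a quote. The function names sql_quote, insert_sql and delete_sql are my own and are not part of postgres.

```shell
# sql_quote STRING: wrap STRING in single quotes, doubling embedded quotes.
sql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# insert_sql VALUE: statement that inserts VALUE into expireTimes.
insert_sql() {
    printf 'INSERT INTO expireTimes VALUES (%s);\n' "$(sql_quote "$1")"
}

# delete_sql CUTOFF: statement that deletes rows that sort before CUTOFF.
delete_sql() {
    printf 'DELETE FROM expireTimes WHERE retirement < %s;\n' "$(sql_quote "$1")"
}
```

The output is meant to be piped into psql in the same way as in the two scripts, for instance : insert_sql "$(date +'%Y/%m/%d %H:%M:%S %B %A %d')" | psql -Udeckard bladerunner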

Run them both in cron, running the insert one every minute and the delete one every
ten minutes, like so :

{{{
* * * * * /home/noien/replication/setup/doInsert.sh > /home/noien/replication/setup/doInsert.log 2>&1
*/10 * * * * /home/noien/replication/setup/doDelete.sh > /home/noien/replication/setup/doDelete.log 2>&1
}}}

Check that the times are going into the table :

{{{
$ psql -Udeckard bladerunner
psql (10.3)
Type "help" for help.

bladerunner=> select retirement from expireTimes order by retirement;
             retirement              
-------------------------------------
 2018/04/13 14:51:01 April Friday 13
 2018/04/13 14:52:01 April Friday 13
 2018/04/13 14:53:01 April Friday 13
 2018/04/13 14:54:01 April Friday 13
 2018/04/13 14:55:01 April Friday 13
}}}

After a while, there should always be between 60 and 70 entries in the table :

{{{
bladerunner=> select count(*) from expireTimes;
 count 
-------
    60
}}}

So, this is the little test table that we are going to replicate on the slave node.
We have to "publish" this table on the master node. Still in the database :

{{{ 
bladerunner=> CREATE PUBLICATION harrison FOR TABLE expireTimes; 
}}}


Interestingly, after publishing, the delete script will not be able to delete entries
from the table. Attempts to delete will result in an error :

{{{
ERROR:  cannot delete from table "expiretimes" because it does not have a replica identity and publishes deletes
HINT:  To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.
}}}

So to follow that hint, do the following :

{{{ 
bladerunner=> ALTER TABLE expireTimes REPLICA IDENTITY FULL; 
}}}

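After which deletes can happen again. FULL works for any table, but it means the whole
old row is logged for every update and delete. If the table had a primary key, the default
replica identity would use that and this ALTER TABLE would not be needed. I'm not sure that
FULL is the optimal setting for all cases, so more reading may be required.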

We should now be able to see the publication on the master node :

{{{
bladerunner=> \dRp+
                Publication harrison
  Owner  | All tables | Inserts | Updates | Deletes 
---------+------------+---------+---------+---------
 deckard | f          | t       | t       | t
Tables:
    "public.expiretimes"
}}}


=== PART 3 : SETUP ON THE SLAVE NODE ===


Subscriptions must be created by the database superuser, postgres. So log in to the slave
machine and go to the bladerunner database as postgres with :

{{{
 $ psql -Upostgres bladerunner 
}}}

Then create a subscription that connects to the publication we have on the master machine :

{{{
 bladerunner=# CREATE SUBSCRIPTION ford CONNECTION 'host=the.master.machine.name dbname=bladerunner user=deckard password=origami' PUBLICATION harrison; 
}}}

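Note that you do need to specify the password. The connection to the master is made by the postgres
server on the slave, not by your psql session, so the $HOME/.pgpass file set up above does not affect the
CREATE SUBSCRIPTION command. If the password on the master node were to change, I would expect the
subscription to fail the next time it reconnects, until the connection string is updated with
ALTER SUBSCRIPTION ford CONNECTION '...' on the slave. I have not tested that.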

You should then be able to see the subscription :

{{{
bladerunner=# \dRs+
                                                          List of subscriptions
 Name |  Owner   | Enabled | Publication | Synchronous commit |                                 Conninfo                                  
------+----------+---------+-------------+--------------------+---------------------------------------------------------------------------
 ford | postgres | t       | {harrison}  | off                | host=the.master.machine.name dbname=bladerunner user=deckard password=origami
(1 row)
}}}

And monitor it in more depth with the pg_stat_subscription view :

{{{
bladerunner=# SELECT * FROM pg_stat_subscription;
 subid | subname |  pid  | relid | received_lsn |      last_msg_send_time       |     last_msg_receipt_time     | latest_end_lsn |        latest_end_time        
-------+---------+-------+-------+--------------+-------------------------------+-------------------------------+----------------+-------------------------------
 16413 | ford    | 22102 |       | 0/199F168    | 2018-04-13 16:57:05.488469-06 | 2018-04-13 16:56:43.719121-06 | 0/199F168      | 2018-04-13 16:57:05.488469-06
}}}

Similarly on the master machine you should be able to access the database (this time there's no
need to be the postgres user) with :

{{{
 $ psql -Udeckard bladerunner 
}}}

And then monitor the other end by looking at the pg_stat_replication view :

{{{
bladerunner=> select pid, usesysid, usename, application_name, client_addr, client_hostname, client_port, backend_start from pg_stat_replication;
  pid  | usesysid | usename | application_name | client_addr   |     client_hostname      | client_port |         backend_start         
-------+----------+---------+------------------+---------------+--------------------------+-------------+-------------------------------
 31018 |    16384 | deckard | ford             | 100.37.24.123 | the.slave.machine.name   |       32826 | 2018-04-13 16:38:52.349168-06
}}}

And the two tables on the two machines should then be mirror images of each other, with updates on the master
popping up on the slave (also sometimes called the 'standby') machine.
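
A quick way to convince yourself the two tables agree is to compare row counts from a machine that can reach both. The sketch below assumes the deckard entries for both hosts are in $HOME/.pgpass. The names row_count, counts_match and PSQL are made up for illustration. Replication is asynchronous, so the counts can legitimately differ for a moment while a change is in flight.

```shell
PSQL=${PSQL:-psql}

# row_count HOST: print the number of rows in expireTimes on HOST.
row_count() {
    "$PSQL" -w -h "$1" -U deckard -d bladerunner -Atc 'SELECT count(*) FROM expireTimes'
}

# counts_match MASTER SLAVE: print both counts, succeed if they are equal.
# Returns 2 if either host could not be queried.
counts_match() {
    m=$(row_count "$1") || return 2
    s=$(row_count "$2") || return 2
    echo "master=$m slave=$s"
    [ "$m" = "$s" ]
}
```

For example : counts_match the.master.machine.name the.slave.machine.name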

NOTE : If a new table is added to the publication on the master, the slave subscription will not see it
by default. In that case you would have to do this on the slave node :

{{{
 bladerunner=# ALTER SUBSCRIPTION ford REFRESH PUBLICATION; 
}}}
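
To spell out the whole sequence for adding a table, here is a sketch that just prints the two statements involved. The function names are made up, and 'newTable' in the usage line is a hypothetical table name. As with expireTimes, the table has to be created by hand on the slave before the refresh, since replication will not make it for you.

```shell
# publish_table_sql PUBLICATION TABLE: statement to run on the master.
publish_table_sql() {
    printf 'ALTER PUBLICATION %s ADD TABLE %s;\n' "$1" "$2"
}

# refresh_sql SUBSCRIPTION: statement to run on the slave afterwards.
refresh_sql() {
    printf 'ALTER SUBSCRIPTION %s REFRESH PUBLICATION;\n' "$1"
}
```

On the master : publish_table_sql harrison newTable | psql -Udeckard bladerunner
Then on the slave : refresh_sql ford | psql -Upostgres bladerunner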

=== PART 4 : TRIGGERS ===

If we want to keep track of what was inserted when into our new, replicated table, we might reasonably think that the following
trigger setup would do it :

{{{
-- Make a table to hold what was inserted into the expireTimes table, and when it was inserted
DROP TABLE IF EXISTS insertTimes;
CREATE TABLE insertTimes( insertTime TIMESTAMP, insertValue VARCHAR(255));

-- Make a function that will update the insertTimes table
CREATE OR REPLACE FUNCTION insert_trigger_fc() returns TRIGGER as $insert_trigger$
 BEGIN
 INSERT INTO insertTimes VALUES (LOCALTIMESTAMP, NEW.RETIREMENT);
 DELETE FROM insertTimes WHERE insertTime <  LOCALTIMESTAMP - INTERVAL '1 HOUR';
 RETURN NEW;
END
$insert_trigger$ LANGUAGE plpgsql;

-- Make a trigger that will call the function when a row in expireTimes is inserted or updated
DROP TRIGGER IF EXISTS insert_trigger ON expireTimes;
CREATE TRIGGER insert_trigger AFTER INSERT OR UPDATE ON expireTimes
 FOR EACH ROW EXECUTE PROCEDURE insert_trigger_fc();
}}}

And that does work on the master node :

{{{
bladerunner=> SELECT * FROM  insertTimes;
         inserttime         |             insertvalue             
----------------------------+-------------------------------------
 2018-04-16 11:25:01.16447  | 2018/04/16 11:25:01 April Monday 16
 2018-04-16 11:26:01.849943 | 2018/04/16 11:26:01 April Monday 16
}}}

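However, on the slave node the trigger will not fire by default, because changes applied by
replication run with session_replication_role set to 'replica'. The insertTimes table, the function
and the trigger also have to be created on the slave, since replication does not copy them across.
Once they exist there, you have to either :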

{{{
ALTER TABLE expireTimes ENABLE REPLICA TRIGGER insert_trigger;
}}}
or (and I prefer this, because if I have a trigger I pretty much want it to fire) :
{{{
ALTER TABLE expireTimes ENABLE ALWAYS TRIGGER insert_trigger;
}}}
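
To summarize which setting fires when, here is a small lookup written as a shell function. It is only a sketch of my reading of the rules: the function name trigger_fires is made up, and the letters are the values of the tgenabled column in pg_trigger (O for a plain enabled trigger, R for ENABLE REPLICA, A for ENABLE ALWAYS, D for disabled).

```shell
# trigger_fires TGENABLED ROLE: print yes/no for whether a trigger whose
# pg_trigger.tgenabled value is TGENABLED fires when
# session_replication_role is ROLE (origin, local or replica).
trigger_fires() {
    case "$1:$2" in
        A:*)              echo yes ;;  # ENABLE ALWAYS
        O:origin|O:local) echo yes ;;  # plain ENABLE (the default)
        R:replica)        echo yes ;;  # ENABLE REPLICA
        *)                echo no  ;;  # disabled, or the other role
    esac
}
```

So trigger_fires O replica prints no, which is the default case on the slave. To see the current setting for a table, run SELECT tgname, tgenabled FROM pg_trigger WHERE tgrelid = 'expireTimes'::regclass; at the database prompt.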

Niles Oien April 2018.