Changes between Initial Version and Version 1 of replicating


Ignore:
Timestamp:
04/13/18 16:59:07 (16 months ago)
Author:
niles
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • replicating

    v1 v1  
     1 
     2=== Postgres 10 Logical Replication === 
     3 
     4These are my notes on getting logical replication going in postgres 
     5between a "master" and a "slave" node. Both systems should have port 5432 
     6(the default postgres port) open to each other. On CentOS 7, this 
     7required the following commands as root : 
     8 
     9{{{ 
     10# firewall-cmd --zone=public --add-port=5432/tcp --permanent 
     11# firewall-cmd --reload 
     12}}} 
     13 
     14And do this to be sure port 5432 is listed as being open : 
     15{{{ 
     16# firewall-cmd --list-ports 
     17}}} 
     18 
     19There may be other firewall issues to address depending on your setup. 
     20 
     21=== PART 1 : THINGS TO DO ON BOTH NODES === 
     22 
     23Install PG 10 or greater, so as root : 
     24 
     25{{{ 
     26# yum install postgresql10-server postgresql10-contrib postgresql10-devel 
     27}}} 
     28 
     29Strictly speaking you don't need postgresql10-devel, but I use it for other stuff so 
     30I grab it. 
     31 
     32Then init the database and make postgres run by default : 
     33 
     34{{{ 
     35# /usr/pgsql-10/bin/postgresql-10-setup initdb 
     36# systemctl start postgresql-10 
     37# systemctl enable postgresql-10 
     38}}} 
     39 
     40Users using postgres 10 would be wise to add the PG10 binary directory 
     41to the start of their path, so add this to $HOME/.bashrc : 
     42 
     43{{{ 
     44PATH=/usr/pgsql-10/bin:$PATH; export PATH 
     45}}} 
     46 
     47You should then be able to become the linux user 'postgres' : 
     48 
     49{{{ 
     50# su - postgres 
     51}}} 
     52 
     53And run 'psql', set the superuser password and exit the database : 
     54 
     55{{{ 
     56postgres=# ALTER USER postgres WITH PASSWORD 'bigBadTruck'; 
     57postgres=# \q 
     58}}} 
     59 
     60Then as 'postgres' edit the file $HOME/10/data/postgresql.conf and change the line : 
     61{{{ 
     62#wal_level = replica 
     63}}} 
     64To : 
     65{{{ 
     66wal_level = logical 
     67}}} 
     68 
     69Still as postgres, edit the pg_hba.conf file. It will be named something like 
     7010/data/pg_hba.conf. If the IP addresses of the two machines are 100.37.24.123 
     71and 100.37.24.122, I set it up to look like this on both machines : 
     72 
     73{{{ 
     74# TYPE  DATABASE        USER            ADDRESS                 METHOD 
     75 
     76# "local" is for Unix domain socket connections only 
     77local   all             all                                     md5 
     78local   all             all                                     peer 
     79# IPv4 local connections: 
     80host    all             all             127.0.0.1/32            ident 
     81host    all             all             127.0.0.1/32            md5 
     82host    all             all             localhost               md5 
     83host    all             all             localhost               ident 
     84host    bladerunner     deckard         100.37.24.123/32         md5 
     85host    bladerunner     deckard         100.37.24.122/32         md5 
     86host    bladerunner     postgres        100.37.24.123/32         md5 
     87host    bladerunner     postgres        100.37.24.122/32         md5 
     88# IPv6 local connections: 
     89host    all             all             ::1/128                 ident 
     90# Allow replication connections from localhost, by a user with the 
     91# replication privilege. 
     92local   replication     all                                     peer 
     93hostnossl    replication     all             100.37.24.122/32         md5 
     94hostnossl    replication     all             100.37.24.123/32         md5 
     95host    replication     all             127.0.0.1/32            ident 
     96host    replication     all             ::1/128                 ident 
     97}}} 
     98 
     99'deckard' and 'bladerunner' are a user and a database we'll set up in a bit. 
     100 
     101That is almost certainly overkill as it allows full communication 
     102both ways, but that may be good for testing here. It is probably worth 
     103looking into using 'host' rather than 'hostnossl', too, which I *think* needs 
     104encryption. 
     105 
     106And then restart the database with : 
     107{{{ 
     108/usr/pgsql-10/bin/pg_ctl restart -D /var/lib/pgsql/10/data 
     109}}} 
     110 
     111There will likely be more database restarts required, it may be worth putting that restart 
     112command in a script in the home directory for user postgres. 
     113 
     114You should then be able to revert to being logged in as whoever you are (the default 
     115for me being 'noien') and edit the file $HOME/.pgpass to look like : 
     116 
     117{{{ 
     118$ cat $HOME/.pgpass 
     119localhost:*:*:postgres:bigBadTruck 
     120}}} 
     121 
     122Set protections on that file appropriately : 
     123{{{ 
     124$ chmod 600 $HOME/.pgpass 
     125}}} 
     126 
     127You should then be able, as yourself, to access the database with : 
     128{{{ 
     129$ psql -Upostgres 
     130}}} 
     131Check that you can, and once at the database prompt, create a user with replication : 
     132 
     133{{{ 
     134CREATE USER deckard WITH PASSWORD 'origami'; 
     135CREATE DATABASE bladerunner; 
     136GRANT ALL PRIVILEGES ON DATABASE bladerunner TO deckard; 
     137ALTER USER deckard WITH REPLICATION; 
     138}}} 
     139 
     140Exit the database prompt and add this user to $HOME/.pgpass so that the file looks like : 
     141 
     142{{{ 
     143$ cat $HOME/.pgpass 
     144localhost:*:*:postgres:bigBadTruck 
     145localhost:*:*:deckard:origami 
     146}}} 
     147 
     148You should then be able to get to the database prompt as user deckard using database bladerunner : 
     149{{{  
     150$ psql -Udeckard bladerunner  
     151}}} 
     152 
     153At that prompt, create a simple table : 
     154 
     155{{{ 
     156DROP TABLE IF EXISTS expireTimes; 
     157CREATE TABLE expireTimes ( retirement VARCHAR(255) ); 
     158}}} 
     159 
     160You need to do this on both master and slave nodes, replication will not make the 
     161table for you on the slave node. 
     162 
     163Then check connectivity between the two machines. On both machines, add the 
     164users for the database on the other machine to the $HOME/.pgpass file, like so : 
     165 
     166{{{ 
     167$ cat $HOME/.pgpass 
     168localhost:*:*:postgres:bigBadTruck 
     169localhost:*:*:deckard:origami 
     170the.other.machine.name:5432:bladerunner:deckard:origami 
     171the.other.machine.name:5432:*:postgres:bigBadTruck 
     172}}} 
     173 
     174You should then, on both machines, be able to access the other machine both as the 
     175postgres user and the deckard user, using the bladerunner database, like so :  
     176 
     177{{{ 
     178psql -h the.other.machine.name -U postgres bladerunner 
     179psql -h the.other.machine.name -U deckard bladerunner 
     180}}} 
     181 
     182Making sure those commands work at the linux prompt ensures that the connectivity  
     183between the databases on the two machines is OK. 
     184 
     185So, you now have postgres 10 running on two systems. Both of them have a user named 
     186deckard, with a database named bladerunner, and in that database is an empty table named 
     187expireTimes. Connectivity between the two machines is OK. 
     188 
     189 
     190 
     191=== PART 2 : SETUP ON THE MASTER NODE === 
     192 
     193In order to get some data flowing through the expireTimes table on the master node, write 
     194a script called "doInsert.sh" that looks like this : 
     195 
     196{{{ 
     197#!/bin/bash 
     198t=`date +"%Y/%m/%d %H:%M:%S %B %A %d"` 
     199echo INSERT INTO expireTimes VALUES \(\'$t\'\)\; | psql -Udeckard bladerunner 
     200exit 0 
     201}}} 
     202 
     203Then set executable status on that file : 
     204{{{  
     205$ chmod +x doInsert.sh  
     206}}} 
     207 
     208And similarly write a script called "doDelete.sh" that looks like : 
     209 
     210{{{ 
     211#!/bin/bash 
     212sleep 30; 
     213t=`date --date="1 hour ago" +"%Y/%m/%d %H:%M:%S %B %A %d"` 
     214echo DELETE FROM expireTimes WHERE retirement \< \'$t\'\; | psql -Udeckard bladerunner 
     215exit 0 
     216}}} 
     217 
     218And again set executable status on the file. The first script will insert the current 
     219time (in a somewhat screwball format) into the database, the second script will delete  
     220times that are older than an hour. 
     221 
     222Run them both in cron, running the insert one every minute and the delete one every 
     223ten minutes, like so : 
     224 
     225{{{ 
     226* * * * * /home/noien/replication/setup/doInsert.sh &> /home/noien/replication/setup/doInsert.log 
     227*/10 * * * * /home/noien/replication/setup/doDelete.sh &> /home/noien/replication/setup/doDelete.log 
     228}}} 
     229 
     230Check that the times are going into the table : 
     231 
     232{{{ 
     233$ psql -Udeckard bladerunner 
     234psql (10.3) 
     235Type "help" for help. 
     236 
     237bladerunner=> select retirement from expireTimes order by retirement; 
     238             retirement               
     239------------------------------------- 
     240 2018/04/13 14:51:01 April Friday 13 
     241 2018/04/13 14:52:01 April Friday 13 
     242 2018/04/13 14:53:01 April Friday 13 
     243 2018/04/13 14:54:01 April Friday 13 
     244 2018/04/13 14:55:01 April Friday 13 
     245}}} 
     246 
     247After a while, there should always be between 60 and 70 entries in the table : 
     248 
     249{{{ 
     250bladerunner=> select count(*) from expireTimes; 
     251 count  
     252------- 
     253    60 
     254}}} 
     255 
     256So, this is the little test table that we are going to replicate on the slave node. 
     257We have to "publish" this table on the master node.a Still in the database : 
     258 
     259{{{ bladerunner=> CREATE PUBLICATION harrison FOR TABLE expireTimes; }}} 
     260 
     261 
     262Interestingly, after publishing the delete script will not be able to delete entries 
     263from the table. Attempts to delete will result in an error : 
     264 
     265{{{ 
     266ERROR:  cannot delete from table "expiretimes" because it does not have a replica identity and publishes deletes 
     267HINT:  To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE. 
     268}}} 
     269 
     270So to follow that hint, do the following : 
     271 
     272{{{  
     273bladerunner=> ALTER TABLE expireTimes REPLICA IDENTITY FULL;  
     274}}} 
     275 
     276After which deletes can happen again. I'm not sure that FULL is the optimal setting 
     277for all cases, more reading may be required. 
     278 
     279We should now be able to see the publication on the master node : 
     280 
     281{{{ 
     282bladerunner=> \dRp+ 
     283                Publication harrison 
     284  Owner  | All tables | Inserts | Updates | Deletes  
     285---------+------------+---------+---------+--------- 
     286 deckard | f          | t       | t       | t 
     287Tables: 
     288    "public.expiretimes" 
     289}}} 
     290 
     291 
     292=== PART 3 : SETUP ON THE SLAVE NODE === 
     293 
     294 
     295Subscriptions must be done by the database super user, postgres. So log in to the slave 
     296machine and go to the bladerunner database as postgres with : 
     297 
     298{{{ 
     299 $ psql -Upostgres bladerunner  
     300}}} 
     301 
     302Then create a subscription that connects to the publication we have on the master machine : 
     303 
     304{{{ 
     305 bladerunner=# CREATE SUBSCRIPTION ford CONNECTION 'host=the.master.machine.name dbname=bladerunner user=deckard password=origami' PUBLICATION harrison;  
     306}}} 
     307 
     308Note that you do need to specify the password, the .pgpass file does not affect the CREATE SUBSCRIPTION command. 
     309I'm not sure what would happen if the password on the master node were to change. 
     310 
     311You should then be able to see the subscription : 
     312 
     313{{{ 
     314bladerunner=# \dRs+ 
     315                                                          List of subscriptions 
     316 Name |  Owner   | Enabled | Publication | Synchronous commit |                                 Conninfo                                   
     317------+----------+---------+-------------+--------------------+--------------------------------------------------------------------------- 
     318 ford | postgres | t       | {harrison}  | off                | host=the.master.machine.name dbname=bladerunner user=deckard password=origami 
     319(1 row) 
     320}}} 
     321 
     322And monitor it in more depth with the pg_stat_subscription table : 
     323 
     324{{{ 
     325bladerunner=# SELECT * FROM pg_stat_subscription; 
     326 subid | subname |  pid  | relid | received_lsn |      last_msg_send_time       |     last_msg_receipt_time     | latest_end_lsn |        latest_end_time         
     327-------+---------+-------+-------+--------------+-------------------------------+-------------------------------+----------------+------------------------------- 
     328 16413 | ford    | 22102 |       | 0/199F168    | 2018-04-13 16:57:05.488469-06 | 2018-04-13 16:56:43.719121-06 | 0/199F168      | 2018-04-13 16:57:05.488469-06 
     329}}} 
     330 
     331Similarly on the master machine you should be able to access the database (this time there's no 
     332need to be the postgres user) with : 
     333 
     334{{{ 
     335 $ psql -Udeckard bladerunner  
     336}}} 
     337 
     338And then monitor the other end by looking at the pg_stat_replication table : 
     339 
     340{{{ 
     341bladerunner=> select pid, usesysid, usename, application_name, client_addr, client_hostname, client_port, backend_start from pg_stat_replication; 
     342  pid  | usesysid | usename | application_name | client_addr   |     client_hostname      | client_port |         backend_start          
     343-------+----------+---------+------------------+---------------+--------------------------+-------------+------------------------------- 
     344 31018 |    16384 | deckard | ford             | 100.37.24.123 | the.slave.machine.name   |       32826 | 2018-04-13 16:38:52.349168-06 
     345}}} 
     346 
     347And the two tables on the two machines should then be mirror images of each other, with updates on the master 
     348popping up on the slave (also sometimes called the 'standby') machine. 
     349 
     350NOTE : If a new table is added to the publication on the master, the slave subscription will not see them 
     351by default. In that case you would have to do this on the slave node : 
     352 
     353{{{ 
     354 bladerunner=# ALTER SUBSCRIPTION ford REFRESH PUBLICATION;  
     355}}} 
     356 
     357Niles Oien April 2018.