Setting up postgres Replication
I have a need to set up a Postgres Replication
Database. In Postgres 9.0+ they have the ability to have a read only
replicated database that mirrors the live database, it can lag slightly behind
the live database (although in my simple test it was spot on). The Replication DB can be read from but not
written to. In my particular case I just
want to use it as a failover DB that can be brought up as a live DB if ever
needed.
Here is my set up. I
have two Ubuntu 10.04 LTS 64 bit servers with postgresl 9.1 installed. I wrote an article that shows how to install postgres 9.1 on Ubuntu 10.04 (LTS). One of these servers
will be the primary (master) postgres server and the other will be the replication
(slave) postgres server. I have a large
postgres database already north of 100 GiB this is my Primary (master)
server. This is a live server, I can
reboot it, but it cannot be down for an hour or more. This is a test procedure I came up with to
test first before I do it on my live server.
In this procedure I have a copy of my live database and I create a test
database to confirm the replication is working.
Set up a test Database
To do a full test I am going to log into my PRIMARY server
and create a new database. The purpose
of the database is for testing. You can
skip this test if you want to.
Information on how to create a DB can be found at http://www.cyberciti.biz/faq/howto-add-postgresql-user-account/
[2]
Log into the DB
> sudo su postgres
> psql -d postgres -U postgres
|
Create a test user
> CREATE USER patman WITH PASSWORD ‘myPassword’;
|
Create the Database
> CREATE DATBASE testdb;
|
Make sure it was created by running \l
Grant patman privileges to this database
> GRANT ALL PRIVILEGES ON DATABASE testdb to
patman;
|
Now let’s test it quit out of postgres and try
> \q
> psql -d testdb -U patman
|
And I get an error
I am trying to log in as patman but I am logged in as the
postgres user. According to this site http://blog.deliciousrobots.com/2011/12/13/get-postgres-working-on-ubuntu-or-linux-mint/
[3] it’s a setting in /etc/postgresql/9.1/main/pg_hba.conf I failed to edit
edit the file
> sudo vi +90 /etc/postgresql/9.1/main/pg_hba.conf
|
Change
# "local"
is for Unix domain socket connections only
local all all peer
|
To
# "local"
is for Unix domain socket connections only
local all all md5
|
And restart postgresql
> sudo /etc/init.d/postgres restart
|
And then log back in
> psql -d testdb -U patman
|
This time it worked!
Now connect to the database.
> \c testdb
|
Create a simple table that I can send updates to. The purpose of this table is to test the set-up
of the replication server. The live
server should be able to continue getting updates while the replication server
is being set up, then it will “catch up”
> CREATE TABLE data(
> id
serial primary key not null,
> time
timestamp not null default CURRENT_TIMESTAMP,
> number
integer not null
> );
|
This creates a table with a primary key that auto
increments, and a time field which will default to the current_timestamp
Run the following command to confirm the table has been
created and is working as intended.
> \dt
> select * from data;
> INSERT INTO data (number)
> VALUES (34);
> select * from data;
|
OK now that I have set up I need to create a simple program
to insert data into this table on some kind of loop.
Writing a simple python program
I decided to write a program in python to insert to this
database. I found a good how to for
python and postgres at http://zetcode.com/db/postgresqlpythontutorial/
[4], in fact it is very detailed and easy to read, I would recommend it go
check it out.
First I had to install some libraries that python depends on
for talking to postgresql
> sudo apt-get install python-psycopg2
> vi insertDB.py
|
Here is my python code
#!/usr/bin/python
#
#
# Simple Script to insert data
#
import psycopg2
import sys
import time
con = None
try:
# First get the current maximum
con = psycopg2.connect(database='testdb',
user='patman', password='myPassword')
cur = con.cursor()
cur.execute('select MAX(number) from
data')
x = cur.fetchone()[0]
if x is None:
x = 0
while (True):
x = x + 1
cur.execute('INSERT INTO data(number)
VALUES (' + str(x) + ')')
con.commit()
print 'inserting ' + str(x)
time.sleep(1)
except
psycopg2.DatabaseError, e:
print 'Error %s' % e
sys.exit(1)
finally:
if con:
con.close()
|
It will connect, locally to my postgresql data base and
query the data table to find the largest integer in the number field. The program then increments that by one and inserts
the new values (always pausing 1 second and incrementing by 1).
The idea is this is constantly adding data to the database,
so that I can confirm that the replication I create works correctly.
WAL (write ahead log) Files
Postgres uses write ahead logging
“WAL's central concept is that changes to data files (where tables and
indexes reside) must be written only after those changes have been logged, that
is, after log records describing the changes have been flushed to permanent
storage.”
First lets go find the WAL files. In my case I moved my database to another drive
/database/postgresqldata/ this is set in the
/etc/postgresql/9.1/main/postgresql.conf file under the data_directory
field. In 9.1 its typically
/varlib/postgresql/9.1/main/. But in my
case I changed it.
At any rate the WAL files are located in the pg_xlog folder
within the DB folder
> cd /database/postgresqldata/pg_xlog
> ls -alh
|
Here you should see several files that are 16MiB in size
with numbers for names.
These are the WAL files.
In the /etc/postgresql/9.1/main/postgresql.conf file there is a
wal_keep_segments field which is typically set to 0. If it is set to 0 it means that at least 0
WAL files must be kept in the queue.
More typically are here as the DB needed. This setting must be increased to assure
that the replication database has access to them.
I have a large Database and I need to copy it over to my
replication database which may take a day.
This being the case it’s my understanding that I would need a day worth
of WAL files so that the replication DB could catch up.
Having said all that let’s get to configuring the Master
(Main) DB setting for Replication Server.
Primary (Master) DB settings
First lets set up the master DB to be able to allow
replications DBs to get the updates they need.
A few good sites I found some of this information at are http://blog.railsupgrade.com/2011/02/streaming-replication-in-postgresql-91.html
[6] and http://jayant7k.blogspot.com/2011/12/postgresql-replication.html
[7].
postgresql.conf
First we need to edit
/etc/postgresql/9.1/main/postgresql.conf
> sudo vi
/etc/postgresql/9.1/main/postgresql.conf
|
The following needs to be edited (the @line is to show where
the typical line number is for this setting)
@line 61
listen_addresses =
'*' # what IP address(es) to
listen on;
@line 155
wal_level = hot_standby # minimal, archive, or hot_standby
@line 198
max_wal_senders = 5
# max number of
walsender processes
@line 201
wal_keep_segments = 128 # in logfile segments, 16MB each;
|
Here is the “why” for each as best as I have researched.
listen_addresses = '*'
Listen to any incoming connection.
wal_level = hot_standby
According to http://www.postgresql.org/docs/9.1/static/runtime-config-wal.html#GUC-WAL-LEVEL
[8] This must be set to archive or
hot_standy . The default minimal does
not put enough information in the WAL files to truly reconstruct the Database.
max_wal_senders = 5
Sets the number of concurrent streaming backup
connections. I think you could get by
with 1 if you only have 1 failover server, I just set to 5 because all the
other cool kids were doing it.
wal_keep_segments = 128
This will keep at a minimum 128 WAL files. At the default size of 16MiB that totals 2
GiB. This is a lot but I want to make
sure I do not lose anything on the transfer.
pga_hba.conf
Now edit /etc/postgresql/9.1/main/pga_hba.conf
> sudo vi
/etc/postgresql/9.1/main/pg_hba.conf
|
Add this line to the bottom
host replication postgres 192.168.0.250/32 trust
|
This just says to trust the server at 192.168.0.250, which
is my replication server.
Replication Role
I made a mistake when I first did this I did not realize
there is a replication role in the database that must be designated.
When I simply tried to use the postgres user I got the
following error.
2012-04-21 08:26:01 MDT FATAL: could not connect to the primary server:
FATAL: must be replication role to start
walsender
When I hit this error I found a lot of information on it
here http://postgresql.1045698.n5.nabble.com/Streaming-Replication-woes-td4964323.html
[11]
I am using the postgres user as the replication user. To see what the postgres user has run the
following command from the postgres database.
> \du
|
Here you can see that my postgres user does not have the “Replication”
Role.
So I added the role to my postgres user, most sites I found
suggest creating a new user that only handles replication. I decided to just go with reusing the
postgres user.
From the postgres database run this command
> ALTER ROLE postgres WITH REPLICATION;
|
Now run the \du command again
> \du postgres
|
And you should see this
Now it has the Replication role!
Restart the Primary database
From the command line
> sudo /etc/init.d/postgresql restart
|
After it has been restarted I kick off my python
script.
> sudo /etc/init.d/postgresql restart
|
I left it running in its own terminal.
Then I logged into the Database and did a few quick checks
to make sure that data is being constantly loaded into the database.
> psql -d testdb –U patman
|
Then from psql
> select count(*) from data;
|
I ran this a few times to confirm its growing in size.
Copy the Database over to the Replication (Slave) server
My first go at this I screwed it up! So you can learn from my mistake. I thought I could do a simple pg_dumpall from
the primary server to the replication server.
I logged into the Replication server and ran these commands.
> sudo su postgres
> time pg_dumpall -h 192.168.0.240 -U postgres |
psql -d postgres -U postgres
|
(I added the time command to see how long it runs)
And then after setting up the replication server and
restarting it (which I will get into later) the postgres would not start up and
I got this error.
FATAL: hot standby is not possible because wal_level
was not set to "hot_standby" on the master server
But I did set it! So
what is going on?
Well looking around this I found http://postgresql.1045698.n5.nabble.com/Hot-standby-server-does-not-start-td2976362.html
[10]. Turns out you can’t simply use
the pg_dumpall to copy the database if you want it to be a replication database.
So here is the correct way of doing it.
First shut off postgres on the replication server
> sudo su /etc/init.d/postgresql stop
|
Next log into the primary server and run the following
commands
> sudo su postgres
> psql -d postgres -U postgres
|
You are logged into the database
Now tell the database you are going to start a backup. You can still keep using your database as you
normally would, there will be no interruption to incoming data.
> SELECT
pg_start_backup('mybackup_label', true);
|
From the command line copy your data_directory folder from
the primary server to the replication server.
In my case the data_directory is “/database/postgresqldata/” (this is
set in the postgresql.conf file). I kept
the data_directory the same on both primary and replication server.
Here is my command adjust it according to where you
data_directory is located.
> time
scp -r /database/postgresqldata/*
patman@192.168.0.250:/database/postgresqldata/
|
In my particular case this took 6 hours to complete, your
mileage will vary
After the database has been copied over run the following
command from the psql, this will say the backup is done.
> SELECT
pg_stop_backup();
|
I got this error. “NOTICE: WAL archiving is not enabled; you must ensure
that all required WAL segments are copied through other means to complete the
backup”
I did not set up WAL archiving but I did set up wal_keep_segments,
I think this is only a problem if you do not have enough wal segments. I think I am OK. Let’s see…
Replication (Slave) DB server settings
Now that we have backed up the database to the slave/replication
database, we need to change it to an actual replication database.
postgresql.conf
First we need to edit /etc/postgresql/9.1/main/postgresql.conf
> sudo vi
/etc/postgresql/9.1/main/postgresql.conf
|
The following needs to be edited (the @line is to show where
the typical line number is for this setting)
@line 59
listen_addresses =
'*' # what IP address(es) to
listen on;
@line 210
hot_standby =
on #
"on" allows queries during recovery
|
Here is the “why” for each as best as I have researched.
listen_addresses
= '*'
Listen to any incoming connection.
hot_standy = on
Allows you to query, but not update the replication DB
recovery.conf
Now the recovery.conf file must be created. I am using Ubuntu 10.04 with a postgres 9.1
custom install and I changed the data_directory. According to this site http://dba.stackexchange.com/questions/3387/postgresql-on-ubuntu-using-conf-files-for-recovery
[9] the recovery.conf file
needs to be in the data_directory, as defined in postgresql.conf. In my case that is “/database/postgresqldata/”
So I copied the /usr/local/pgsql/data/recovery.conf.sample
file to /database/postgresqldata/”recovery.conf and open it for editing.
> cd /usr/share/postgresql/9.1/
> cp recovery.conf.sample /database/postgresqldata/recovery.conf
> sudo vi /database/postgresqldata/recovery.conf
|
The following needs to be edited (the @line is to show where
the typical line number is for this setting)
@line 108
standby_mode = on
@line 110
primary_conninfo
='host=192.168.0.240 port=5432 user=postgres'
@line 124
trigger_file = '/home/postgres/failover'
|
Here is an explanation of each of these
standby_mode = on
Just sets the standby mode to on
primary_conninfo
All the information for the replication(slave) server to
connect to the primary (master) server.
trigger_file
if this file exists the server will stop being a replication
server and start being a primary server.
It checks periodically for this file.
Fix ownership of files.
> sudo
chown -R postgres:postgres /database/postgresqldata
> sudo chmod 700 /database/postgresqldata/
|
Now start the postgres database on the replication (slave) server
> sudo /etc/init.d/postgresql start
|
Now log into the database
> sudo su postgres
> psql -d postgres –U postgres
|
Then from postgres I ran the following commands
> \c testdb
> select count(*) from data;
> select pg_last_xlog_receive_location();
|
And I can see that it is getting live data! So that is it you now have a replication
server.
Further Testing
Quering recovery process
If you have a live database that processes a lot of data it
may take some time for the replication server to “catch up” There are a few ways to see where it is in
the process.
From the slave database server log in as root and run the
following command
> ps -AF | grep post
|
Here is a partial screen shot of what it returns on my box
Here you can see the wal file it is currently
recovering. You can compare that to the
master database’s pg_xlog folder where the WAl files are located to see how far
the replication server is behind. For
some odd reason my particular live database is staying 35ish behind. This is not typical and I am trying to find
out why. I can stop my replication
server and fall 200 or more behind and when I turn it back on it catches up to
35 behind?? But that is my problem not
yours J
Here are two more ways to query the database to see what WAL
file it is working on .
From the Master
database you can run the following command
> psql -c "SELECT
pg_current_xlog_location()"
|
Which returns something like this
So how do you read this?
If you look in the pg_xlog folder on the main server you
would see files named like this
0000000100000E8C00000095
0000000100000E8C00000096
0000000100000E8C00000097
0000000100000E8C00000098
0000000100000E8C00000099
0000000100000E8C0000009A
Now lets deconstruct the E8C/987975C0
E8C/987975C0
0000000100000E8C00000098
That is how the numbers line up and the rest is a HEX offset
within the file (Where within the file it is reading)
Now from the Slave
database you can run the following command. Special thanks to Ivan
Stoykov on LinkedIn for clearing this one up for me.
> psql -c "SELECT pg_last_xlog_replay_location()"
|
Which returns something like this
Primary stop/start
As some further testing I stopped the primary database and
started it back up after a few minutes. (I also stopped and restarted my python
program)
> sudo /etc/init.d/postgresql stop
> sleep 120
> sudo /etc/init.d/postgresql start
> ./insertDB.py
|
The replication database had no problem with this
Replication stop/start
From the replication server I stopped and restarted the
postgres server.
> sudo /etc/init.d/postgresql stop
> sleep 240
> sudo /etc/init.d/postgresql start
|
No problem with this either
Test Delete from Replication
The Replication server only has read access, let’s try and
delete.
From the database on the replication server try and run this
> \c testdb
> delete from data;
|
And that did not work, which is how it should be.
Change the Replication server to a primary server
Warning do not do
this in a live system!!! I just did this to make sure it works…
The recover.conf file defined /home/postgres/failover as the trigger_file. Create a file here and see if it stops being
a replication server.
Run the following
commands
> sudo mkdir -p /home/postgres
> touch /home/postgres/failover
|
Now when I log into the database from the replication server
and run the following commands.
> \c testdb
> select count(*) from data;
|
I see that it is no longer getting fed from the primary
database server.
And I can run the following command
> delete from data;
|
And it can now write to the database.
Also when this occurs recovery.conf was renamed to recovery.done.
Now that the Replication server has become a Primary server
you can’t easily switch it back to a Replication server. Which is as it should be!
References
[1] CREATE DATABASE
(postgres 9.1 manual)
Visited 4/2012
[2] PostgreSQL add or create a
user account and grant permission for database
Visited 4/2012
[3] GET POSTGRES WORKING ON
UBUNTU OR LINUX MINT
Visited 4/2012
[4] PostgreSQL Python
tutorial
Visited 4/2012
[5] Write-Ahead
Logging (WAL) postgres 9.1 manual
Visited 4/2012
[6] Streaming
replication in postgresql 9.1
Visited 4/2012
[7] Postgresql
replication
Visited 4/2012
[8] wal_level
Visited 4/2012
[9] PostgreSQL on
Ubuntu
Visited 4/2012
[10] Hot standby
server does not start
Visited 4/2012
[11] Streaming
Replication woes
Visited 4/2012
a very long post.. I need to learn a lot.. :)
ReplyDeleteHello,
ReplyDeleteCan we do the same thing in Windows Server and If yes how.
Just a pointer to the solution will be greatly appreciated.
Thanks!
Excellent post. I liked your leaving in your mistakes and dead ends so we can learn from them. Thanks loads! You have a new fan.
ReplyDeleteThanks, On short to the point posts I usually remove my mistakes, but something complex like this where you can go down odd paths and get lost I like to leave them in, mostly because we search on those error codes and I find it nice when someone posts the errors so I can find their page when searching.
DeleteAt any rate glad you liked the post