Tuesday, December 27, 2011

Postgresql - pg_basebackup - forget about rsync ?

What's new in PostgreSQL 9.1
http://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.1

pg_basebackup
http://www.postgresql.org/docs/9.1/static/app-pgbasebackup.html

this is amazing stuff!

pg_basebackup is used to take base backups of a running PostgreSQL database cluster. These are taken without affecting other clients to the database, and can be used both for point-in-time recovery (see Section 24.3) and as the starting point for a log shipping or streaming replication standby servers (see Section 25.2).
pg_basebackup makes a binary copy of the database cluster files, while making sure the system is automatically put in and out of backup mode automatically.
The backup is made over a regular PostgreSQL connection, and uses the replication protocol. The connection must be made with a user having REPLICATION permissions (see Section 20.2), and the user must be granted explicit permissions in pg_hba.conf. The server must also be configured with max_wal_senders set high enough to leave at least one session available for the backup.

Then let's make some dummy example:
+Let's have a server configured with the following options:


listen_addresses='*'
wal_level = 'hot_standby'
archive_mode = on
archive_command = 'cd .' # we can also use exit 0, anything that
# just does nothing
max_wal_senders = 10
wal_keep_segments = 5000 # 80 GB required on pg_xlog
hot_standby = on

Basic replication options taken from
https://github.com/greg2ndQuadrant/repmgr

1. Creating a database with data:

postgres@linux-2561:~> createdb pgbenchdb

postgres@linux-2561:~> pgbench -i -F 10 -s 10 -h 10.0.0.113 -p 5432 pgbenchdb
pgbenchdb=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+-------+----------+---------+-------------
public | pgbench_accounts | table | postgres | 1302 MB |
public | pgbench_branches | table | postgres | 40 kB |
public | pgbench_history | table | postgres | 0 bytes |
public | pgbench_tellers | table | postgres | 80 kB |
(4 rows)


pgbenchdb=# SELECT pg_size_pretty(pg_database_size('pgbenchdb'));
pg_size_pretty
----------------
1326 MB
(1 row)


2. Make a base backup on the standby node:

postgres@debian1:~$ time pg_basebackup --pgdata=/var/lib/postgresql/data_sr --xlog --progress --verbose --host=10.0.113 --port=5432 
xlog start point: 0/1B000020
1392758/1392758 kB (100%), 1/1 tablespace
xlog end point: 0/1B0000C4
pg_basebackup: base backup completed

real 0m36.909s
user 0m0.740s
sys 0m11.565s


Not bad for a 1.3GB database !

3. Set the backup server to behave as a hot standby streaming replication server, by creating a recovery.conf file in $PGDATA.

standby_mode = 'on'
primary_conninfo = 'host=10.0.0.113 port=5432 user=postgres password=pgsql'


4. Startup the standby server

pg_ctl start

5. Query on the master the replication status:
pgbenchdb=# \x 
Expanded display is on.
pgbenchdb=# select * from pg_stat_replication;
-[ RECORD 1 ]----+-----------------------------
procpid | 7776
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 10.0.0.201
client_hostname |
client_port | 56744
backend_start | 2011-12-27 17:23:57.15636-06
state | streaming
sent_location | 0/1E000120
write_location | 0/1E000120
flush_location | 0/1E000120
replay_location | 0/1E000120
sync_priority | 0
sync_state | async


6. Make some changes on the master
postgres@linux-2561:~> pgbench -c 50 -d -j 2 -t 100 -h 10.0.0.113 pgbenchdb
transaction type: TPC-B (sort of)
scaling factor: 10
query mode: simple
number of clients: 50
number of threads: 2
number of transactions per client: 100
number of transactions actually processed: 5000/5000
tps = 89.336188 (including connections establishing)
tps = 89.461080 (excluding connections establishing)

pgbenchdb=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+-------+----------+---------+-------------
public | pgbench_accounts | table | postgres | 1302 MB |
public | pgbench_branches | table | postgres | 168 kB |
public | pgbench_history | table | postgres | 792 kB |
public | pgbench_tellers | table | postgres | 248 kB |
(4 rows)


6. Log in to the standby and see if changes are there (in a very lazy fashion).
postgres=# \c pgbenchdb
You are now connected to database "pgbenchdb" as user "postgres".
pgbenchdb=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+-------+----------+---------+-------------
public | pgbench_accounts | table | postgres | 1302 MB |
public | pgbench_branches | table | postgres | 168 kB |
public | pgbench_history | table | postgres | 792 kB |
public | pgbench_tellers | table | postgres | 248 kB |
(4 rows)

pgbenchdb=# insert into pgbench_history (tid) values (generate_series(1,1000));
ERROR: cannot execute INSERT in a read-only transaction

Other references:

http://postgresql.ru.net/manual/pgbench.html

http://www.postgresql.org/docs/9.1/static/warm-standby.html
http://www.postgresql.org/docs/8.1/interactive/backup-online.html
http://wiki.postgresql.org/wiki/Streaming_Replication

Then, what's rsync? =-)

TODO:
What the hell are these errors about?

postgres@debian1:~/data_sr/pg_log$ tail -f postgresql-2011-12-27_172357.log 
2011-12-27 17:54:45 CST postgres postgres STATEMENT: SELECT pg_current_xlog_location()
2011-12-27 17:54:55 CST postgres postgres ERROR: recovery is in progress
2011-12-27 17:54:55 CST postgres postgres HINT: WAL control functions cannot be executed during recovery.
2011-12-27 17:54:55 CST postgres postgres STATEMENT: SELECT pg_current_xlog_location()
2011-12-27 17:55:05 CST postgres postgres ERROR: recovery is in progress
2011-12-27 17:55:05 CST postgres postgres HINT: WAL control functions cannot be executed during recovery.
2011-12-27 17:55:05 CST postgres postgres STATEMENT: SELECT pg_current_xlog_location()
2011-12-27 17:55:15 CST postgres postgres ERROR: recovery is in progress
2011-12-27 17:55:15 CST postgres postgres HINT: WAL control functions cannot be executed during recovery.
2011-12-27 17:55:15 CST postgres postgres STATEMENT: SELECT pg_current_xlog_location()

1 comment:

  1. Hi,
    I am trying to create a HA Pgsql server. Is there a way to automate the running of pg_basebackup command on the recovered server to copy latest data from running server.
    Regards

    ReplyDelete