Wednesday, December 21, 2011

Postgresql - streaming replication, trigger file vs. repmgr 'promote' command

The trigger file is a file  that get's created ( it's up to us when) when you want a failover to happen (i.e. to make the streaming replication stopped), hence, the slave that 'discovers' the trigger file becomes a master or independent node.

Such file location is specified in the recovery.conf configuration file.

# Specifies a trigger file whose presence should cause streaming replication to
# end (i.e., failover).
trigger_file = '/path_to/trigger'

I suppose the server continuously monitors for the presence of such file.

In the example I'm following both master and slave run on the same machine on different ports, so

the failover script creates the trigger file locally, so as I'm running the master on one machine and the slave on another, I'll need to create the trigger file remotely to one of the slaves.

One question remains: ¿how to tell repmgr where to look for the trigger file?

Well, it turns out to not being necessary as a "promote" command is available in repmr."Such command is designed to be more efficient"

    On Fri, Aug 12, 2011 at 11:35 PM, Joshua Sierles> wrote: > Is there an advantage to using repmgr's 'promote' command on the slave since it requires a restart? What's the difference from using the 'trigger file' method?
    repmgr follow is designed to be fast. That requires you to use repmgr
     promote to make that work correctly.
     The trigger file works, but will cause problems when running multiple standbys.

     Simon Riggs         
     PostgreSQL Development, 24x7 Support, Training & Services

Other references:

The gotcha in my implementation is that I'm using repmgr to abstract the streaming replication  configuration , in my opinion this reduces administration complexity but the tradeoff is a more elaborated initial deployment.

So, the failover script reduces itself to:

# Execute command by failover.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %H = new master node host name
#                  %P = old primary node id
#                  %% = '%' character


cmd=`ssh $standby_node "/var/lib/postgresql/repmgr -f /var/lib/postgresql/repmgr/repmgr.conf standby promote"`
if [ $failed_node_id = $old_primary_node_id ];then      # master failed
# the standby server is promoted
echo $cmd

Now it's time to kill the master.

No comments:

Post a Comment