PostgreSQL High-Availability using Debian
N.B. This is a Works In Progress (WIP) and the instructions are not complete! Bear in mind, I am running by the seat of my pants on this one and it's only something we have recently begun to use. I more than likely have overlooked something and will update this page as and when I am bitten by my own stupidity.
At my workplace I needed to configure a warm-standby for an SQL backend to our FreeRADIUS installation. Digging around online I could not find any definitive sets of articles on how to do this sort of thing so I decided to document up our deployment based on Debian 'lenny'.
At first I wanted to try and get Warm Standby (Point-In-Time-Recovery) working but this proved to be rather tricky so I went for the much simpler drbd (aka RAID1-over-TCP)
This document will cover how to deal with priming:
Important Notes
In an ideal world you would have two seperate VLAN's for each of your PostgreSQL boxes to sit in and a serial cable between the two to act as an out-of-bound (not over the network) heartbeat. The PostgreSQL servers would then provide an active/standby service on a unique IP address.
There are two problems with this:
- for a serial cable to be used the boxes must physically be near to one another
- configuring routing in your network so that the same unique IP is available in two VLAN's and 'heartbeat' interacts with your routing to provision this is non-trivial
For our needs, the boxes could not be in the same location 1 but fortunately we have a rather overly resilent network that can give high-availability for our VLAN's so we have compromised.
Our setup is thus a single VLAN with the network routing statically configured to route the PostgreSQL service '/32' IP into that VLAN. We use HSRP for a HA Layer-3 default gateway for the VLAN, you might prefer something else. The example I describe below makes this unique IP lurk in the same VLAN as the PostgreSQL boxes lurk in, for us that is not the case but it does not really matter.
Naming and Addressing
Throughout this document I will be using the following conventions.
The network we will be working with is '10.0.0.0/24' with the default gateway lurking at '10.0.0.1'. The HA PostgreSQL service address is '10.0.0.4' ('sql.example.com') and it is provisioned by the two real PostgreSQL servers:
10.0.0.2 - 'sql1.example.com'
10.0.0.3 - 'sql2.example.com'
Both boxes use link aggregation to provide a resilient uplink to the switch they are connected to, this is why throughout the document you will see 'bond0' referred to where you would normally see 'eth0'.
Instructions
Prerequisites
So to start off by installing the packages we need:
# aptitude install postgresql libpam_ldap heartbeat drbd8-utils drbd8-modules
Configuring LDAP Authentication
Make sure you can use your system wide LDAP library ('ldapsearch' for example) to make TLS/SSL requests with certificate verification working. If it does not, you will see when running tcpdump/tshark SSL "Encrypted Alert"'s terminating the LDAP communications appearing in your packet captures which means you have not properly set this up. Also if you do not do this then you will find your passwords shooting across around your network in plaintext.
# grep -v "^#" /etc/pam_ldap.conf host ldap1.example.com ldap2.example.com base dc=example,dc=com ldap_version 3 scope sub bind_policy soft pam_filter objectClass=inetOrgPerson pam_login_attribute cn ssl start_tls
# cat /etc/pam.d/postgresql auth required pam_ldap.so account required pam_ldap.so
# cat /etc/postgresql/8.3/main/pg_hba.conf [snipped] host all all 127.0.0.1/32 pam postgresql host all all ::1/128 pam postgresql host all all <your-cidr> pam postgresql
# su - postgres $ createuser <your-username> Shall the new role be a superuser? (y/n) y
You should now be able to login using your usual LDAP backend'ed credentials.
Disable Auto Starting of PGSQL
Stop and configure PostgreSQL to only run with 'manual' intervention (ie. not via the 'init.d' scripts):
# /etc/init.d/postgresql-8.3 stop # grep -v "^#" /etc/postgresql/8.3/main/start.conf manual
Configure Heartbeat
On both boxes identically configure the following files as shown:
# grep -v "^#" /etc/heartbeat/ha.cf debugfile /var/log/ha-debug logfile /var/log/ha-log logfacility local0 deadtime 45 mcast bond0 225.0.0.1 694 1 0 auto_failback off node sql1 node sql2 ping 10.0.0.1
N.B. we use a 'deadtime' of 45 seconds as the settle down time of spanning-tree is about 30 seconds so we want to avoid the nasty case of where there is a spanning-tree juggle at which point the failover gets kicked about.
N.B. here I suggest 'auto_failback' is turned off so to reduce the downtime of your PostgreSQL instance as it moves back to the 'primary'. If you have more than one HA service you want to split between the two boxes (we tether syslog-ng to the other box and have the two HA services running over the same heartbeat) then you might want to turn this on to act as a very simple load-balancer. This means the load produced by postgresql will not affect any service tethered to the other box.
You now need to download my OCF script PgSQL and place it in '/etc/heartbeat/resource.d/', remembering to make it executable.
# grep -v "^#" /etc/heartbeat/haresources sql1 drbddisk::r0 Filesystem::/dev/drbd0::/var/lib/postgresql::ext3::relatime,nodev,nosuid,noexec PgSQL::5432 IPaddr::10.0.0.4/24/bond0
N.B. you might be curious why only 'sql1' is referred to in 'haresources'. This is because 'sql1' is the typical primary and so if the heartbeat daemons are running on both 'sql1' and 'sql2' then 'sql1' is the box that actually has the IP '10.0.0.4' and PostgreSQL running. If both are referred to, then bad things happen!
# grep -v "^#" /etc/heartbeat/authkeys auth 1 1 sha1 <big-old-hunky-random-secret>
DRBD
Before I go on to explain how to set this up, you need to be aware that the 'postgres' UID and GID must be identical on both your SQL servers. If it is not then PostgreSQL on the secondary/passive box will not startup as it will not have permission to write to the filesystem or read it's database files. If they currently mis-match I recommend you change both to new numbers by editing your '/etc/passwd' and '/etc/group' files directly and then run:
# find / -uid <old-uid-number> -exec chown <new-uid-number> '{}' \;
# find / -gid <old-gid-number> -exec chgrp <new-gid-number> '{}' \;You need to tweak your '/etc/drbd.conf' file appropriately (to be identical on both boxes):
# grep -v "^#" /etc/drbd.conf
global {
# out SQL servers do not have direct Internet access
usage-count no;
}
common {
# disks cap out at 50ish for us
syncer { rate 40M; }
}
resource r0 {
protocol C;
handlers {
pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
}
startup {
degr-wfc-timeout 120;
}
disk {
on-io-error detach;
fencing resource-only;
}
net {
cram-hmac-alg "sha1";
shared-secret "<big honking secret here>";
# these are worth changing, I have not decided what to though
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
}
on sql1 {
device /dev/drbd0;
disk /dev/lvm-sql1/drbd-pgsql;
address 10.0.0.2:7788;
flexible-meta-disk internal;
}
on sql2 {
device /dev/drbd0;
disk /dev/lvm-sql2/drbd-pgsql;
address 10.0.0.3:7788;
flexible-meta-disk internal;
}
} So DRDB can use the existing heartbeat infrastructure between your two SQL servers. Type on both:
# chgrp haclient /sbin/drbdsetup # chmod o-x /sbin/drbdsetup # chmod u+s /sbin/drbdsetup # chgrp haclient /sbin/drbdmeta # chmod o-x /sbin/drbdmeta # chmod u+s /sbin/drbdmeta # chgrp haclient /etc/drbd.conf # chmod g+r /etc/drbd.conf
Now we need to prime the DRDB bits and move the postgresql data spool over, on both type:
# /etc/init.d/heartbeat stop # /etc/init.d/postgresql-8.3 stop # drbdadm create-md r0 # /etc/init.d/drbd start # pg_dropcluster 8.3 main # echo " > # drbd so is *not* automounted > /dev/drbd0 /var/lib/postgresql ext3 noauto,relatime,nodev,nosuid,noexec 0 0" >> /etc/fstab
You need to tweak postgresql to listen on all interfaces on your boxes as the IP address is added to the system after postgresql is started, and more importantly removed before it's shutdown. So on both systems:
# grep listen_address /etc/postgresql/8.3/main/postgresql.conf listen_addresses = '10.0.0.4'
You should now have a '/dev/drbd0' device that you can treat like a regular block device. So, create a filesystem on it by typing on 'sql1':
# drbdadm -- --overwrite-data-of-peer primary r0 # mkfs.ext3 -L pgsql /dev/drbd0 # mount /var/lib/postgresql # pg_createcluster 8.3 main # ln -s /etc/postgresql/8.3/main/postgresql.conf /var/lib/postgresql/8.3/main/postgresql.conf # umount /var/lib/postgresql
Now hopefully, if everything is correctly set up (and I have not forgotten to document anything) you should be able to fire up the heartbeat daemon on both 'sql1' and 'sql2' then after about a minute you should have a pgsql instance running on '10.0.0.4'.
If things do not work, the only difference between the two boxes really should be just in the '/etc/heartbeat/haresources' file. You might want to check all configuration files are identical on both boxes before grumbling at me.
Links
although in central London it seems my organisation is unable to provision a reliable 220V feed :-/ (1)