Poor Man's Cisco SLB RADIUS
Cisco IOS supports software-based server load balancing (SLB), provided you have forked out for the licence. It gives you a single virtual IP for various services, and lets your core boxes (which usually sit idling away doing nothing) add some resilience and load balancing to the mix without needing extra hardware.
The limitation, of the IOS-based system at least, is that the available probes are rather limited: really just ICMP echo requests, HTTP GETs and DNS lookups. If you want to load balance your RADIUS servers you have to rely on pings, which tell you the host is up but nothing about whether RADIUS is actually answering.
Whilst idly tidying up the configuration on our core boxes I stumbled on the 'custom udp' probe type and realised that, by using the new Status-Server RADIUS draft, I could cobble something rather nice together. The instructions below explain how to do RADIUS-aware load balancing with Cisco's IOS SLB functionality.
N.B. Although the focus here is on Debian Linux running FreeRADIUS at the real server end, the following method will work as long as your RADIUS server supports 'Status-Server'. Do not worry if it does not: the method can be adapted to send a full regular authentication request, with the downside that all the polling traffic gets logged, which is what Status-Server is meant to avoid.
How To Pull It Off
Well...you will need:
- a copy of tcpdump to capture the RADIUS probe traffic
- a copy of wireshark to analyse it
- a RADIUS server that supports Status-Server, FreeRADIUS will be the focus of the example here (versions 1.1.7 and 2.x have been tested)
- a RADIUS client that can send Status-Server requests, such as 'radclient' that comes with FreeRADIUS
Configure FreeRADIUS to process 'Status-Server' requests (typically all you need to do is set 'status_server = yes' in your main radiusd.conf file). Then prime the clients.conf file with an entry for the 'priming' host (I used one of the RADIUS servers themselves), and duplicate the same secret in the entries for the querying Cisco boxes.
Run 'tcpdump' on your priming box with something like:
# tcpdump -i any -n -p -s 0 -w /tmp/dump host 1.2.3.10 and port 1812
Whilst that is running, execute the following command on the same box:
$ echo "Message-Authenticator = 0" | radclient -x 1.2.3.10 status <SECRET>
Sending Status-Server of id 28 to 1.2.3.10 port 1812
        Message-Authenticator = 0x00000000000000000000000000000000
rad_recv: Access-Accept packet from host 1.2.3.10 port 1812, id=28, length=20
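For reference, the packet radclient puts on the wire here is small enough to build by hand. The following is a minimal Python sketch, assuming the layout from the Status-Server specification (RFC 5997): code 12, an ID byte, a 16-byte random authenticator and a single Message-Authenticator attribute holding an HMAC-MD5 over the whole packet. The function name and the 'testing123' secret are made up for illustration and are not part of FreeRADIUS.

```python
import hashlib
import hmac
import os
import struct
from typing import Optional

STATUS_SERVER = 12          # RADIUS code for Status-Server
MESSAGE_AUTHENTICATOR = 80  # RADIUS attribute type


def build_status_server(secret: bytes, packet_id: int,
                        authenticator: Optional[bytes] = None) -> bytes:
    """Return the raw UDP payload of a Status-Server request (hypothetical helper)."""
    if authenticator is None:
        authenticator = os.urandom(16)
    if len(authenticator) != 16:
        raise ValueError("authenticator must be 16 bytes")
    length = 20 + 18  # 20-byte header plus one 18-byte Message-Authenticator
    header = struct.pack("!BBH", STATUS_SERVER, packet_id, length) + authenticator
    attr_head = struct.pack("!BB", MESSAGE_AUTHENTICATOR, 18)
    # The HMAC-MD5 is computed over the packet with the attribute value zeroed.
    digest = hmac.new(secret, header + attr_head + bytes(16), hashlib.md5).digest()
    return header + attr_head + digest


if __name__ == "__main__":
    # 'testing123' is a placeholder secret, id 28 matches the example above.
    print(build_status_server(b"testing123", packet_id=28).hex())
```

The hex it prints should have the same shape as what you will pull out of the capture: it starts with 0c, followed by the ID byte, and is 38 bytes long.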
Once executed (you should get an 'Access-Accept' back as shown above) kill the packet capture and copy the capture to your local workstation for analysis.
Open the 'dump' file from the 'priming' box in wireshark. Hopefully your capture will contain only two packets; you might need to filter things out if you are doing this on a live RADIUS server.
Highlight the 'Radius Protocol' line in the packet breakdown window. This should highlight only the RADIUS payload of the UDP packet in the hex display at the bottom of your wireshark window (it should begin with 0x0c, the code for Status-Server). Right-click on what you have highlighted and, in the menu that opens, go to 'Copy' and select 'Bytes (Hex Stream)'. What is now in your clipboard is your raw hex UDP payload.
If you do the same with the response packet you should see that it begins with 0x02, which indicates 'Access-Accept'. The byte after it is the RADIUS message ID; that second byte is the only other part of the response we care about, so jot it down on a bit of paper.
In the example given below, what is in your clipboard goes in the RADIUS probe's 'request data 0 0C .. .. ...' line, whilst the response ID you jotted down goes in the 'response 0 data 0 02 XX' line (replacing the 'XX'). We only match the first two bytes of the response as your RADIUS server might add various attributes, which could cause confusion and make the Message-Authenticator vary. The second byte is the RADIUS message ID that you sent out in the initial request (you should notice that it matches the second byte of the request).
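Splitting the copied hex stream into bytes by hand is tedious, so here is a small Python sketch that does it. It assumes the probe wants space-separated, two-digit hex bytes as in the '0C .. .. ...' example; both function names are made up, and the sample response uses an all-zero authenticator as a placeholder rather than real capture data.

```python
def to_probe_bytes(hex_stream: str) -> str:
    """Turn a wireshark 'Bytes (Hex Stream)' copy into 'AA BB CC' form."""
    raw = bytes.fromhex(hex_stream.strip())
    return " ".join(f"{b:02X}" for b in raw)


def response_match(hex_stream: str) -> str:
    """Return the first two bytes (code and ID) of a captured response."""
    raw = bytes.fromhex(hex_stream.strip())
    if len(raw) < 2:
        raise ValueError("too short to be a RADIUS packet")
    return f"{raw[0]:02X} {raw[1]:02X}"


if __name__ == "__main__":
    # Placeholder Access-Accept with id 28 and a zeroed authenticator.
    sample_response = "021c0014" + "00" * 16
    print(response_match(sample_response))  # prints "02 1C"
```

Feed the request hex stream to to_probe_bytes() for the 'request data' line and the response hex stream to response_match() for the 'response' line.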
That's it; you should now have a working Status-Server RADIUS probe. Once it is all working, remember to remove the 'priming' RADIUS client from your clients.conf file as it is no longer needed.
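If you want to check the captured payload without involving the routers, the probe's behaviour can be approximated from any host listed in clients.conf. This is only a sketch of what I assume the probe does (send the fixed payload, then compare the first two bytes of the reply); the function names are made up, and the 5 second timeout simply mirrors the example configuration.

```python
import socket
import sys


def reply_matches(request: bytes, reply: bytes) -> bool:
    """True if the reply is an Access-Accept (0x02) carrying the request's ID."""
    return len(reply) >= 2 and reply[0] == 0x02 and reply[1] == request[1]


def probe(host: str, payload: bytes, port: int = 1812, timeout: float = 5.0) -> bool:
    """Send the captured payload once and apply the two-byte match."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, (host, port))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return False
    return reply_matches(payload, reply)


if __name__ == "__main__" and len(sys.argv) == 3:
    # usage: script.py <server-ip> <hex stream copied from wireshark>
    up = probe(sys.argv[1], bytes.fromhex(sys.argv[2]))
    print("up" if up else "down")
```

Stopping FreeRADIUS on the node should flip the result to 'down' after the timeout, which is the same signal the SLB uses to pull the real out of service.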
Example
Debian Linux End
In your '/etc/network/interfaces' file you should have something like the following on each of your FreeRADIUS nodes:
auto lo
iface lo inet loopback
    up ip addr add 123.123.123.123/32 dev lo
    down ip addr delete 123.123.123.123/32 dev lo
    # fixes the ARP announcements correctly
    up sysctl -q net.ipv4.conf.default.arp_ignore=1
    up sysctl -q net.ipv4.conf.all.arp_ignore=1
    up sysctl -q net.ipv4.conf.default.arp_announce=1
    up sysctl -q net.ipv4.conf.all.arp_announce=1

auto bond0
iface bond0 inet static
    slaves eth0 eth1
    # your second node would have the IP 1.2.3.11
    address 1.2.3.10
    netmask 255.255.255.0
    gateway 1.2.3.1
Cisco 6509's End
We use both the ping and RADIUS probes as they apply in an 'AND' combination, so if either probe fails the server node is taken out of service. The two SLB routers (123.123.123.254 and 123.123.123.253) act as anycast boxes provisioning 123.123.123.123 as your SLB RADIUS server; this is nice as it's an active-active setup too.
ip slb probe PING ping
!
ip slb probe RADIUS custom udp
port 1812
request data 0 0C .. .. ...
response 0 data 0 02 XX
interval 5
timeout 5
!
ip slb serverfarm RADIUS
failaction radius reassign
probe PING
probe RADIUS
!
real 1.2.3.10
! slow AAA can need this sometimes, as we have real probing it is safe
! without it your reals might 'fail' regularly for no reason at all
no faildetect inband
inservice
!
real 1.2.3.11
no faildetect inband
inservice
!
ip slb vserver RADIUS
virtual 123.123.123.123 udp 1812 service radius
serverfarm RADIUS
sticky radius calling-station-id group 1
sticky radius framed-ip group 1
! 123.123.123.254 is your core's unique Loopback0 address, .253 is your other core box
! and so this would need reversing on the other SLB box
replicate casa 123.123.123.254 123.123.123.253 <port+0> password <ahem>
inservice
!
ip slb vserver RADIUS-ACCT
virtual 123.123.123.123 udp 1813 service radius
serverfarm RADIUS
sticky radius framed-ip group 2
replicate casa 123.123.123.254 123.123.123.253 <port+1> password <cough>
inservice
Annoyingly, we cannot use the following catch-all stub to handle RADIUS auth and acct in one go, as some clients will not pass a Calling-Station-Id in the accounting packet. When this occurs the SLB absorbs the packet and it never makes it to the real servers. So if you are trying to diagnose why some RADIUS packets are not making it to their destination on the other side of the SLB, check that auth packets contain the 'Calling-Station-Id' attribute and that accounting packets contain a 'Username' attribute.
ip slb vserver RADIUS
virtual 123.123.123.123 udp 0 service radius
serverfarm RADIUS
sticky radius calling-station-id group 1
sticky radius framed-ip group 1
replicate casa 123.123.123.254 123.123.123.253 <port> password <ahem>
inservice
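When chasing packets that the SLB has swallowed, it helps to see which attributes a captured packet actually carries. Below is a rough Python sketch that walks the attribute list of a raw RADIUS payload (for example one copied out of wireshark as a hex stream) and reports which of the attributes you ask for are missing. The type numbers are the standard ones (User-Name 1, Framed-IP-Address 8, Calling-Station-Id 31); the function names and the idea of a 'required' list are my own, not anything IOS exposes.

```python
import struct

ATTRIBUTE_NAMES = {1: "User-Name", 8: "Framed-IP-Address", 31: "Calling-Station-Id"}


def attribute_types(packet: bytes) -> set:
    """Return the set of attribute type numbers present in a RADIUS packet."""
    if len(packet) < 20:
        raise ValueError("shorter than a RADIUS header")
    (length,) = struct.unpack("!H", packet[2:4])
    end = min(length, len(packet))
    found = set()
    offset = 20  # attributes start after the 20-byte header
    while offset + 2 <= end:
        attr_type, attr_len = packet[offset], packet[offset + 1]
        if attr_len < 2:
            raise ValueError("malformed attribute length")
        found.add(attr_type)
        offset += attr_len
    return found


def missing_attributes(packet: bytes, required=(31,)) -> list:
    """Name the required attributes that the packet does not contain."""
    present = attribute_types(packet)
    return [ATTRIBUTE_NAMES.get(t, str(t)) for t in required if t not in present]
```

For an auth packet you might call missing_attributes(bytes.fromhex(stream), required=(31,)); an empty list means the attribute the sticky rule keys on is present.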