This week a hard disk in one of my servers failed. It was replaced with just 3 minutes of downtime (kudos to OVH), and right now the RAID array is being rebuilt.
Since this is the first time this has happened to me, and the server runs a production Mailman service, I decided to take my time and set up a backup MX server. This is how I did it.
Keep in mind that I am using Gentoo; things may be slightly different on other distros.
rsync
The main problem when setting up a backup MX is dealing with spam. Suppose the backup server accepts a spam message while the main server is down and forwards it once the main server is up again, but the main server rejects it because the recipient address does not exist. What then? If the backup server bounces the message to the sender address, and that address was forged, then we are producing backscatter (sending bounce messages to innocent third parties), and our server will soon be blacklisted.
The point is that the backup server should refuse incoming email whenever the main server would. This means checking the recipient address. What I did was set up a cron script that rsyncs the lists from the main server to the backup server every n minutes.
On the main server I open rsync access for the backup server:
/etc/rsyncd.conf
[mailman]
comment = Mailman sync
path = /var/lib/mailman
list = no
uid = mailman
gid = mailman
hosts allow = IP-ADDRESS-OF-THE-BACKUP-SERVER
hosts deny = *
This is the script on the backup server:
/etc/bin/mm-sync.sh
#!/bin/sh
# Sync the Mailman folders from the main server and mail the log to root.
LOG=/tmp/mm-sync.log
START=$(date)
# data
rsync -avz --delete MAIN-SERVER::mailman/data /var/lib/mailman/ > "$LOG"
echo >> "$LOG"
# lists
rsync -avz --delete MAIN-SERVER::mailman/lists /var/lib/mailman/ >> "$LOG"
echo >> "$LOG"
# archives
rsync -avz --delete MAIN-SERVER::mailman/archives /var/lib/mailman/ >> "$LOG"
echo >> "$LOG"
END=$(date)
# Append the start and end times so the duration of each run is visible
echo "$START" >> "$LOG"
echo "$END" >> "$LOG"
mail -s 'Mailman sync' root < "$LOG"
Note that I am syncing three folders: data, lists and archives. Strictly speaking, you only need to sync the lists folder to set up a backup MX. I sync everything so that the backup server can replace the main server completely if things go really bad.
And this is the cron file:
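The script above mails its log on every run, which can get noisy at ten-minute intervals. As a possible variation, here is a minimal sketch that only reports when one of the rsync calls fails. The `RSYNC`, `SRC` and `DEST` variables and the `sync_all` function are names made up for this illustration, and `MAIN-SERVER` is still a placeholder.

```shell
#!/bin/sh
# Sketch: sync the three Mailman folders and report only failures.
# RSYNC, SRC and DEST are illustrative variables, not part of the original script.
RSYNC="${RSYNC:-rsync}"
SRC="${SRC:-MAIN-SERVER::mailman}"   # placeholder, as in the script above
DEST="${DEST:-/var/lib/mailman/}"

# sync_all: run rsync for each folder; print the names of the folders that
# failed and return non-zero if there were any.
sync_all() {
    failed=""
    for dir in data lists archives; do
        if ! $RSYNC -az --delete "$SRC/$dir" "$DEST" >/dev/null 2>&1; then
            failed="$failed $dir"
        fi
    done
    if [ -n "$failed" ]; then
        echo "rsync failed for:$failed"
        return 1
    fi
    return 0
}

# Possible use from the cron script:
#   out=$(sync_all) || echo "$out" | mail -s 'Mailman sync failed' root
```

The trade-off is that you lose the per-run transfer log, so this probably makes more sense once the sync has been running reliably for a while.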
/etc/cron.d/mm-sync
*/10 * * * * mailman /etc/bin/mm-sync.sh
Exim
The Exim configuration looks a lot like the configuration of the main server (for instance, I set up the same blacklist checking rules). See my original post on setting up a Mailman service with Exim.
First I define the same variables and options as on the main server. Not all of them are really needed, but I prefer to keep the configuration of both servers as close as possible.
/etc/exim/exim.conf
# Mailman
MM_HOME=/usr/lib/mailman
MM_DATA=/var/lib/mailman
MM_UID=mailman
MM_GID=mailman
domainlist mm_domains=example.com : example2.com
MM_WRAP=MM_HOME/mail/mailman
MM_LISTCHK=MM_DATA/lists/${lc::$local_part}/config.pck
smtp_accept_queue_per_connection = 30
Unlike on the main server, these Mailman domains (mm_domains) are not defined as local domains but as domains we relay to:
domainlist local_domains = @
domainlist relay_to_domains = +mm_domains
hostlist relay_from_hosts = 127.0.0.1 : ::::1
Now comes the router:
# Mailman
mailman_router:
  driver = dnslookup
  domains = +mm_domains
  require_files = MM_LISTCHK
  local_part_suffix_optional
  local_part_suffix = -admin : \
      -bounces : -bounces+* : \
      -confirm : -confirm+* : \
      -join : -leave : \
      -owner : -request : \
      -subscribe : -unsubscribe
  transport = remote_smtp
  no_more
This is very much like the router used on the main server. The differences are: it uses the dnslookup driver instead of accept; it uses the remote_smtp transport; and it adds the no_more option. In other words, the router is configured for remote delivery instead of local delivery.
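To make the require_files check concrete: the router only accepts a recipient if a config.pck file exists for the list named by the local part, once the suffix has been stripped and the name lowercased. A rough shell equivalent, handy for checking what the backup server would accept after a sync, could be sketched as follows. The function names `strip_suffix` and `list_exists` are hypothetical, and the suffix handling is a simplification of what Exim really does.

```shell
#!/bin/sh
# Sketch of the check mailman_router performs through require_files.
# MM_DATA defaults to the path used in this post; override it for testing.
MM_DATA="${MM_DATA:-/var/lib/mailman}"

# strip_suffix LOCAL_PART: remove a Mailman suffix such as -bounces or
# -confirm+token, mirroring the local_part_suffix list of the router.
strip_suffix() {
    lp="$1"
    case "$lp" in
        *-bounces+*) lp="${lp%-bounces+*}" ;;
        *-confirm+*) lp="${lp%-confirm+*}" ;;
        *-admin|*-bounces|*-confirm|*-join|*-leave|*-owner|*-request|*-subscribe|*-unsubscribe)
            lp="${lp%-*}" ;;
    esac
    printf '%s\n' "$lp"
}

# list_exists LOCAL_PART: succeed if the synced lists folder has a
# config.pck for the (lowercased) list name.
list_exists() {
    name=$(strip_suffix "$1" | tr '[:upper:]' '[:lower:]')
    [ -f "$MM_DATA/lists/$name/config.pck" ]
}

# Usage: list_exists foobar-owner && echo "would be accepted"
```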
Finally, we need to modify the dnslookup router so that it skips the Mailman domains, since the mailman router has already handled them:
dnslookup:
  driver = dnslookup
  domains = ! +local_domains : ! +mm_domains
  transport = remote_smtp
  ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8
  no_more
You should test the router for both valid and invalid recipient addresses:
# exim -bt foobar@example.com
  router = mailman_router, transport = remote_smtp
  host example.com [AAA.BBB.CCC.DDD] MX=10
# exim -bt foobarrr@example.com
foobarrr@example.com is undeliverable: Unknown user
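If you want to repeat the two exim -bt checks above after every configuration change, they could be wrapped in a small script. This is only a sketch: it relies on exim -bt returning a zero exit code only when the address can be routed, and `EXIM`, `is_routable` and `check_routing` are names invented for the example.

```shell
#!/bin/sh
# Sketch: check that Exim routes a valid list address and rejects an unknown one.
# EXIM is an illustrative variable so the binary can be swapped out.
EXIM="${EXIM:-exim}"

# is_routable ADDRESS: exim -bt exits with 0 only when the address can be routed.
is_routable() {
    $EXIM -bt "$1" >/dev/null 2>&1
}

# check_routing VALID INVALID: return 0 only if VALID is routable and INVALID is not.
check_routing() {
    if ! is_routable "$1"; then
        echo "FAIL: $1 should be routable"
        return 1
    fi
    if is_routable "$2"; then
        echo "FAIL: $2 should be rejected"
        return 1
    fi
    echo "OK"
}

# Usage (as root on the backup server):
#   check_routing foobar@example.com foobarrr@example.com
```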
The DNS
Now remember to add your backup MX server to the DNS configuration. This is what mine looks like:
$ host example.com
example.com has address AAA.BBB.CCC.DDD
example.com mail is handled by 10 smtp.example.com.
example.com mail is handled by 20 mx.example.com.
Note that the lower the number, the higher the priority, so smtp.example.com (10) is tried before mx.example.com (20). This is a little confusing at first.
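Since the ordering is easy to get backwards, here is a small sketch that lists the MX hosts in the order a sending server would try them. It simply parses the host output format shown above; `mx_by_priority` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: print MX records from `host` output, most preferred first.
# Reads the output of `host DOMAIN` on stdin; mx_by_priority is an illustrative name.
mx_by_priority() {
    # Keep only the MX lines, print "preference hostname", and sort
    # numerically so the lowest number (highest priority) comes first.
    awk '/mail is handled by/ { print $(NF-1), $NF }' | sort -n
}

# Usage: host example.com | mx_by_priority
```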
Testing
Now, how do you test this? There is a wonderful utility for testing email servers, swaks. I did the tests from my workstation:
localhost ~ $ sudo emerge swaks
While running the tests, open a console on each server and watch the Exim log (tail -f /var/log/exim/exim_main.log). The first test checks that the backup server forwards emails to the main server:
localhost ~ $ swaks --to foobar@example.com --from toto@gmail.com --server mx.example.com
The second test simulates downtime of the main server. To do this I use iptables on the main server to block connections from the backup server to the SMTP port, and then send a message through the backup server:
smtp.example.com ~ # iptables -A INPUT -s BACKUP-SERVER-IP -p tcp --destination-port 25 -j DROP
localhost ~ $ swaks --to foobar@example.com --from toto@gmail.com --server mx.example.com
Wait a few minutes, and then open access again:
smtp.example.com ~ # iptables -D INPUT -s BACKUP-SERVER-IP -p tcp --destination-port 25 -j DROP
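Once access is open again, the message held on the backup server should be delivered at the next queue run. To confirm this without staring at the log, something like the following sketch could be run on the backup server. It polls exim -bpc, which prints the number of messages in the queue; `EXIM`, the function name and the retry numbers are arbitrary choices for the example.

```shell
#!/bin/sh
# Sketch: after reopening port 25, wait for the backup server's queue to drain.
# EXIM, the function name and the retry numbers are illustrative choices.
EXIM="${EXIM:-exim}"

# wait_for_empty_queue TRIES DELAY: poll `exim -bpc` (message count) up to
# TRIES times, DELAY seconds apart; return 0 once the queue is empty.
wait_for_empty_queue() {
    tries="$1"
    delay="$2"
    while [ "$tries" -gt 0 ]; do
        count=$($EXIM -bpc)
        if [ "$count" -eq 0 ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    echo "queue still holds $count message(s)"
    return 1
}

# Usage (as root on the backup server), forcing a queue run first:
#   exim -qf; wait_for_empty_queue 10 30
```

Note that this assumes the queue was empty before the test; on a busy server other queued messages would keep the count above zero.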
That’s all folks! Something wasn’t clear? Please drop a comment.