Yes, I should have configured backups a long time ago. Mea culpa. At any rate, they are running now. But there were bumps along the way, so allow me to present the good, the bad, and the bizarre of getting BackupPC up and saving my data.
The good
Once configured, it works great. As I type, two backups are humming right along. The interface for browsing backups is clear and intuitive. I’ll be testing a restore later and I expect it will work flawlessly.
The bad
Not much, really, though the bizarre things had me stuck so long I was thinking they were the bad.
The bizarre
More than a few. One or two in software, several in docware. I’ll cover only the worst ones that slowed me down as I set things up.
All of the documentation – and some of the code – seems to be written from the perspective of a mixed network (Linux and Windows) with considerable use of SMB. For example, if you indicate that a host is configured with DHCP, BackupPC defaults to querying it via nmblookup.
I beg your pardon? All of our user machines are Linux and all are configured with DHCP. Why on Earth would nmblookup be expected to work? This utterly bizarre default leads to bizarrity number two.
The documentation – and various HowTo’s and intros, etc. – mention that when configuring BackupPC with knowledge of client hosts, one can indicate that hosts are configured with DHCP, but that it isn’t really necessary to choose this, even if they are, and that it might be better not to choose it.
Point of fact: If you are not using SMB at all, then, as far as I can tell, you MUST NOT indicate that hosts are configured with DHCP. As soon as you indicate DHCP, nmblookup is the default way of looking up hosts – none of the normal Linux/UNIX goodness seems to work. In my case, network history may have played a role, so permit a brief digression.
Our network began as a consumer home network and has evolved over time to reflect my deepening knowledge, my security requirements, and my underlying “if it ain’t broke don’t fix it” philosophy. One effect of this is that we don’t have local DNS – it’s on my list of ToDos, but we haven’t needed it yet. But! A lot of home users are likely in the same position.
What does this mean in practical terms?
You MUST populate /etc/hosts on the BackupPC server, you MUST use static address assignments in your DHCP server (in order for /etc/hosts to remain valid), and you MUST leave the “DHCP” box unchecked for each host.
Otherwise, BackupPC cannot find your clients.
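Before adding hosts to BackupPC, it can help to confirm that every client really is listed in /etc/hosts on the server. The following is only a sketch of such a check. The function names and the client names in the usage note are made up for illustration, and the script only inspects the hosts file, it does not prove the addresses are correct.

```shell
#!/bin/sh
# Succeed if NAME appears as a hostname or alias in a hosts-format file.
# Usage: in_hosts_file NAME [FILE]   (FILE defaults to /etc/hosts)
in_hosts_file() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Print one status line per client name given on the command line.
check_clients() {
    for client in "$@"; do
        if in_hosts_file "$client"; then
            echo "ok       $client"
        else
            echo "MISSING  $client (add it to /etc/hosts)"
        fi
    done
}
```

After sourcing the file, something like “check_clients alpha beta” (hypothetical client names) lists which ones still need an entry.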
Once BackupPC finds your clients, it still may not be able to connect to them. That is, unless you are a certified Linux guru with long experience with rsync and ssh.
Why? Because the documentation does not describe briefly and succinctly how BackupPC connects to a Linux client and how you have to configure that client. This was the “bizarre” that was almost a “bad”.
This is what the documentation – and the various HowTo’s, etc. – should say:
BackupPC needs to be able to connect to each Linux client automatically, without human intervention. The software defaults to attempting to do this securely using SSH and automatic key-based login. This is a good thing – it’s worthwhile doing the work to enable this sound default. This process has to be boot-strapped with a manual login and some copying of the public key file from the BackupPC server to each client machine. And the key pair has to be created in the first place!
Step zero is to install BackupPC on the server, identify all of your client hosts, choose what data is to be backed up and where, etc. Once BackupPC is installed and seems ready to go, follow this procedure.
- Become root on the BackupPC host. E.g., “% su”, or “% sudo su”, or whatever gets you there.
- Become the BackupPC user: “$ su backuppc”
- Create an RSA key pair for BackupPC: “% ssh-keygen -t rsa” (You may want to read up on SSH – “man ssh” – to learn how to create stronger key pairs or use other algorithms. You will also need to decide whether or not to use passphrases; “no” is more convenient, “yes” is more secure, but requires use of ssh-add and the SSH agent.)
- Copy the public key to a convenient file, e.g., “% cp ~backuppc/.ssh/id_rsa.pub back.key”. Very important: The next few steps have to be repeated on each client machine.
- Make sure sshd – or equivalent – is installed on the client machine. For example, login to the client machine at its console, and, for Debian-based systems, including Ubuntu, enter this command: sudo apt-get install openssh-server
- From the backuppc host, copy the file back.key to root’s home directory on the client machine, e.g., “% scp back.key root@clientMachine:back.key”. If this is the first time you connect from the backuppc host to the client machine, you will be asked to confirm the client machine’s host key. You will be prompted for root’s password. Enter it, and the copy should happen very quickly.
- Login to the client machine as root: “% ssh root@clientMachine”. You will be prompted for root’s password. Enter it, and proceed to the next and most crucial step.
- Using the copied RSA public key, enable the backuppc user to log in automatically as root on clientMachine without entering a password. This is done by appending the public key to root’s list of authorized keys: “$ cat back.key >> ~/.ssh/authorized_keys”. (If ~/.ssh does not exist yet, create it first.) Log out of the client machine.
- Login to the client machine as root: “% ssh root@clientMachine”.
If steps 5 through 8 worked as they should, you should login automatically, without being prompted for a password. (Refer to comments re passphrases and SSH agent, above.)
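The append step is the easiest one to get subtly wrong: run it twice and the key is duplicated, and sshd in its default configuration ignores an authorized_keys file with loose permissions. Here is a sketch of a more careful version. The function name is my own invention, and it assumes the public key file holds a single key line, as back.key does above.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already there.
# Usage: add_authorized_key PUBKEY_FILE [AUTHORIZED_KEYS_FILE]
add_authorized_key() {
    pubkey=$1
    auth=${2:-$HOME/.ssh/authorized_keys}
    dir=$(dirname "$auth")

    # Create the directory privately if missing; leave an existing one alone.
    [ -d "$dir" ] || mkdir -p -m 700 "$dir" || return 1
    touch "$auth" && chmod 600 "$auth" || return 1

    # -F fixed string, -x whole-line match, -f read the pattern from the key file
    if ! grep -qxFf "$pubkey" "$auth"; then
        cat "$pubkey" >> "$auth"
    fi
}
```

On the client, as root, “add_authorized_key back.key” would then replace the plain cat. OpenSSH also ships ssh-copy-id, which does the copy and the append in one go from the server side, e.g., “ssh-copy-id -i ~/.ssh/id_rsa.pub root@clientMachine”.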
Now backuppc is ready to connect to start backups. To verify this, open a browser, connect to your backuppc server, e.g., http://backuppc/backuppc, choose one of your hosts, and click on “Start full backup”. Refresh the page and backuppc should tell you that the backup is running.
If the backup fails, check the logs for errors.
If you ever see the log entry “Got fatal error during xfer (Unable to read 4 bytes)”, then backuppc is not able to connect automatically as root. You need to verify steps 5 through 8 above to make sure that backuppc will not be prompted for root’s password and to make sure the client machine’s SSH host key has not changed.
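A quick way to test this without waiting for a backup to fail is to ask ssh to refuse all prompting. This is a sketch: the function name and the SSH override variable are hypothetical conveniences of mine, while BatchMode and ConnectTimeout are standard OpenSSH client options. It has to be run as the backuppc user on the BackupPC server, since that is the account whose key matters.

```shell
#!/bin/sh
# Succeed only if ssh can log in to TARGET without asking for anything.
# BatchMode=yes makes ssh fail instead of prompting for a password or
# passphrase; ConnectTimeout=5 keeps an unreachable host from hanging.
# SSH is an optional override for the ssh binary (an assumption, handy
# for testing); it defaults to plain "ssh".
can_login_unattended() {
    "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

For example, “can_login_unattended root@clientMachine && echo ready”. If it fails, running the same ssh command without BatchMode should show which prompt is getting in the way.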
Anything else? Well, I had some trouble moving the backuppc home directory from its default of /var/lib/backuppc to /home/backuppc, but that might have been finger trouble. In a nutshell, here is the distillation of the steps I took – that is, this is what I figured out the procedure to be, after poking about for a while. This is abbreviated: if you don’t understand what a step is doing, don’t do any of them! (I may update this later with background, but for now, well, caveat emptor.)
$ /etc/init.d/backuppc stop
$ mv /var/lib/backuppc /home
(depending on your OS and filesystem, you may want to use “cp -dpR” instead)
$ usermod --home /home/backuppc backuppc
$ vi /etc/backuppc/config.pl – around about line 315, change “$Conf{TopDir} = '';” to
“$Conf{TopDir} = '/home/backuppc';” and “$Conf{LogDir} = '';” to “$Conf{LogDir} = '/home/backuppc/log';”
$ /etc/init.d/backuppc start
I don’t like making the suggestion of editing config.pl, but try as I might, I was unable to move the data directory without doing this. The problem was the LogDir setting: I could override TopDir successfully using the web interface, but not LogDir. Whenever I tried starting BackupPC, I would get errors about not being able to create /var/lib/backuppc/log and the startup script would exit.
Note also that I haven’t gone back and validated this procedure with a clean install. My install works, it ain’t broke, I ain’t touchin’ it unless I need to!
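If you would rather not do the config.pl edit by hand, it could be scripted along these lines. This is an untested-on-a-real-install sketch, not part of BackupPC: the function name is made up, and it assumes each setting sits on its own line in the form $Conf{KEY} = '...'; and that the new value contains no “|”, “&”, or backslash characters. It keeps a .bak copy of the file next to the original.

```shell
#!/bin/sh
# Set a single-quoted string setting in a BackupPC-style config.pl.
# Usage: set_conf_string FILE KEY VALUE
# Assumes a line of the form:  $Conf{KEY} = '...';
# Lines that do not match are left untouched.
set_conf_string() {
    file=$1 key=$2 value=$3
    cp -p "$file" "$file.bak" || return 1
    # Rewrite from the backup into the original so its ownership and mode stay put.
    sed "s|^\\\$Conf{$key}[[:space:]]*=[[:space:]]*'[^']*';|\$Conf{$key} = '$value';|" \
        "$file.bak" > "$file"
}
```

With BackupPC stopped, “set_conf_string /etc/backuppc/config.pl TopDir /home/backuppc” followed by the same for LogDir and /home/backuppc/log would correspond to the vi step above.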
Final bizarrity: If you see the message “because HOST has been on the network at least 7 consecutive times, it will not be backed up from” followed by a schedule, then BackupPC has decided that HOST is always on (possibly a server) and that backups of this machine should default to “outside of business hours”. The stated intent is to reduce network load during business hours. I’ve disabled this – more on this below – for my own machines and am not sure how I feel about it in general.
For example, if a machine is a server, business hours are when files are changing the most and performing a backup during periods of high change may cause backup inconsistencies. On the other hand, periods of high change are exactly the time when users might need frequent backups all the more. And of course running a backup on a loaded server may degrade service!
The best approach is likely to take backups when the server is quiet(er) and to ensure that users make use of a revisioning system, preferably one backed by a SAN or a RAID array, etc. This addresses the problems of backup inconsistency and PEBCAD/OOPS during periods of frequent change.
My choice was to disable this blackout period by default (“Edit Config”, “Schedule”, set BlackoutGoodCnt to -1) and to enable it for the one machine to which it really applies (a web server; select the host, “Edit Config” at the top of the list, “Schedule”, click override and set BlackoutGoodCnt to 7).
What kind of machine is this, anyway?
As for the server assumption, that’s a little bizarre when you consider that BackupPC is written by Linux geeks and the primary audience is likely Linux geeks (either Geeks@Home, like me, or Geeks@Work who’ve been directed by the PHB to do something about backups). Why bizarre?
Because many geeks are also gamers, many gamers need a Windows box, and many Linux geek gamers use dual-boot to solve this problem. BackupPC was able to ping one of my machines 7 consecutive times because it was being used for gaming (a long day’s gaming :->). It assumed this meant the machine was “always on” and therefore decided it would be backed up only evenings and weekends.
The logic is both understandable and bizarre. Slightly more understandable and less bizarre would have been “machine was pinged AND backups were possible” 7 consecutive times. Backups were NOT possible for this machine, because it is configured for rsync over SSH, which is available only when it runs Linux; when it runs Windows, I’d just as soon it not be on the net at all….
But even with that AND, the decision to back up outside of business hours seems a little heavy handed. After all, when one configures backups, one’s desire is to back up data. IMHO, backing it up as available should be the default, exceptions for schedules or host classes should be just that, exceptions. Decisions about exceptions should be made by humans, not software.
Some design suggestions
- A better alternative would be to implement machine classes, that is, the ability to tag machines as “Linux desktop”, “Linux laptop”, “Linux server”, “Windows desktop”, “LAMP server”, etc., and to then apply intelligent defaults.
For example, default LAMP server to rsync over SSH backups evenings and weekends. Default Windows desktop to nmblookup and smbclient. Etc.
- Flag exceptions or take note of what might be exceptional circumstances and present the administrator with choices, instead of implementing an exceptional behaviour, such as the blackout. “This machine appears to be always on – would you like to schedule its backups for evenings and weekends only?”
- Provide a simple tool for moving the BackupPC pool or home directory. These are exceptional activities: Exceptional activities require the best documentation and the best tools, because they are the things we do the least, and often the things we most need to get right. That’s why dialing emergency services is so much easier than dialing your friend across the country, despite the fact that you do the latter much more often than the former.
- EITHER rename the “DHCP” column to “SMB” or something else that better reflects what the software does OR change the default behaviour. This requires a bit of thought, but it might be as simple as a little wizard. For example, the administrator enters a hostname and BackupPC attempts a ping, resolving first with gethostbyname then by nmblookup. If the ping works, BackupPC records the host with intelligent defaults, e.g., rsync for gethostbyname, smbclient for nmblookup. If the ping fails, then BackupPC asks a few questions to resolve the situation (“should I be able to connect now”, “how should I resolve the name”, “do you want to enter the IP and have me add this to /etc/hosts”, etc.)
- Display logs most-recent-first. Usually I want to see the most recent item first, so save me the trouble, small as it is, of scrolling to the end….