When I had to set up a backup system on my server, I asked myself: what can I use?

The simplest choice would have been a daily copy of the folder; at the other end, the most obvious one would have been rsync.

The main disadvantage of a simple daily copy is that it is not incremental. In the long run it needs a lot of disk space if you want to keep the old backups. The main advantages are that:

  • managing the snapshots (getting the snapshot of a certain day, or removing all the backups up to a certain date) is super easy
  • implementation is trivial.
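To show why snapshot management is so easy with plain daily copies, here is a minimal sketch in Python. It assumes each daily copy lives in a folder named after its date (YYYY-MM-DD); that naming scheme and the function name are my own assumptions, not something the setup above prescribes.

```python
from datetime import date


def snapshots_older_than(names, cutoff):
    """Return the snapshot folder names dated strictly before `cutoff`.

    Assumes (hypothetically) that each daily copy is stored in a folder
    named YYYY-MM-DD. Names that are not valid dates are ignored.
    """
    old = []
    for name in names:
        try:
            day = date.fromisoformat(name)
        except ValueError:
            continue  # not a dated snapshot folder, leave it alone
        if day < cutoff:
            old.append(name)
    return sorted(old)
```

The returned names could then be passed to something like shutil.rmtree to free the space.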

The main pros of rsync are that:

  • it solves the disk space problem because it copies incrementally and saves only what has changed since the last synchronization
  • it’s quite easy to recover the snapshot of a certain date.

The cons are that:

  • it is hard to see what has changed from one backup to the next
  • it’s almost impossible to synchronize the development and production environments with the backup one.

At this point the question I asked myself was: why not use Git (which I already used between development and production) to create an integrated stack of the form development->production->backup?

And the answer was: “ok let’s do it”.

I considered using GitHub but, given that my backups contain private data, I needed a private repository, which on GitHub is not cheap (about $85 or €70 per year). So I opted for autarky and hosted the repositories myself.

That solution allowed me to maintain privacy and to always have the entire history on three different environments (dev, prod, bkp). With a couple of simple tools installed:

– it’s simple to create the Git repositories and manage access through SSH keys using gitolite
– I have a web frontend for browsing diffs and versions with gitweb

I only needed to add a cron job that, every night, commits all the changes on production and pushes them to the backup environment. The result is an integrated Git stack from development to backup.

Clearly this does not include the database, so the cron job creates a database dump inside the folder before committing it.

I also used this technique to back up email accounts and the /etc directory of my server.
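The nightly job can be sketched roughly as follows. This is not my actual script, just a minimal Python outline of the same steps: dump the database into the folder, then stage, commit and push. The remote name "backup", the branch "master", the dump file name and the dump command are all placeholders.

```python
import subprocess
from datetime import date
from pathlib import Path


def git_steps(message, remote="backup", branch="master"):
    """Git commands run after the dump: stage everything, commit, push.

    The remote and branch names are hypothetical defaults.
    """
    return [
        ["git", "add", "--all"],
        ["git", "commit", "--message", message],
        ["git", "push", remote, branch],
    ]


def nightly_backup(repo_dir, dump_cmd, dump_name="db_dump.sql"):
    """Dump the database into the working copy, then commit and push.

    `dump_cmd` is whatever command prints a dump on stdout, for example
    ["mysqldump", "mydb"] (a placeholder database name).
    Returns False when there was nothing new to commit.
    """
    repo = Path(repo_dir)
    # Write the dump inside the folder so that it gets versioned too.
    with open(repo / dump_name, "wb") as out:
        subprocess.run(dump_cmd, stdout=out, check=True)
    # Skip the commit when nothing changed: git commit would fail otherwise.
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    if not status.stdout.strip():
        return False
    for cmd in git_steps(f"Nightly backup {date.today().isoformat()}"):
        subprocess.run(cmd, cwd=repo, check=True)
    return True
```

Run once a night from cron, a script like this gives one commit per day on the backup remote, which makes each day's changes easy to browse.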

The only problem I’ll have to deal with, when the repository grows too much, is: how do I delete the history older than a certain date with Git? I am comforted that even with rsync this operation is not immediate. At worst, I will create a new repository from my working copy. 😉
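For the worst-case plan, a minimal Python sketch could look like this. It copies the working copy without its .git directory and starts a brand-new repository there, so all the old history is dropped rather than trimmed at a date. The function names and the commit message are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def copy_without_history(working_copy, new_dir):
    """Copy the working tree to `new_dir`, leaving any .git directory behind.

    `new_dir` must not exist yet (shutil.copytree creates it).
    """
    shutil.copytree(working_copy, new_dir, ignore=shutil.ignore_patterns(".git"))
    return Path(new_dir)


def start_fresh_repo(working_copy, new_dir, message="Fresh start from working copy"):
    """Create a new repository whose first commit is the current state."""
    repo = copy_without_history(working_copy, new_dir)
    for cmd in (
        ["git", "init"],
        ["git", "add", "--all"],
        ["git", "commit", "--message", message],
    ):
        subprocess.run(cmd, cwd=repo, check=True)
    return repo
```

The old repository can be archived somewhere cheap before switching to the new one, in case the old history is ever needed.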