networked day to day technical issues

7Jan/170

A Tool to Backup Files to Amazon S3

For the past year I've been working on and off on a little project to create a tool which:

  • runs on at least Linux, MacOS X and FreeBSD
  • allows to backup your files to Amazon S3 while providing optional server side encryption (AES-256)
  • is cost effective for large numbers of files (the problem with things like s3cmd or aws s3 sync is that they need to compare local files with metadata retrieved on the fly from AWS and this can get expensive)
  • is easy to install
  • provides meaningful error messages and the possibility to debug

I've ended up creating a tool called S3backuptool (yeah, not that original) which does the above and in order to run it requires Python 2.7 , PyCrypto and the Boto library.

Details are available on the project's page and it can be installed from prebuilt packages (deb or rpm) for several Linux distributions or from Python's PyPi for far more Linux distributions and OSes.

So far it's been quite the educative enterprise while also catering to my needs.

Metadata about all backed up files is stored locally in SQLite database(s) and in S3 as metadata for each uploaded file. When a backup job runs it compares the state of files with the one stored in the local SQLite database(s) and action is needed on S3 only then actual S3 api calls are performed (those cost money). In case the local SQLite databases are lost then they can be reconstructed from the S3 stored metadata.