Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Simple Unix Backup Tools

Backup in large corporations is one of the most badly managed areas of IT. Usually quite a bit of useless data is backed up, and a huge amount of money is thrown out of the window.

A more intelligent approach combines Ghost-like solutions, differential backups, and local backups on mirrored drives, which can reduce the frequency of backup to tape (and the associated costs) by an order of magnitude.

Backup types

Backups are usually run in one of three general forms:

Full backup

A full, or complete, backup saves all of the files on your system. Depending on circumstances, "all files" may mean all files on the system, all files on a physical disk, all files on a single partition, or all files that cannot be recovered from original installation media. Depending on the size of the drive being backed up, a full backup can take hours to complete.

Differential backup

Save only files that have been modified or created since the last full backup. Compared to full backups, differentials are relatively fast because of the reduced number of files written to the backup media. A typical differential scheme would include full backup media plus the latest differential media. Intermediate differential media are superseded by the latest and can be recycled.


Incremental backup
 

Save only files that have been modified or created since the last backup, including the last incremental backup. These backups are also relatively fast. A typical incremental backup would include full backup media plus the entire series of subsequent incremental media. All incremental media are required to reconstruct changes to the filesystem since the last full backup.

Typically, a full backup is coupled with a series of either differential backups or incremental backups, but not both. For example, a full backup could be run once per week with six daily differential backups on the remaining days. Using this scheme, a restoration is possible from the full backup media and the most recent differential backup media. Using incremental backups in the same scenario, the full backup media and all incremental backup media would be required to restore the system. The choice between the two is mainly a trade-off between the number of media needed for a restore (an incremental scheme requires the full backup plus the whole series of incrementals) and backup time (each differential backup takes longer than an incremental one, particularly on heavily used systems, because it saves everything changed since the last full backup).
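The difference between the three backup types comes down to which reference time a file's modification time is compared against. The following is a minimal Python sketch of that selection logic, not a real backup tool: the function names and the use of file modification times are assumptions made for illustration, and nothing is actually written to backup media.

```python
import os
from pathlib import Path


def files_changed_since(root, since_ts):
    """Return files under root modified after since_ts (seconds since the epoch)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.stat().st_mtime > since_ts:
                changed.append(path)
    return sorted(changed)


def select_files(root, kind, last_full_ts, last_backup_ts):
    """Pick the files that a backup of the given kind would save.

    kind           -- "full", "differential", or "incremental"
    last_full_ts   -- time of the last full backup
    last_backup_ts -- time of the last backup of any kind
    """
    if kind == "full":
        # Everything, regardless of modification time.
        return files_changed_since(root, float("-inf"))
    if kind == "differential":
        # Everything changed since the last FULL backup.
        return files_changed_since(root, last_full_ts)
    if kind == "incremental":
        # Only what changed since the last backup of ANY kind.
        return files_changed_since(root, last_backup_ts)
    raise ValueError(f"unknown backup kind: {kind!r}")
```

Because a differential always compares against the last full backup, its file list grows through the week, while each incremental list stays small but is useless without all of its predecessors.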

For large organizations that require retention of historical data, a backup scheme longer than a week is created. Incremental or differential backup media are retained for a few weeks, after which the tapes are reformatted and reused. Full backup media are retained for an extended period, perhaps permanently. At the very least, one full backup from each month should be retained for a year or more.

A backup scheme such as this is called a media rotation scheme, because media are continually written, retained for a defined period, and then reused. The media themselves are said to belong to a media pool, which defines the monthly full, the weekly full, and differential or incremental media assignments, as well as when media can be reused. When media with full backups are removed from the pool for long-term storage, new media join the pool, keeping the size of the pool constant. Media may also be removed from the pool if your organization chooses to limit the number of uses media are allowed, assuming that reliability goes down as the number of passes through a tape mechanism increases.
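A rotation scheme of this kind can be expressed as two small rules: what kind of backup to run on a given day, and when the media written on a given day may return to the pool. The Python sketch below is only an illustration. The choice of Sunday for full backups and the retention periods (three weeks, eight weeks, one year) are assumptions chosen to match the "few weeks" and "a year or more" guidance in the text, not values taken from any particular product.

```python
from datetime import date, timedelta

FULL_WEEKDAY = 6  # assumption: full backups run on Sunday (Monday is 0)


def backup_kind(day):
    """Weekly full backup on one weekday, differential on the other six."""
    return "full" if day.weekday() == FULL_WEEKDAY else "differential"


def is_monthly_full(day):
    """True for the first full backup of a calendar month."""
    return day.weekday() == FULL_WEEKDAY and day.day <= 7


def can_reuse(written_on, today, daily_weeks=3, weekly_weeks=8, monthly_days=365):
    """Decide whether media written on written_on may rejoin the media pool."""
    age = today - written_on
    if is_monthly_full(written_on):
        # Monthly fulls are kept longest (assumed here: one year).
        return age > timedelta(days=monthly_days)
    if backup_kind(written_on) == "full":
        # Other weekly fulls are kept for a couple of months (assumed).
        return age > timedelta(weeks=weekly_weeks)
    # Differential media are recycled after a few weeks.
    return age > timedelta(weeks=daily_weeks)
```

A site that retires media after a fixed number of passes would add a per-tape use counter on top of these date rules.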

Your organization's data storage requirements dictate the complexity of your backup scheme. On systems in which many people frequently update mission-critical data, a conservative and detailed backup scheme is essential. For casual-use systems, such as desktop PCs, only a basic backup scheme is needed, if at all.

Backup verification

To be effective, backup media must be capable of yielding a successful restoration of files. To ensure this, a backup scheme must also include some kind of backup verification in which recently written backup media are tested for successful restore operations. This could take the form of a comparison of files after the backup, an automated restoration of a select group of files on a periodic basis, or even a random audit of media on a recurring basis. However the verification is performed, it must prove that the media, tape drives, and programming will deliver a restored system. Proof that your backups are solid and reliable ensures that they will be useful in case of data loss.
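One simple form of the file comparison mentioned above is to restore a sample into a scratch directory and compare checksums against the originals. The Python sketch below assumes the restore has already been done by whatever backup tool is in use; the function names are hypothetical, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_root, restored_root):
    """Return (relative_path, problem) pairs for files that did not restore intact."""
    original_root = Path(original_root)
    restored_root = Path(restored_root)
    problems = []
    for src in sorted(original_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_root)
        restored = restored_root / rel
        if not restored.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif sha256_of(src) != sha256_of(restored):
            problems.append((rel.as_posix(), "content differs"))
    return problems
```

An empty result means every sampled file came back byte for byte. Files that legitimately changed after the backup was taken will also show up as differing, so a check like this is best run right after the backup or against a quiet directory.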




Old News ;-)

Creating Snapshot-Backups with BackerUpper On Ubuntu 9.04 HowtoForge - Linux Howtos and Tutorials

BackerUpper is a tool similar to Apple's TimeMachine. It is intended to create snapshot-backups of selected directories or even your full hard drive. From the BackerUpper project page: "Backerupper is a simple program for backing up selected directories over a local network. Its main intended purpose is backing up a user's personal data." This article shows how to install and use BackerUpper on Ubuntu 9.04 (Jaunty Jackalope).

10 outstanding Linux backup utilities 10 Things TechRepublic.com by Jack Wallen

July 21st, 2009

Whether you’re in the IT industry or you’re a computer power user, you need to have a backup tool at the ready. With this tool, you will need scheduled backups, one-time backups, local backups, remote backups, and many other features.

Plenty of proprietary solutions are out there. Some of them are minimal and cost effective, while others are feature-rich and costly. The open source community is no stranger to the world of backups. Here are 10 excellent backup solutions for the Linux operating system. In fact, some of these are actually cross platform and will back up Linux, Windows, and/or Mac.

  1. fwbackups This is, by far, the easiest of all the Linux backup solutions. It is cross platform, has a user-friendly interface, and can do single backups or recurring scheduled backups. The fwbackups tool allows you to do backups either locally or remotely in tar, tar.gz, tar.bz2, or rsync format. You can back up an entire computer or a single file. Unlike many backup utilities, fwbackups is easy to install because it will most likely be found in your distribution’s repository. Both backing up and restoring are incredibly easy (even setting up a remote, recurring scheduled backup). You can also do incremental or differential backups to speed the process.
  2. Bacula Bacula is a powerful Linux backup solution, and it’s one of the few Linux open source backup solutions that’s truly enterprise ready. But with this enterprise readiness comes a level of complexity you might not find in any other solution. Unlike many other solutions, Bacula contains a number of components:
    • Director: This is the application that supervises all of Bacula.
    • Console: This is how you communicate with the Bacula Director.
    • File: This is the application that’s installed on the machine to be backed up.
    • Storage: This application performs the reading and writing to your storage space.
    • Catalog: This application is responsible for the databases used.
    • Monitor: This application allows the administrator to keep track of the status of the various Bacula tools.

    Bacula is not the easiest backup solution to configure and use. It is, however, one of the most powerful. So if you are looking for power and aren’t concerned about putting in the time to get up to speed with the configuration, Bacula is your solution.

  3. Rsync Rsync is one of the most widely used Linux backup solutions. With rsync, you can do flexible incremental backups, either locally or remotely. Rsync can update whole directory trees and file systems; preserve links, ownerships, permissions, and privileges; use rsh, ssh, or direct sockets for connection; and support anonymous connections. Rsync is a command-line tool, although front ends are available (such as Grsync<http://freshmeat.net/projects/grsync/>). But the front ends defeat the flexibility of having a simple command-line backup tool. One of the biggest pluses of using a command-line tool is that you can create simple scripts to use, in conjunction with cron, to create automated backups. For this, rsync is perfect.
  4. Mondorescue Mondorescue is one of those tools you have around for disaster recovery because one of its strengths is backing up an entire installation. Another strength of Mondorescue is that it can back up to nearly any medium: CD, DVD, tape, NFS, hard disk, etc. And Mondo supports LVM 1/2, RAID, ext2, ext3, ext4, JFS, XFS, ReiserFS, and VFAT. If your file system isn’t listed, there is a call on the Mondo Web site to email the developers for a file system request and they will make it work. Mondo is used by large companies, such as Lockheed-Martin, so you know it’s reliable.
  5. Simple Backup Solution Simple Backup Solution is primarily targeted at desktop backup. It can back up files and directories and allows regular expressions to be used for exclusion purposes. Because Simple Backup Solution uses compressed archives, it is not the best solution for backing up large amounts of pre-compressed data (such as multimedia files). One of the beauties of Simple Backup Solution is that it includes predefined backup solutions that can be used to back up directories, such as /var/, /etc/, /usr/local. SBS is not limited to predefined backups. You can do custom backups, manual backups, and scheduled backups. The user interface is user friendly. One of the downfalls of SBS is that it does not include a restore solution like fwbackups does.
  6. Amanda Amanda allows an administrator to set up a single backup server and back up multiple hosts to it. It’s robust, reliable, and flexible. Amanda uses native Linux dump and/or tar to facilitate the backup process. One nice feature is that Amanda can use Samba to back up Windows clients to the same Amanda server. It’s important to note that with Amanda, there are separate applications for server and client. For the server, only Amanda is needed. For the client, the Amanda-client application must be installed.
  7. Arkeia Arkeia is one of the big boys in the backup industry. If you are looking for enterprise-level backup-restore solutions (and even replication server solutions) and you don’t mind paying a premium, Arkeia is your tool. If you’re wondering about price, the Arkeia starter pack is $1,300.00 USD, which should indicate the seriousness of this solution. Although Arkeia says it has small to midsize solutions, I think Arkeia is best suited for large business to enterprise-level needs.
  8. Back In Time Back In Time allows you to take snapshots of predefined directories and can do so on a schedule. This tool has an outstanding interface and integrates well with GNOME and KDE. Back In Time does a great job of creating dated snapshots that will serve as backups. However, it doesn’t use any compression for the backups, nor does it include an automated restore tool. This is a desktop-only tool.
  9. Box Backup Box Backup is unique in that not only is it fully automated but it can use encryption to secure your backups. Box Backup uses both a client daemon and server daemon, as well as a restore utility. Box Backup uses SSL certificates to authenticate clients, so connections are secure. Although Box Backup is a command-line solution, it is simple to configure and deploy. Data directories are configured, the daemon scans those directories, and if new data is found, it is uploaded to the server. There are three components to install: bbstored (backup server daemon), bbackupd (client daemon), and bbackupquery (backup query and restore tool). Box Backup is available for Linux, OpenBSD, Windows (Native only), NetBSD, FreeBSD, Darwin (OS X), and Solaris.
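The rsync entry in the list above notes that a command-line tool is easy to wrap in a small script and run from cron. The Python sketch below shows one way such a wrapper might look. It is an illustration only: the paths and the dated-snapshot layout are hypothetical, and it relies on rsync's `-a`, `--delete`, and `--link-dest` options, the last of which hard-links unchanged files to a previous snapshot so each run only stores what changed.

```python
import subprocess
from datetime import date


def rsync_command(source, dest_root, today, previous=None):
    """Build the rsync argument list for a dated snapshot under dest_root."""
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Unchanged files become hard links into the previous snapshot.
        cmd.append(f"--link-dest={dest_root}/{previous.isoformat()}")
    # Trailing slash on the source: copy its contents, not the directory itself.
    cmd += [f"{source.rstrip('/')}/", f"{dest_root}/{today.isoformat()}/"]
    return cmd


def run_backup(source, dest_root, today, previous=None):
    """Run rsync; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(rsync_command(source, dest_root, today, previous), check=True)
```

A cron job would call `run_backup` once a day with yesterday's date as `previous`. The destination can also be a remote `host:/path` when rsync runs over ssh, although `--link-dest` then refers to a directory on the receiving side.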

[Apr 6, 2009] backup-manager A Command Line Backup Tool

Written in bash and perl: that means it is pretty flexible
January 1, 2008  |  http://www2.backup-manager.org/

This is a backup program, designed to help you make daily archives of your file system.

Written in bash and perl, it can make tar, tar.gz, tar.bz2, and zip archives and can be run in a parallel mode with different configuration files. Other archives are possible: MySQL or SVN dumps, incremental backups…
Archives are kept for a given number of days and the upload system can use FTP, SSH or RSYNC to transfer the generated archives to a list of remote hosts.

The program is designed to be as easy to use as possible and is popular with desktop users and sysadmins. The whole backup process is defined in one fully documented configuration file which needs no more than 5 minutes to tune for your needs. It just works.

[Mar 10, 2009] rdiff-backup

rdiff-backup backs up one directory to another. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special directory so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup can also operate in a bandwidth-efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back up to a remote location, and only the differences will be transmitted. It can also handle symlinks, device files, permissions, ownership, etc., so it can be used on the entire file system.

[Mar 10, 2009] rsync

rsync user
by Richard Harris - Aug 28th 2005 15:08:19

I am the author of Walker, a Python script for uploading sites via ftp and scp. Walker is very good for maintaining sites of moderate size, for use over slow connections, for users with limited resources, and for users who need customized control over the upload.

For some time I maintained two and three sites using Walker. Now I am maintaining over ten sites and their related project files. I use rsync exclusively, called from python and ruby scripts which handle mirrored and standardized directory structures across the sites; in other words, the sites and dev dirs all follow the same pattern. In this way, I am able to easily maintain HTML, data, and cgi-bin files and to back up and restore the web sites and the project development files.

freshmeat.net Project details for DirSync

DirSync is a directory synchronizer that takes a source and destination directory as arguments and recursively ensures that the two directories are identical. It can be used to create incremental copies of large chunks of data. For example, if your file server's contents are in the directory /data, you can make a copy in a directory called /backup with the command "dirsync /data /backup". The first time you run it, all data will be copied. On subsequent runs, only the changed files are copied.
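The "copy everything the first time, only changes afterwards" behaviour can be sketched in a few lines of Python. This is not DirSync's code or algorithm, just an illustration of the idea using the standard library; the function name is hypothetical, and unlike a true synchronizer this sketch never deletes files from the destination.

```python
import filecmp
import shutil
from pathlib import Path


def sync_dir(source, destination):
    """Copy new or changed files from source to destination.

    Returns the relative paths of the files that were copied.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src in sorted(source.rglob("*")):
        rel = src.relative_to(source)
        dst = destination / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        elif not dst.exists() or not filecmp.cmp(src, dst, shallow=True):
            # shallow=True treats files with the same size and mtime as equal
            # and only falls back to comparing contents when they differ.
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves the modification time
            copied.append(rel.as_posix())
    return copied
```

Calling `sync_dir("/data", "/backup")` twice in a row copies everything on the first call and nothing on the second, because `copy2` keeps the timestamps that the shallow comparison relies on.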

[Sep 9, 2008] GNU ddrescue 1.9-pre1 (Development) by Antonio Diaz Diaz

About: GNU ddrescue is a data recovery tool. It copies data from one file or block device (hard disc, cdrom, etc) to another, trying hard to rescue data in case of read errors. GNU ddrescue does not truncate the output file if not asked to. So, every time you run it on the same output file, it tries to fill in the gaps. The basic operation of GNU ddrescue is fully automatic. That is, you don't have to wait for an error, stop the program, read the log, run it in reverse mode, etc. If you use the logfile feature of GNU ddrescue, the data is rescued very efficiently (only the needed blocks are read). Also you can interrupt the rescue at any time and resume it later at the same point.

Changes: The new option "--domain-logfile" has been added. This release is also available in lzip format. To download the lzip version, just replace ".bz2" with ".lz" in the tar.bz2 package name.

[Aug 26, 2008] Easy local and remote backup of your home network - Lone Wolves - Web, game, and open source development

2006-09-20

I hate making backups by hand. It costs a lot of time and usually I have far better things to do. Long ago (in the Windows 98 era) I made backups to CD only before I needed to reïnstall the OS, which was about once every 18 months, and my code projects maybe twice as often. A lot has changed since those dark times though. My single PC expanded into a network with multiple desktops and a server, I installed a mix of Debian and Ubuntu and ditched Windows, and I have a nice broadband link - just as my friends do. Finally a lazy git like me can set up a decent backup system that takes care of itself, leaving me time to do the "better" things (such as writing about it :-)

There are already quite a few tutorials on the internet explaining various ways to backup your Linux system using built-in commands and a script of some sorts, but I could not find one that suited me so I decided to write another one - one that takes care of backing up my entire network.

[Feb 08, 2008] freshmeat.net Project details for ns4

ns4 is a configuration management tool which allows the automated backup of just about anything, but it was designed for routers and switches. If you are able to log into it through a CLI, you can back it up. Commands are defined within a configuration file, and when they are executed, the output is sent to a series of FTP servers for archiving. As well as archiving configurations, it allows scripts to be run on nodes; this allows configurations to be applied en masse and allows conditional logic so different bits of scripts are run on different nodes.

Recommended Links