May the source be with you, but remember the KISS principle ;-)
Contents Bulletin Scripting in shell and Perl Network troubleshooting History Humor

Solaris Backup/Restore

News Recommended Links Recommended Books Backup media and utilities Recovering file systems Flash archives  
tar dd cpio pax ufsdump ufsrestore


volcopy compress
gzip zip bzip2 rar Tips

"Native" backup and recovery in Solaris show signs of 30 years of Unix development. Frankly speaking it is potpourri of  almost a dozen partially incompatible utilities plus equal amount of GNU clones if you install them (gzip is installed by default). Many utilities duplicate each other and none is very competitive with the best Windows backup and recovery tools (as exemplified by rar and Ghost to name a few; there are close of Ghost for linux).

Most native Solaris tools support ACLs, most GNU tools don't (although in some combinations like Solaris tar + GNU gzip (tar.gz or tgz archives) can support them. Solaris' approach to handling ACLs in cpio and tar archives is using two files of the same name. The first file held the ACL info, and the second file was the actual file.  If you weren't running on Solaris, the second file would simply overwrite the first one, so the format was, for all intents and purposes, 100% backward compatible.

Here is a "slightly skeptical" characterization of some:

Comparing tar, cpio, and dump

There is a very old paper by John Pezzano from Hewlett-Packard comparing three backup utilities:





Simplicity of invocation

Very simple

(tar c files)

Needs find to specify filenames

Simple—few options

Recovery from I/O errors

None—write your own utility

Resync option on HP-UX will cause some data loss

Automatically skips over bad section

Back up special files

Later revisions



Multivolume backup

Later revisions



Back up across network

Using rsh only

Using rsh only


Append files to backup

Yes (tar -r)



Multiple independent backups on single tape




Ease of listing files on the volume

Difficult—must search entire backup

(tar -t)

Difficult—must search entire backup

(cpio -it)

Simple—index at front

(restore -t)

Ease and speed of finding a particular file

Difficult—no wildcards, must search entire volume

Moderate—wildcards, must search entire volume

Interactive—very easy with commands like cd, ls

Incremental backup


Must use find to locate new/modified files

Incremental of whole filesystem only, multiple levels

List files as they are being backed up

tar cvf 2>logfile

cpio -v 2>logfile

Only after backup with restore -t >logfile

(dump can show % complete, though)

Back up based on other criteria


find can use multiple criteria


Restore absolute pathnames to relative location

Only by using chroot

Limited with cpio -I

Always relative to current working directory

Interactive decision on restore

Yes or no possible with tar -w

Can specify new path or name on each file

Specify individual files in interactive mode


Multiple platform

Multiple platform with ASCII header, not always portable

Readable between some platforms, but cannot be relied on

Primary usefulness

Individual user backup, transfer files between filesystems

System backup, transfer files between filesystems

System backup

Volume efficiency

Medium, usually limited to 10 K block size

Medium, usually only 5 K block size, but can specify larger size on some OSes

High, can usually specify up to maximum block size of device

Wildcards on restore



Only in interactive mode

Simplicity of selecting files for backup from numerous directories

Low—must specify each independent directory, subdirectories included

Medium—find options

None—will back up one and only one filesystem

Specifying directory on restore get files in that directory


No—must use path/*


Stop reading tape after a restored file is found



Will stop reading tape as soon as last file is found

Track deleted files



If you restore with -r, files deleted before last incremental dump will be deleted

Filesystem efficiency


Worst (files get a stat from both find and cpio)


Limit on path length(tests done with Solaris native utilities 7/99)

155 characters. Complains "prefix is greater than 155 characters." gtar has slight workaround

255 characters. Doesn't complain. Just truncates pathname to 255 chars

1056 characters.

Likelihood that file exists in TOC but not in archive



Medium (since TOC is made first)


Standard Unix backup utilities may not be very sexy or even full of features, but if you get to know them, they will always be there. Some of the "seminative" commands (e.g., tar) are also very helpful. Therefore, a good working knowledge of the truly native commands can come in very handy when you're in a jam or when someone hands you an unknown volume and says "Can you read this?"
Top Visited
Past week
Past month


Old News ;-)

[Aug 29, 2017] -- A script to backup the /etc directory

This is simple script that generated "dot" progression lines. Backup name includes a timestamp. No rotation implemented.
Aug 29, 2017 |
# Script to backup the /etc heirarchy
# Written 4/2002 by Wayne Pollock, Tampa Florida USA
#  $Id: backup-etc,v 1.6 2004/08/25 01:42:26 wpollock Exp $
# $Log: backup-etc,v $
# Revision 1.6  2004/08/25 01:42:26  wpollock
# Changed backup name to include the hostname and 4 digit years.
# Revision 1.5  2004/01/07 18:07:33  wpollock
# Fixed dots routine to count files first, then calculate files per dot.
# Revision 1.4  2003/04/03 08:10:12  wpollock
# Changed how the version number is obtained, so the file
# can be checked out normally.
# Revision 1.3  2003/04/03 08:01:25  wpollock
# Added ultra-fancy dots function for verbose mode.
# Revision 1.2  2003/04/01 15:03:33  wpollock
# Eliminated the use of find, and discovered that tar was working
# as intended all along!  (Each directory that find found was
# recursively backed-up, so for example /etc, then /etc/mail,
# caused /etc/mail/ to be backuped three times.)
# Revision 1.1  2003/03/23 18:57:29  wpollock
# Modified by Wayne Pollock:
# Discovered not all files were being backed up, so
# added "-print0 --force-local" to find and "--null -T -"
# to tar (eliminating xargs), to fix the problem when filenames
# contain metacharacters such as whitespace.
# Although this now seems to work, the current version of tar
# seems to have a bug causing it to backup every file two or
# three times when using these options!  This is still better
# than not backing up some files at all.)
# Changed the logger level from "warning" to "error".
# Added '-v, --verbose' options to display dots every 60 files,
# just to give feedback to a user.
# Added '-V, --version' and '-h, --help' options.
# Removed the lock file mechanism and backup file renaming
# (from foo to foo.1), in favor of just including a time-stamp
# of the form "yymmdd-hhmm" to the filename.


# The backups should probably be stored in /var somplace:
TIMESTAMP=$(date '+%Y%m%d-%H%M')

VERSION=$(echo $Revision: 1.6 $ |awk '{print$2}')

{  echo "This script creates a full backup of /etc via tar in $REPOSITORY."
   echo "Usage: $PROG [OPTIONS]"
   echo '  Options:'
   echo '    -v, --verbose   displays some feedback (dots) during backup'
   echo '    -h, --help      displays this message'
   echo '    -V, --version   display program version and author info'

{  MAX_DOTS=50
   NUM_FILES=`find /etc|wc -l`
   bold=`tput smso`
   norm=`tput rmso`
   tput sc
   tput civis
   echo -n "$bold(00%)$norm"
   while read; do
      let "cnt = (cnt + 1) % FILES_PER_DOT"
      if [ "$cnt" -eq 0 ]
         let '++num_dots'
         let 'percent = (100 * num_dots) / MAX_DOTS'
         [ "$percent" -gt "100" ] && percent=100
         tput rc
         printf "$bold(%02d%%)$norm" "$percent"
         tput smir
         echo -n "."
         tput rmir
   tput cnorm

# Command line argument processing:
while [ $# -gt 0 ]
   case "$1" in
      -v|--verbose)  VERBOSE=on; ;;
      -h|--help)     usage; exit 0; ;;
      -V|--version)  echo -n "$PROG version $VERSION "
                     echo 'Written by Wayne Pollock '
                     exit 0; ;;
      *)             usage; exit 1; ;;

trap "rm -f $ERRMSGS" EXIT

cd /etc

# create backup, saving any error messages:
if [ "$VERBOSE" != "on" ]
    tar -cz --force-local -f $FILE . 2> $ERRMSGS 
    tar -czv --force-local -f $FILE . 2> $ERRMSGS | dots

# Log any error messages produced:
if [ -s "$ERRMSGS" ]
then logger -p user.error -t $PROG "$(cat $ERRMSGS)"
else logger -t $PROG "Completed full backup of /etc"

exit 0

[Aug 28, 2017] Rsync over ssh with root access on both sides

Aug 28, 2017 |

I have one older ubuntu server, and one newer debian server and I am migrating data from the old one to the new one. I want to use rsync to transfer data across to make final migration easier and quicker than the equivalent tar/scp/untar process.

As an example, I want to sync the home folders one at a time to the new server. This requires root access at both ends as not all files at the source side are world readable and the destination has to be written with correct permissions into /home. I can't figure out how to give rsync root access on both sides.

I've seen a few related questions, but none quite match what I'm trying to do.

I have sudo set up and working on both servers. ubuntu ssh debian rsync root

share improve this question asked Apr 28 '10 at 9:18 Tim Abell 732 20
add a comment | 3 Answers active oldest votes
up vote down vote accepted Actually you do NOT need to allow root authentication via SSH to run rsync as Antoine suggests. The transport and system authentication can be done entirely over user accounts as long as you can run rsync with sudo on both ends for reading and writing the files.

As a user on your destination server you can suck the data from your source server like this:

sudo rsync -aPe ssh --rsync-path='sudo rsync' boron:/home/fred /home/

The user you run as on both servers will need passwordless* sudo access to the rsync binary, but you do NOT need to enable ssh login as root anywhere. If the user you are using doesn't match on the other end, you can add user@boron: to specify a different remote user.

Good luck.

*or you will need to have entered the password manually inside the timeout window.

share improve this answer edited Jun 30 '10 at 13:51 answered Apr 28 '10 at 22:06 Caleb 9,089 27 43
Although this is an old question I'd like to add word of CAUTION to this accepted answer. From my understanding allowing passwordless "sudo rsync" is equivalent to open the root account to remote login. This is because with this it is very easy to gain full root access, e.g. because all system files can be downloaded, modified and replaced without a password. – Ascurion Jan 8 '16 at 16:30
add a comment |
up vote down vote If your data is not highly sensitive, you could use tar and socat. In my experience this is often faster as rsync over ssh.

You need socat or netcat on both sides.

On the target host, go to the directory where you would like to put your data, after that run: socat TCP-LISTEN:4444 - | tar xzf -

If the target host is listening, start it on the source like: tar czf - /home/fred /home/ | socat - TCP:ip-of-remote-server:4444

For this setup you'll need a reliably connection between the 2 servers.

share improve this answer answered Apr 28 '10 at 21:20 Jeroen Moors
Good point. In a trusted environment, you'll pick up a lot of speed by not encrypting. It might not matter on small files, but with GBs of data it will. – pboin May 18 '10 at 10:53
add a comment |
up vote down vote Ok, i've pieced together all the clues to get something that works for me.

Lets call the servers "src" & "dst".

Set up a key pair for root on the destination server, and copy the public key to the source server:

dest $ sudo -i
dest # ssh-keygen
dest # exit
dest $ scp /root/ src:

Add the public key to root's authorized keys on the source server

src $ sudo -i
src # cp /home/tim/ .ssh/authorized_keys

Back on the destination server, pull the data across with rsync:

dest $ sudo -i
dest # rsync -aP src:/home/fred /home/

[Aug 28, 2017] Unix Rsync Copy Hidden Dot Files and Directories Only by Vivek Gite

Feb 06, 2014 |
November 9, 2012 February 6, 2014 in Categories Commands , File system , Linux , UNIX last updated February 6, 2014

How do I use the rsync tool to copy only the hidden files and directory (such as ~/.ssh/, ~/.foo, and so on) from /home/jobs directory to the /mnt/usb directory under Unix like operating system?

The rsync program is used for synchronizing files over a network or local disks. To view or display only hidden files with ls command:

ls -ld ~/.??*


ls -ld ~/.[^.]*

Sample outputs:

ls command: List only hidden files in Unix / Linux terminal

Fig:01 ls command to view only hidden files

rsync not synchronizing all hidden .dot files?

In this example, you used the pattern .[^.]* or .??* to select and display only hidden files using ls command . You can use the same pattern with any Unix command including rsync command. The syntax is as follows to copy hidden files with rsync:

rsync -av /path/to/dir/.??* /path/to/dest
rsync -avzP /path/to/dir/.??* /mnt/usb
rsync -avzP $HOME/.??*
rsync -avzP ~/.[^.]*

rsync -av /path/to/dir/.??* /path/to/dest rsync -avzP /path/to/dir/.??* /mnt/usb rsync -avzP $HOME/.??* rsync -avzP ~/.[^.]*

In this example, copy all hidden files from my home directory to /mnt/test:

rsync -avzP ~/.[^.]* /mnt/test

rsync -avzP ~/.[^.]* /mnt/test

Sample outputs:

Rsync example to copy only hidden files

Fig.02 Rsync example to copy only hidden files

Vivek Gite is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector. Follow him on Twitter , Facebook , Google+ .

[Aug 28, 2017] rsync doesn't copy files with restrictive permissions

Aug 28, 2017 |
up vote down vote favorite Trying to copy files with rsync, it complains:
rsync: send_files failed to open "VirtualBox/Machines/Lubuntu/Lubuntu.vdi" \
(in media): Permission denied (13)

That file is not copied. Indeed the file permissions of that file are very restrictive on the server side:

-rw-------    1 1000     1000     3133181952 Nov  1  2011 Lubuntu.vdi

I call rsync with

sudo rsync -av --fake-super root@sheldon::media /mnt/media

The rsync daemon runs as root on the server. root can copy that file (of course). rsyncd has "fake super = yes" set in /etc/rsyncd.conf.

What can I do so that the file is copied without changing the permissions of the file on the server? rsync file-permissions

share improve this question asked Dec 29 '12 at 10:15 Torsten Bronger 207
If you use RSync as daemon on destination, please post grep rsync /var/log/daemon to improve your question – F. Hauri Dec 29 '12 at 13:23
add a comment |
1 Answer active oldest votes
up vote down vote As you appear to have root access to both servers have you tried a: --force ?

Alternatively you could bypass the rsync daemon and try a direct sync e.g.

rsync -optg --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose --recursive --delete-after --force  root@sheldon::media /mnt/media
share improve this answer edited Jan 2 '13 at 10:55 answered Dec 29 '12 at 13:21 arober11 376
Using ssh means encryption, which makes things slower. --force does only affect directories, if I read the man page correctly. – Torsten Bronger Jan 1 '13 at 23:08
Unless your using ancient kit, the CPU overhead of encrypting / decrypting the traffic shouldn't be noticeable, but you will loose 10-20% of your bandwidth, through the encapsulation process. Then again 80% of a working link is better than 100% of a non working one :) – arober11 Jan 2 '13 at 10:52
do have an "ancient kit". ;-) (Slow ARM CPU on a NAS.) But I now mount the NAS with NFS and use rsync (with "sudo") locally. This solves the problem (and is even faster). However, I still think that my original problem must be solvable using the rsync protocol (remote, no ssh). – Torsten Bronger Jan 4 '13 at 7:55

[Aug 28, 2017] Using rsync under target user to copy home directories

Aug 28, 2017 |

up vote down vote favorite

nixnotwin , asked Sep 21 '12 at 5:11

On my Ubuntu server there are about 150 shell accounts. All usernames begin with the prefix u12.. I have root access and I am trying to copy a directory named "somefiles" to all the home directories. After copying the directory the user and group ownership of the directory should be changed to user's. Username, group and home-dir name are same. How can this be done?

Gilles , answered Sep 21 '12 at 23:44

Do the copying as the target user. This will automatically make the target files. Make sure that the original files are world-readable (or at least readable by all the target users). Run chmod afterwards if you don't want the copied files to be world-readable.
getent passwd |
awk -F : '$1 ~ /^u12/ {print $1}' |
while IFS= read -r user; do
  su "$user" -c 'cp -Rp /original/location/somefiles ~/'

[Aug 28, 2017] rsync over SSH preserve ownership only for www-data owned files

Aug 28, 2017 |
up vote 10 down vote favorite 4

jeffery_the_wind , asked Mar 6 '12 at 15:36

I am using rsync to replicate a web folder structure from a local server to a remote server. Both servers are ubuntu linux. I use the following command, and it works well:
rsync -az /var/www/ user@

The usernames for the local system and the remote system are different. From what I have read it may not be possible to preserve all file and folder owners and groups. That is OK, but I would like to preserve owners and groups just for the www-data user, which does exist on both servers.

Is this possible? If so, how would I go about doing that?


** EDIT **

There is some mention of rsync being able to preserve ownership and groups on remote file syncs here:

** EDIT 2 **

I ended up getting the desired affect thanks to many of the helpful comments and answers here. Assuming the IP of the source machine is and the IP of the destination machine is I can use this line from the destination machine:

sudo rsync -az user@ /var/www/

This preserves the ownership and groups of the files that have a common user name, like www-data. Note that using rsync without sudo does not preserve these permissions.

ghoti , answered Mar 6 '12 at 19:01

You can also sudo the rsync on the target host by using the --rsync-path option:
# rsync -av --rsync-path="sudo rsync" /path/to/files user@targethost:/path

This lets you authenticate as user on targethost, but still get privileged write permission through sudo . You'll have to modify your sudoers file on the target host to avoid sudo's request for your password. man sudoers or run sudo visudo for instructions and samples.

You mention that you'd like to retain the ownership of files owned by www-data, but not other files. If this is really true, then you may be out of luck unless you implement chown or a second run of rsync to update permissions. There is no way to tell rsync to preserve ownership for just one user .

That said, you should read about rsync's --files-from option.

rsync -av /path/to/files user@targethost:/path
find /path/to/files -user www-data -print | \
  rsync -av --files-from=- --rsync-path="sudo rsync" /path/to/files user@targethost:/path

I haven't tested this, so I'm not sure exactly how piping find's output into --files-from=- will work. You'll undoubtedly need to experiment.

xato , answered Mar 6 '12 at 15:39

As far as I know, you cannot chown files to somebody else than you, if you are not root. So you would have to rsync using the www-data account, as all files will be created with the specified user as owner. So you need to chown the files afterwards.

user2485267 , answered Jun 14 '13 at 8:22

I had a similar problem and cheated the rsync command,

rsync -avz --delete root@x.x.x.x:/home//domains/site/public_html/ /home/domains2/public_html && chown -R wwwusr:wwwgrp /home/domains2/public_html/

the && runs the chown against the folder when the rsync completes successfully (1x '&' would run the chown regardless of the rsync completion status)

Graham , answered Mar 6 '12 at 15:51

The root users for the local system and the remote system are different.

What does this mean? The root user is uid 0. How are they different?

Any user with read permission to the directories you want to copy can determine what usernames own what files. Only root can change the ownership of files being written .

You're currently running the command on the source machine, which restricts your writes to the permissions associated with user@ Instead, you can try to run the command as root on the target machine. Your read access on the source machine isn't an issue.

So on the target machine (, assuming the source is

# rsync -az user@ /var/www/

Make sure your groups match on both machines.

Also, set up access to user@ using a DSA or RSA key, so that you can avoid having passwords floating around. For example, as root on your target machine, run:

# ssh-keygen -d

Then take the contents of the file /root/.ssh/ and add it to ~user/.ssh/authorized_keys on the source machine. You can ssh user@ as root from the target machine to see if it works. If you get a password prompt, check your error log to see why the key isn't working.

ghoti , answered Mar 6 '12 at 18:54

Well, you could skip the challenges of rsync altogether, and just do this through a tar tunnel.
sudo tar zcf - /path/to/files | \
  ssh user@remotehost "cd /some/path; sudo tar zxf -"

You'll need to set up your SSH keys as Graham described.

Note that this handles full directory copies, not incremental updates like rsync.

The idea here is that:

[Aug 28, 2017] rsync and file permissions

Aug 28, 2017 |
up vote down vote favorite I'm trying to use rsync to copy a set of files from one system to another. I'm running the command as a normal user (not root). On the remote system, the files are owned by apache and when copied they are obviously owned by the local account (fred).

My problem is that every time I run the rsync command, all files are re-synched even though they haven't changed. I think the issue is that rsync sees the file owners are different and my local user doesn't have the ability to change ownership to apache, but I'm not including the -a or -o options so I thought this would not be checked. If I run the command as root, the files come over owned by apache and do not come a second time if I run the command again. However I can't run this as root for other reasons. Here is the command:

/usr/bin/rsync --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose /local/dir
unix rsync
share improve this question edited May 2 '11 at 23:53 Gareth 13.9k 11 44 58 asked May 2 '11 at 23:43 Fred Snertz 11
Why can't you run rsync as root? On the remote system, does fred have read access to the apache-owned files? – chrishiestand May 3 '11 at 0:32
Ah, I left out the fact that there are ssh keys set up so that local fred can become remote root, so yes fred/root can read them. I know this is a bit convoluted but its real. – Fred Snertz May 3 '11 at 14:50
Always be careful when root can ssh into the machine. But if you have password and challenge response authentication disabled it's not as bad. – chrishiestand May 3 '11 at 17:32
add a comment |
1 Answer active oldest votes
up vote down vote Here's the answer to your problem:
-c, --checksum
      This changes the way rsync checks if the files have been changed and are in need of a  transfer.   Without  this  option,
      rsync  uses  a "quick check" that (by default) checks if each file's size and time of last modification match between the
      sender and receiver.  This option changes this to compare a 128-bit checksum for each file  that  has  a  matching  size.
      Generating  the  checksums  means  that both sides will expend a lot of disk I/O reading all the data in the files in the
      transfer (and this is prior to any reading that will be done to transfer changed files), so this  can  slow  things  down

      The  sending  side  generates  its checksums while it is doing the file-system scan that builds the list of the available
      files.  The receiver generates its checksums when it is scanning for changed files, and will checksum any file  that  has
      the  same  size  as the corresponding sender's file:  files with either a changed size or a changed checksum are selected
      for transfer.

      Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by  checking
      a  whole-file  checksum  that is generated as the file is transferred, but that automatic after-the-transfer verification
      has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check.

      For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5.  For older protocols, the checksum  used
      is MD4.

So run:

/usr/bin/rsync -c --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose /local/dir

Note there may be a time+disk churn tradeoff by using this option. Personally, I'd probably just sync the file's mtimes too:

/usr/bin/rsync -t --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose /local/dir
share improve this answer edited May 3 '11 at 17:55 answered May 3 '11 at 17:48 chrishiestand 1,098 10
Awesome. Thank you. Looks like the second option is going to work for me and I found the first very interesting. – Fred Snertz May 3 '11 at 18:40
psst, hit the green checkbox to give my answer credit ;-) Thx. – chrishiestand May 12 '11 at 1:56

[Aug 28, 2017] Why does rsync fail to copy files from /sys in Linux?

Notable quotes:
"... pseudo file system ..."
"... pseudo filesystems ..."
Aug 28, 2017 |

up vote 11 down vote favorite 1

Eugene Yarmash , asked Apr 24 '13 at 16:35

I have a bash script which uses rsync to backup files in Archlinux. I noticed that rsync failed to copy a file from /sys , while cp worked just fine:
# rsync /sys/class/net/enp3s1/address /tmp    
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
ERROR: address failed verification -- update discarded.
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]

# cp  /sys/class/net/enp3s1/address /tmp   ## this works

I wonder why does rsync fail, and is it possible to copy the file with it?

mattdm , answered Apr 24 '13 at 18:20

Rsync has code which specifically checks if a file is truncated during read and gives this error ! ENODATA . I don't know why the files in /sys have this behavior, but since they're not real files, I guess it's not too surprising. There doesn't seem to be a way to tell rsync to skip this particular check.

I think you're probably better off not rsyncing /sys and using specific scripts to cherry-pick out the particular information you want (like the network card address).

Runium , answered Apr 25 '13 at 0:23

First off /sys is a pseudo file system . If you look at /proc/filesystems you will find a list of registered file systems where quite a few has nodev in front. This indicates they are pseudo filesystems . This means they exists on a running kernel as a RAM-based filesystem. Further they do not require a block device.
$ cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   bdev

At boot the kernel mount this system and updates entries when suited. E.g. when new hardware is found during boot or by udev .

In /etc/mtab you typically find the mount by:

sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0

For a nice paper on the subject read Patric Mochel's – The sysfs Filesystem .

stat of /sys files

If you go into a directory under /sys and do a ls -l you will notice that all files has one size. Typically 4096 bytes. This is reported by sysfs .

:/sys/devices/pci0000:00/0000:00:19.0/net/eth2$ ls -l
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_assign_type
-r--r--r-- 1 root root 4096 Apr 24 20:09 address
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_len

Further you can do a stat on a file and notice another distinct feature; it occupies 0 blocks. Also inode of root (stat /sys) is 1. /stat/fs typically has inode 2. etc.

rsync vs. cp

The easiest explanation for rsync failure of synchronizing pseudo files is perhaps by example.

Say we have a file named address that is 18 bytes. An ls or stat of the file reports 4096 bytes.

  1. Opens file descriptor, fd.
  2. Uses fstat(fd) to get information such as size.
  3. Set out to read size bytes, i.e. 4096. That would be line 253 of the code linked by @mattdm . read_size == 4096
    1. Ask; read: 4096 bytes.
    2. A short string is read i.e. 18 bytes. nread == 18
    3. read_size = read_size - nread (4096 - 18 = 4078)
    4. Ask; read: 4078 bytes
    5. 0 bytes read (as first read consumed all bytes in file).
    6. nread == 0 , line 255
    7. Unable to read 4096 bytes. Zero out buffer.
    8. Set error ENODATA .
    9. Return.
  4. Report error.
  5. Retry. (Above loop).
  6. Fail.
  7. Report error.
  8. FINE.

During this process it actually reads the entire file. But with no size available it cannot validate the result – thus failure is only option.

  1. Opens file descriptor, fd.
  2. Uses fstat(fd) to get information such as st_size (also uses lstat and stat).
  3. Check if file is likely to be sparse. That is the file has holes etc.
    /* Use a heuristic to determine whether SRC_NAME contains any sparse
     * blocks.  If the file has fewer blocks than would normally be
     * needed for a file of its size, then at least one of the blocks in
     * the file is a hole.  */
    sparse_src = is_probably_sparse (&src_open_sb);

    As stat reports file to have zero blocks it is categorized as sparse.

  4. Tries to read file by extent-copy (a more efficient way to copy normal sparse files), and fails.
  5. Copy by sparse-copy.
    1. Starts out with max read size of MAXINT.
      Typically 18446744073709551615 bytes on a 32 bit system.
    2. Ask; read 4096 bytes. (Buffer size allocated in memory from stat information.)
    3. A short string is read i.e. 18 bytes.
    4. Check if a hole is needed, nope.
    5. Write buffer to target.
    6. Subtract 18 from max read size.
    7. Ask; read 4096 bytes.
    8. 0 bytes as all got consumed in first read.
    9. Return success.
  6. All OK. Update flags for file.
  7. FINE.


Might be related, but extended attribute calls will fail on sysfs:

[root@hypervisor eth0]# lsattr address

lsattr: Inappropriate ioctl for device While reading flags on address

[root@hypervisor eth0]#

Looking at my strace it looks like rsync tries to pull in extended attributes by default:

22964 <... getxattr resumed> , 0x7fff42845110, 132) = -1 ENODATA (No data available)

I tried finding a flag to give rsync to see if skipping extended attributes resolves the issue but wasn't able to find anything ( --xattrs turns them on at the destination).

[Aug 28, 2017] Rsync doesn't copy everyting s

Aug 28, 2017 |

View Full Version : [ubuntu] Rsync doesn't copy everyting

Scormen May 31st, 2009, 10:09 AM Hi all,

I'm having some trouble with rsync. I'm trying to sync my local /etc directory to a remote server, but this won't work.

The problem is that it seems he doesn't copy all the files.
The local /etc dir contains 15MB of data, after a rsync, the remote backup contains only 4.6MB of data.

Rsync is running by root. I'm using this command:

rsync --rsync-path="sudo rsync" -e "ssh -i /root/.ssh/backup" -avz --delete --delete-excluded -h --stats /etc kris@

I hope someone can help.


Scormen May 31st, 2009, 11:05 AM I found that if I do a local sync, everything goes fine.
But if I do a remote sync, it copies only 4.6MB.

Any idea?

LoneWolfJack May 31st, 2009, 05:14 PM never used rsync on a remote machine, but "sudo rsync" looks wrong. you probably can't call sudo like that so the ssh connection needs to have the proper privileges for executing rsync.

just an educated guess, though.

Scormen May 31st, 2009, 05:24 PM Thanks for your answer.

In /etc/sudoers I have added next line, so "sudo rsync" will work.

kris ALL=NOPASSWD: /usr/bin/rsync

I also tried without --rsync-path="sudo rsync", but without success.

I have also tried on the server to pull the files from the laptop, but that doesn't work either.

LoneWolfJack May 31st, 2009, 05:30 PM in the rsync help file it says that --rsync-path is for the path to rsync on the remote machine, so my guess is that you can't use sudo there as it will be interpreted as a path.

so you will have to do --rsync-path="/path/to/rsync" and make sure the ssh login has root privileges if you need them to access the files you want to sync.

--rsync-path="sudo rsync" probably fails because
a) sudo is interpreted as a path
b) the space isn't escaped
c) sudo probably won't allow itself to be called remotely

again, this is not more than an educated guess.

Scormen May 31st, 2009, 05:45 PM I understand what you mean, so I tried also:

rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc kris@

Then I get this error:

sending incremental file list
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/pap": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/provider": Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.crt" -> "/etc/ssl/certs/ssl-cert-snakeoil.pem" failed: Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.key" -> "/etc/ssl/private/ssl-cert-snakeoil.key" failed: Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ppp/peers/provider": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ssl/private/ssl-cert-snakeoil.key": Permission denied (13)

sent 86.85K bytes received 306 bytes 174.31K bytes/sec
total size is 8.71M speedup is 99.97
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1058) [sender=3.0.5]

And the same command with "root" instead of "kris".
Then, I get no errors, but I still don't have all the files synced.

Scormen June 1st, 2009, 09:00 AM Sorry for this bump.
I'm still having the same problem.

Any idea?


binary10 June 1st, 2009, 10:36 AM I understand what you mean, so I tried also:

rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc kris@

Then I get this error:

And the same command with "root" instead of "kris".
Then, I get no errors, but I still don't have all the files synced.

Maybe there's a nicer way but you could place /usr/bin/rsync into a private protected area and set the owner to root place the sticky bit on it and change your rsync-path argument such like:

# on the remote side, aka kris@
mkdir priv-area
# protect it from normal users running a priv version of rsync
chmod 700 priv-area
cd priv-area
cp -p /usr/local/bin/rsync ./rsync-priv
sudo chown 0:0 ./rsync-priv
sudo chmod +s ./rsync-priv
ls -ltra # rsync-priv should now be 'bold-red' in bash

Looking at your flags, you've specified a cvs ignore factor, ignore files that are updated on the target, and you're specifying a backup of removed files.

rsync -Cavuhzb --rsync-path="/home/kris/priv-area/rsync-priv" -e "ssh -i /root/.ssh/backup" /etc kris@

From those qualifiers you're not going to be getting everything sync'd. It's doing what you're telling it to do.

If you really wanted to perform a like for like backup.. (not keeping stuff that's been changed/deleted from the source. I'd go for something like the following.

rsync --archive --delete --hard-links --one-file-system --acls --xattrs --dry-run -i --rsync-path="/home/kris/priv-area/rsync-priv" --rsh="ssh -i /root/.ssh/backup" /etc/ kris@

Remove the --dry-run and -i when you're happy with the output, and it should do what you want. A word of warning, I get a bit nervous when not seeing trailing (/) on directories as it could lead to all sorts of funnies if you end up using rsync on softlinks.

Scormen June 1st, 2009, 12:19 PM Thanks for your help, binary10.

I've tried what you have said, but still, I only receive 4.6MB on the remote server.
Thanks for the warning, I'll not that!

Did someone already tried to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...


binary10 June 1st, 2009, 01:22 PM Thanks for your help, binary10.

I've tried what you have said, but still, I only receive 4.6MB on the remote server.
Thanks for the warning, I'll not that!

Did someone already tried to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...


Ok so I've gone back and looked at your original post, how are you calculating 15MB of data under etc - via a du -hsx /etc/ ??

I do daily drive to drive backup copies via rsync and drive to network copies.. and have used them recently for restoring.

Sure my du -hsx /etc/ reports 17MB of data of which 10MB gets transferred via an rsync. My backup drives still operate.

rsync 3.0.6 has some fixes to do with ACLs and special devices rsyncing between solaris. but I think 3.0.5 is still ok with ubuntu to ubuntu systems.

Here is my test doing exactly what you you're probably trying to do. I even check the remote end..

binary10@jsecx25:~/bin-priv$ ./rsync --archive --delete --hard-links --one-file-system --stats --acls --xattrs --human-readable --rsync-path="~/bin/rsync-priv-os-specific" --rsh="ssh" /etc/ rsyncbck@

Number of files: 3121
Number of files transferred: 1812
Total file size: 10.04M bytes
Total transferred file size: 10.00M bytes
Literal data: 10.00M bytes
Matched data: 0 bytes
File list size: 109.26K
File list generation time: 0.002 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 10.20M
Total bytes received: 38.70K

sent 10.20M bytes received 38.70K bytes 4.09M bytes/sec
total size is 10.04M speedup is 0.98

binary10@jsecx25:~/bin-priv$ sudo du -hsx /etc/
17M /etc/

And then on the remote system I do the du -hsx

binary10@lenovo-n200:/home/kris/backup/laptopkris/etc$ cd ..
binary10@lenovo-n200:/home/kris/backup/laptopkris$ sudo du -hsx etc
17M etc

Scormen June 1st, 2009, 01:35 PM ow are you calculating 15MB of data under etc - via a du -hsx /etc/ ??
Indeed, on my laptop I see:

root@laptopkris:/home/kris# du -sh /etc/
15M /etc/

If I do the same thing after a fresh sync to the server, I see:

root@server:/home/kris# du -sh /home/kris/backup/laptopkris/etc/
4.6M /home/kris/backup/laptopkris/etc/

On both sides, I have installed Ubuntu 9.04, with version 3.0.5 of rsync.
So strange...

binary10 June 1st, 2009, 01:45 PM it does seem a bit odd.

I'd start doing a few diffs from the outputs find etc/ -printf "%f %s %p %Y\n" | sort

And see what type of files are missing.

- edit - Added the %Y file type.

Scormen June 1st, 2009, 01:58 PM Hmm, it's going stranger.
Now I see that I have all my files on the server, but they don't have their full size (bytes).

I have uploaded the files, so you can look into them.


binary10 June 1st, 2009, 02:16 PM If you look at the files that are different aka the ssl's they are links to local files else where aka linked to /usr and not within /etc/

aka they are different on your laptop and the server

Scormen June 1st, 2009, 02:25 PM I understand that soft links are just copied, and not the "full file".

But, you have run the same command to test, a few posts ago.
How is it possible that you can see the full 15MB?

binary10 June 1st, 2009, 02:34 PM I was starting to think that this was a bug with du.

The de-referencing is a bit topsy.

If you rsync copy the remote backup back to a new location back onto the laptop and do the du command. I wonder if you'll end up with 15MB again.

Scormen June 1st, 2009, 03:20 PM Good tip.

On the server side, the backup of the /etc was still 4.6MB.
I have rsynced it back to the laptop, to a new directory.

If I go on the laptop to that new directory and do a du, it says 15MB.

binary10 June 1st, 2009, 03:34 PM Good tip.

On the server side, the backup of the /etc was still 4.6MB.
I have rsynced it back to the laptop, to a new directory.

If I go on the laptop to that new directory and do a du, it says 15MB.

I think you've now confirmed that RSYNC DOES copy everything.. just tht du confusing what you had expected by counting the end link sizes.

It might also think about what you're copying, maybe you need more than just /etc of course it depends on what you are trying to do with the backup :)


Scormen June 1st, 2009, 03:37 PM Yeah, it seems to work well.
So, the "problem" where just the soft links, that couldn't be counted on the server side?
binary10 June 1st, 2009, 04:23 PM Yeah, it seems to work well.
So, the "problem" where just the soft links, that couldn't be counted on the server side?

The links were copied as links as per the design of the --archive in rsync.

The contents of the pointing links were different between your two systems. These being that that reside outside of /etc/ in /usr And so DU reporting them differently.

Scormen June 1st, 2009, 05:36 PM Okay, I got it.
Many thanks for the support, binarty10!
Scormen June 1st, 2009, 05:59 PM Just to know, is it possible to copy the data from these links as real, hard data?
binary10 June 2nd, 2009, 09:54 AM Just to know, is it possible to copy the data from these links as real, hard data?

Yep absolutely

You should then look at other possibilities of:

-L, --copy-links transform symlink into referent file/dir
--copy-unsafe-links only "unsafe" symlinks are transformed
--safe-links ignore symlinks that point outside the source tree
-k, --copy-dirlinks transform symlink to a dir into referent dir
-K, --keep-dirlinks treat symlinked dir on receiver as dir

but then you'll have to start questioning why you are backing them up like that especially stuff under /etc/. If you ever wanted to restore it you'd be restoring full files and not symlinks the restore result could be a nightmare as well as create future issues (upgrades etc) let alone your backup will be significantly larger, could be 150MB instead of 4MB.

Scormen June 2nd, 2009, 10:04 AM Okay, now I'm sure what its doing :)
Is it also possible to show on a system the "real disk usage" of e.g. that /etc directory? So, without the links, that we get a output of 4.6MB.

Thank you very much for your help!

binary10 June 2nd, 2009, 10:22 AM What does the following respond with.

sudo du --apparent-size -hsx /etc

If you want the real answer then your result from a dry-run rsync will only be enough for you.

sudo rsync --dry-run --stats -h --archive /etc/ /tmp/etc/

[Jul 20, 2017] These Guys Didnt Back Up Their Files, Now Look What Happened

Notable quotes:
"... Unfortunately, even today, people have not learned that lesson. Whether it's at work, at home, or talking with friends, I keep hearing stories of people losing hundreds to thousands of files, sometimes they lose data worth actual dollars in time and resources that were used to develop the information. ..."
"... "I lost all my files from my hard drive? help please? I did a project that took me 3 days and now i lost it, its powerpoint presentation, where can i look for it? its not there where i save it, thank you" ..."
"... Please someone help me I last week brought a Toshiba Satellite laptop running windows 7, to replace my blue screening Dell vista laptop. On plugged in my sumo external hard drive to copy over some much treasured photos and some of my (work – music/writing.) it said installing driver. it said completed I clicked on the hard drive and found a copy of my documents from the new laptop and nothing else. ..."
Jul 20, 2017 |
Back in college, I used to work just about every day as a computer cluster consultant. I remember a month after getting promoted to a supervisor, I was in the process of training a new consultant in the library computer cluster. Suddenly, someone tapped me on the shoulder, and when I turned around I was confronted with a frantic graduate student – a 30-something year old man who I believe was Eastern European based on his accent – who was nearly in tears.

"Please need help – my document is all gone and disk stuck!" he said as he frantically pointed to his PC.

Now, right off the bat I could have told you three facts about the guy. One glance at the blue screen of the archaic DOS-based version of Wordperfect told me that – like most of the other graduate students at the time – he had not yet decided to upgrade to the newer, point-and-click style word processing software. For some reason, graduate students had become so accustomed to all of the keyboard hot-keys associated with typing in a DOS-like environment that they all refused to evolve into point-and-click users.

The second fact, gathered from a quick glance at his blank document screen and the sweat on his brow told me that he had not saved his document as he worked. The last fact, based on his thick accent, was that communicating the gravity of his situation wouldn't be easy. In fact, it was made even worse by his answer to my question when I asked him when he last saved.

"I wrote 30 pages."

Calculated out at about 600 words a page, that's 18000 words. Ouch.

Then he pointed at the disk drive. The floppy disk was stuck, and from the marks on the drive he had clearly tried to get it out with something like a paper clip. By the time I had carefully fished the torn and destroyed disk out of the drive, it was clear he'd never recover anything off of it. I asked him what was on it.

"My thesis."

I gulped. I asked him if he was serious. He was. I asked him if he'd made any backups. He hadn't.

Making Backups of Backups

If there is anything I learned during those early years of working with computers (and the people that use them), it was how critical it is to not only save important stuff, but also to save it in different places. I would back up floppy drives to those cool new zip drives as well as the local PC hard drive. Never, ever had a single copy of anything.

Unfortunately, even today, people have not learned that lesson. Whether it's at work, at home, or talking with friends, I keep hearing stories of people losing hundreds to thousands of files, sometimes they lose data worth actual dollars in time and resources that were used to develop the information.

To drive that lesson home, I wanted to share a collection of stories that I found around the Internet about some recent cases were people suffered that horrible fate – from thousands of files to entire drives worth of data completely lost. These are people where the only remaining option is to start running recovery software and praying, or in other cases paying thousands of dollars to a data recovery firm and hoping there's something to find.

Not Backing Up Projects

The first example comes from Yahoo Answers , where a user that only provided a "?" for a user name (out of embarrassment probably), posted:

"I lost all my files from my hard drive? help please? I did a project that took me 3 days and now i lost it, its powerpoint presentation, where can i look for it? its not there where i save it, thank you"

The folks answering immediately dove into suggesting that the person run recovery software, and one person suggested that the person run a search on the computer for *.ppt.

... ... ...

Doing Backups Wrong

Then, there's a scenario of actually trying to do a backup and doing it wrong, losing all of the files on the original drive. That was the case for the person who posted on Tech Support Forum , that after purchasing a brand new Toshiba Laptop and attempting to transfer old files from an external hard drive, inadvertently wiped the files on the hard drive.

Please someone help me I last week brought a Toshiba Satellite laptop running windows 7, to replace my blue screening Dell vista laptop. On plugged in my sumo external hard drive to copy over some much treasured photos and some of my (work – music/writing.) it said installing driver. it said completed I clicked on the hard drive and found a copy of my documents from the new laptop and nothing else.

While the description of the problem is a little broken, from the sound of it, the person thought they were backing up from one direction, while they were actually backing up in the other direction. At least in this case not all of the original files were deleted, but a majority were.

[Jul 18, 2017] Can I copy my Ubuntu OS off my hard drive to a USB stick and boot from that stick with all my programs

Yes, this is completely possible. First and foremost, you will need at least 2 USB ports available, or 1 USB port and 1 CD-Drive.

You start by booting into a Live-CD version of Ubuntu with your hard-drive where it is and the target device plugged into USB. Mount your internal drive and target USB to any paths you like.

Open up a terminal and enter the following commands:

tar cp --xattrs /path/to/internal | tar x /path/to/target/usb

You can also look into doing this through a live installation and a utility called CloneZilla, but I am unsure of exactly how to use CloneZilla. The above method is what I used to copy my 128GB hard-drive's installation of Ubuntu to a 64GB flash drive.

2) Clone again the internal or external drive in its entirety to another drive:

Use the "Clonezilla" utility, mentioned in the very last paragraph of my original answer, to clone the original internal drive to another external drive to make two such external bootable drives to keep track of. v>

[Feb 20, 2017] Using rsync to back up your Linux system

Feb 20, 2017 |
Another interesting option, and my personal favorite because it increases the power and flexibility of rsync immensely, is the --link-dest option. The --link-dest option allows a series of daily backups that take up very little additional space for each day and also take very little time to create.

Specify the previous day's target directory with this option and a new directory for today. rsync then creates today's new directory and a hard link for each file in yesterday's directory is created in today's directory. So we now have a bunch of hard links to yesterday's files in today's directory. No new files have been created or duplicated. Just a bunch of hard links have been created. Wikipedia has a very good description of hard links . After creating the target directory for today with this set of hard links to yesterday's target directory, rsync performs its sync as usual, but when a change is detected in a file, the target hard link is replaced by a copy of the file from yesterday and the changes to the file are then copied from the source to the target.

So now our command looks like the following.

rsync -aH --delete --link-dest=yesterdaystargetdir sourcedir todaystargetdir

There are also times when it is desirable to exclude certain directories or files from being synchronized. For this, there is the --exclude option. Use this option and the pattern for the files or directories you want to exclude. You might want to exclude browser cache files so your new command will look like this.

rsync -aH --delete --exclude Cache --link-dest=yesterdaystargetdir sourcedir todaystargetdir

Note that each file pattern you want to exclude must have a separate exclude option.

rsync can sync files with remote hosts as either the source or the target. For the next example, let's assume that the source directory is on a remote computer with the hostname remote1 and the target directory is on the local host. Even though SSH is the default communications protocol used when transferring data to or from a remote host, I always add the ssh option. The command now looks like this.

rsync -aH -e ssh --delete --exclude Cache --link-dest=yesterdaystargetdir remote1:sourcedir todaystargetdir

This is the final form of my rsync backup command.

rsync has a very large number of options that you can use to customize the synchronization process. For the most part, the relatively simple commands that I have described here are perfect for making backups for my personal needs. Be sure to read the extensive man page for rsync to learn about more of its capabilities as well as the options discussed here.

[Feb 12, 2017] Easy Automated Snapshot-Style Backups with Linux and Rsync

Notable quotes:
"... illusion ..."
"... only one extra, slightly-larger, hard disk ..."
"... hard link ..."
"... what appears to be ..."
"... Putting it all together ..."
"... If you are rsync'ing from a SAMBA share, you must add --modify-window=10 ..."
Feb 12, 2017 |

page last modified 2004.01.04

Updates: As of rsync-2.5.6 , the --link-dest option is now standard! That can be used instead of the separate cp -al and rsync stages, and it eliminates the ownerships/permissions bug. I now recommend using it. Also, I'm proud to report this article is mentioned in Linux Server Hacks , a new (and very good, in my opinion) O'Reilly book by compiled by Rob Flickenger.

  1. Abstract
  2. Motivation
  3. Using rsync to make a backup
    1. Basics
    2. Using the --delete flag
    3. Be lazy: use cron
  4. Incremental backups with rsync
    1. Review of hard links
    2. Using cp -al
    3. Putting it all together
    4. I'm used to dump or tar ! This seems backward!
  5. Isolating the backup from the rest of the system
    1. The easy (bad) way
    2. Keep it on a separate partition
    3. Keep that partition on a separate disk
    4. Keep that disk on a separate machine
  6. Making the backup as read-only as possible
    1. Bad: mount / unmount
    2. Better: mount read-only most of the time
    3. Tempting but it doesn't seem to work: the 2.4 kernel's mount --bind
    4. My solution: using NFS on localhost
  7. Extensions: hourly, daily, and weekly snapshots
    1. Keep an extra script for each level
    2. Run it all with cron
  8. Known bugs and problems
    1. Maintaining Permissions and Owners in the snapshots
    2. mv updates timestamp bug
    3. Windows-related problems
  9. Appendix: my actual configuration
    1. Listing one:
    2. Listing two:
    3. Sample output of ls -l /snapshot/home
  10. Contributed codes
  11. References
  12. Frequently Asked Questions

This document describes a method for generating automatic rotating "snapshot"-style backups on a Unix-based system, with specific examples drawn from the author's GNU/Linux experience. Snapshot backups are a feature of some high-end industrial file servers; they create the illusion of multiple, full backups per day without the space or processing overhead. All of the snapshots are read-only, and are accessible directly by users as special system directories. It is often possible to store several hours, days, and even weeks' worth of snapshots with slightly more than 2x storage. This method, while not as space-efficient as some of the proprietary technologies (which, using special copy-on-write filesystems, can operate on slightly more than 1x storage), makes use of only standard file utilities and the common rsync program, which is installed by default on most Linux distributions. Properly configured, the method can also protect against hard disk failure, root compromises, or even back up a network of heterogeneous desktops automatically.


Note: what follows is the original sgvlug DEVSIG announcement.

Ever accidentally delete or overwrite a file you were working on? Ever lose data due to hard-disk failure? Or maybe you export shares to your windows-using friends--who proceed to get outlook viruses that twiddle a digit or two in all of their .xls files. Wouldn't it be nice if there were a /snapshot directory that you could go back to, which had complete images of the file system at semi-hourly intervals all day, then daily snapshots back a few days, and maybe a weekly snapshot too? What if every user could just go into that magical directory and copy deleted or overwritten files back into "reality", from the snapshot of choice, without any help from you? And what if that /snapshot directory were read-only, like a CD-ROM, so that nothing could touch it (except maybe root, but even then not directly)?

Best of all, what if you could make all of that happen automatically, using only one extra, slightly-larger, hard disk ? (Or one extra partition, which would protect against all of the above except disk failure).

In my lab, we have a proprietary NetApp file server which provides that sort of functionality to the end-users. It provides a lot of other things too, but it cost as much as a luxury SUV. It's quite appropriate for our heavy-use research lab, but it would be overkill for a home or small-office environment. But that doesn't mean small-time users have to do without!

I'll show you how I configured automatic, rotating snapshots on my $80 used Linux desktop machine (which is also a file, web, and mail server) using only a couple of one-page scripts and a few standard Linux utilities that you probably already have.

I'll also propose a related strategy which employs one (or two, for the wisely paranoid) extra low-end machines for a complete, responsible, automated backup strategy that eliminates tapes and manual labor and makes restoring files as easy as "cp".

Using rsync to make a backup

The rsync utility is a very well-known piece of GPL'd software, written originally by Andrew Tridgell and Paul Mackerras. If you have a common Linux or UNIX variant, then you probably already have it installed; if not, you can download the source code from . Rsync's specialty is efficiently synchronizing file trees across a network, but it works fine on a single machine too.


Suppose you have a directory called source , and you want to back it up into the directory destination . To accomplish that, you'd use:

rsync -a source/ destination/

(Note: I usually also add the -v (verbose) flag too so that rsync tells me what it's doing). This command is equivalent to:

cp -a source/. destination/

except that it's much more efficient if there are only a few differences.

Just to whet your appetite, here's a way to do the same thing as in the example above, but with destination on a remote machine, over a secure shell:

rsync -a -e ssh source/
Trailing Slashes Do Matter...Sometimes

This isn't really an article about rsync , but I would like to take a momentary detour to clarify one potentially confusing detail about its use. You may be accustomed to commands that don't care about trailing slashes. For example, if a and b are two directories, then cp -a a b is equivalent to cp -a a/ b/ . However, rsync does care about the trailing slash, but only on the source argument. For example, let a and b be two directories, with the file foo initially inside directory a . Then this command:

rsync -a a b

produces b/a/foo , whereas this command:

rsync -a a/ b

produces b/foo . The presence or absence of a trailing slash on the destination argument ( b , in this case) has no effect.

Using the --delete flag

If a file was originally in both source/ and destination/ (from an earlier rsync , for example), and you delete it from source/ , you probably want it to be deleted from destination/ on the next rsync . However, the default behavior is to leave the copy at destination/ in place. Assuming you want rsync to delete any file from destination/ that is not in source/ , you'll need to use the --delete flag:

rsync -a --delete source/ destination/
Be lazy: use cron

One of the toughest obstacles to a good backup strategy is human nature; if there's any work involved, there's a good chance backups won't happen. (Witness, for example, how rarely my roommate's home PC was backed up before I created this system). Fortunately, there's a way to harness human laziness: make cron do the work.

To run the rsync-with-backup command from the previous section every morning at 4:20 AM, for example, edit the root cron table: (as root)

crontab -e

Then add the following line:

20 4 * * * rsync -a --delete source/ destination/

Finally, save the file and exit. The backup will happen every morning at precisely 4:20 AM, and root will receive the output by email. Don't copy that example verbatim, though; you should use full path names (such as /usr/bin/rsync and /home/source/ ) to remove any ambiguity.

Incremental backups with rsync

Since making a full copy of a large filesystem can be a time-consuming and expensive process, it is common to make full backups only once a week or once a month, and store only changes on the other days. These are called "incremental" backups, and are supported by the venerable old dump and tar utilities, along with many others.

However, you don't have to use tape as your backup medium; it is both possible and vastly more efficient to perform incremental backups with rsync .

The most common way to do this is by using the rsync -b --backup-dir= combination. I have seen examples of that usage here , but I won't discuss it further, because there is a better way. If you're not familiar with hard links, though, you should first start with the following review.

Review of hard links

We usually think of a file's name as being the file itself, but really the name is a hard link . A given file can have more than one hard link to itself--for example, a directory has at least two hard links: the directory name and . (for when you're inside it). It also has one hard link from each of its sub-directories (the .. file inside each one). If you have the stat utility installed on your machine, you can find out how many hard links a file has (along with a bunch of other information) with the command:

stat filename

Hard links aren't just for directories--you can create more than one link to a regular file too. For example, if you have the file a , you can make a link called b :

ln a b

Now, a and b are two names for the same file, as you can verify by seeing that they reside at the same inode (the inode number will be different on your machine):

ls -i a
  232177 a
ls -i b
  232177 b

So ln a b is roughly equivalent to cp a b , but there are several important differences:

  1. The contents of the file are only stored once, so you don't use twice the space.
  2. If you change a , you're changing b , and vice-versa.
  3. If you change the permissions or ownership of a , you're changing those of b as well, and vice-versa.
  4. If you overwrite a by copying a third file on top of it, you will also overwrite b , unless you tell cp to unlink before overwriting. You do this by running cp with the --remove-destination flag. Notice that rsync always unlinks before overwriting!! . Note, added 2002.Apr.10: the previous statement applies to changes in the file contents only, not permissions or ownership.

But this raises an interesting question. What happens if you rm one of the links? The answer is that rm is a bit of a misnomer; it doesn't really remove a file, it just removes that one link to it. A file's contents aren't truly removed until the number of links to it reaches zero. In a moment, we're going to make use of that fact, but first, here's a word about cp .

Using cp -al

In the previous section, it was mentioned that hard-linking a file is similar to copying it. It should come as no surprise, then, that the standard GNU coreutils cp command comes with a -l flag that causes it to create (hard) links instead of copies (it doesn't hard-link directories, though, which is good; you might want to think about why that is). Another handy switch for the cp command is -a (archive), which causes it to recurse through directories and preserve file owners, timestamps, and access permissions.

Together, the combination cp -al makes what appears to be a full copy of a directory tree, but is really just an illusion that takes almost no space. If we restrict operations on the copy to adding or removing (unlinking) files--i.e., never changing one in place--then the illusion of a full copy is complete. To the end-user, the only differences are that the illusion-copy takes almost no disk space and almost no time to generate.

2002.05.15: Portability tip: If you don't have GNU cp installed (if you're using a different flavor of *nix, for example), you can use find and cpio instead. Simply replace cp -al a b with cd a && find . -print | cpio -dpl ../b . Thanks to Brage Førland for that tip.

Putting it all together

We can combine rsync and cp -al to create what appear to be multiple full backups of a filesystem without taking multiple disks' worth of space. Here's how, in a nutshell:

rm -rf backup.3
mv backup.2 backup.3
mv backup.1 backup.2
cp -al backup.0 backup.1
rsync -a --delete source_directory/  backup.0/

If the above commands are run once every day, then backup.0 , backup.1 , backup.2 , and backup.3 will appear to each be a full backup of source_directory/ as it appeared today, yesterday, two days ago, and three days ago, respectively--complete, except that permissions and ownerships in old snapshots will get their most recent values (thanks to J.W. Schultz for pointing this out). In reality, the extra storage will be equal to the current size of source_directory/ plus the total size of the changes over the last three days--exactly the same space that a full plus daily incremental backup with dump or tar would have taken.

Update (2003.04.23): As of rsync-2.5.6 , the --link-dest flag is now standard. Instead of the separate cp -al and rsync lines above, you may now write:

mv backup.0 backup.1
rsync -a --delete --link-dest=../backup.1 source_directory/  backup.0/

This method is preferred, since it preserves original permissions and ownerships in the backup. However, be sure to test it--as of this writing some users are still having trouble getting --link-dest to work properly. Make sure you use version 2.5.7 or later.

Update (2003.05.02): John Pelan writes in to suggest recycling the oldest snapshot instead of recursively removing and then re-creating it. This should make the process go faster, especially if your file tree is very large:

mv backup.3 backup.tmp
mv backup.2 backup.3
mv backup.1 backup.2
mv backup.0 backup.1
mv backup.tmp backup.0
cp -al backup.1/. backup.0
rsync -a --delete source_directory/ backup.0/

2003.06.02: OOPS! Rsync's link-dest option does not play well with J. Pelan's suggestion--the approach I previously had written above will result in unnecessarily large storage, because old files in backup.0 will get replaced and not linked. Please only use Dr. Pelan's directory recycling if you use the separate cp -al step; if you plan to use --link-dest , start with backup.0 empty and pristine. Apologies to anyone I've misled on this issue. Thanks to Kevin Everets for pointing out the discrepancy to me, and to J.W. Schultz for clarifying --link-dest 's behavior. Also note that I haven't fully tested the approach written above; if you have, please let me know. Until then, caveat emptor!

I'm used to dump or tar ! This seems backward!

The dump and tar utilities were originally designed to write to tape media, which can only access files in a certain order. If you're used to their style of incremental backup, rsync might seem backward. I hope that the following example will help make the differences clearer.

Suppose that on a particular system, backups were done on Monday night, Tuesday night, and Wednesday night, and now it's Thursday.

With dump or tar , the Monday backup is the big ("full") one. It contains everything in the filesystem being backed up. The Tuesday and Wednesday "incremental" backups would be much smaller, since they would contain only changes since the previous day. At some point (presumably next Monday), the administrator would plan to make another full dump.

With rsync, in contrast, the Wednesday backup is the big one. Indeed, the "full" backup is always the most recent one. The Tuesday directory would contain data only for those files that changed between Tuesday and Wednesday; the Monday directory would contain data for only those files that changed between Monday and Tuesday.

A little reasoning should convince you that the rsync way is much better for network-based backups, since it's only necessary to do a full backup once, instead of once per week. Thereafter, only the changes need to be copied. Unfortunately, you can't rsync to a tape, and that's probably why the dump and tar incremental backup models are still so popular. But in your author's opinion, these should never be used for network-based backups now that rsync is available.

Isolating the backup from the rest of the system

If you take the simple route and keep your backups in another directory on the same filesystem, then there's a very good chance that whatever damaged your data will also damage your backups. In this section, we identify a few simple ways to decrease your risk by keeping the backup data separate.

The easy (bad) way

In the previous section, we treated /destination/ as if it were just another directory on the same filesystem. Let's call that the easy (bad) approach. It works, but it has several serious limitations:

Fortunately, there are several easy ways to make your backup more robust.

Keep it on a separate partition

If your backup directory is on a separate partition, then any corruption in the main filesystem will not normally affect the backup. If the backup process runs out of disk space, it will fail, but it won't take the rest of the system down too. More importantly, keeping your backups on a separate partition means you can keep them mounted read-only; we'll discuss that in more detail in the next chapter.

Keep that partition on a separate disk

If your backup partition is on a separate hard disk, then you're also protected from hardware failure. That's very important, since hard disks always fail eventually, and often take your data with them. An entire industry has formed to service the needs of those whose broken hard disks contained important data that was not properly backed up.

Important : Notice, however, that in the event of hardware failure you'll still lose any changes made since the last backup. For home or small office users, where backups are made daily or even hourly as described in this document, that's probably fine, but in situations where any data loss at all would be a serious problem (such as where financial transactions are concerned), a RAID system might be more appropriate.

RAID is well-supported under Linux, and the methods described in this document can also be used to create rotating snapshots of a RAID system.

Keep that disk on a separate machine

If you have a spare machine, even a very low-end one, you can turn it into a dedicated backup server. Make it standalone, and keep it in a physically separate place--another room or even another building. Disable every single remote service on the backup server, and connect it only to a dedicated network interface on the source machine.

On the source machine, export the directories that you want to back up via read-only NFS to the dedicated interface. The backup server can mount the exported network directories and run the snapshot routines discussed in this article as if they were local. If you opt for this approach, you'll only be remotely vulnerable if:

  1. a remote root hole is discovered in read-only NFS, and
  2. the source machine has already been compromised.

I'd consider this "pretty good" protection, but if you're (wisely) paranoid, or your job is on the line, build two backup servers. Then you can make sure that at least one of them is always offline.

If you're using a remote backup server and can't get a dedicated line to it (especially if the information has to cross somewhere insecure, like the public internet), you should probably skip the NFS approach and use rsync -e ssh instead.

It has been pointed out to me that rsync operates far more efficiently in server mode than it does over NFS, so if the connection between your source and backup server becomes a bottleneck, you should consider configuring the backup machine as an rsync server instead of using NFS. On the downside, this approach is slightly less transparent to users than NFS--snapshots would not appear to be mounted as a system directory, unless NFS is used in that direction, which is certainly another option (I haven't tried it yet though). Thanks to Martin Pool, a lead developer of rsync , for making me aware of this issue.

Here's another example of the utility of this approach--one that I use. If you have a bunch of windows desktops in a lab or office, an easy way to keep them all backed up is to share the relevant files, read-only, and mount them all from a dedicated backup server using SAMBA. The backup job can treat the SAMBA-mounted shares just like regular local directories.

Making the backup as read-only as possible

In the previous section, we discussed ways to keep your backup data physically separate from the data they're backing up. In this section, we discuss the other side of that coin--preventing user processes from modifying backups once they're made.

We want to avoid leaving the snapshot backup directory mounted read-write in a public place. Unfortunately, keeping it mounted read-only the whole time won't work either--the backup process itself needs write access. The ideal situation would be for the backups to be mounted read-only in a public place, but at the same time, read-write in a private directory accessible only by root, such as /root/snapshot .

There are a number of possible approaches to the challenge presented by mounting the backups read-only. After some amount of thought, I found a solution which allows root to write the backups to the directory but only gives the users read permissions. I'll first explain the other ideas I had and why they were less satisfactory.

It's tempting to keep your backup partition mounted read-only as /snapshot most of the time, but unmount that and remount it read-write as /root/snapshot during the brief periods while snapshots are being made. Don't give in to temptation!.

Bad: mount / umount

A filesystem cannot be unmounted if it's busy--that is, if some process is using it. The offending process need not be owned by root to block an unmount request. So if you plan to umount the read-only copy of the backup and mount it read-write somewhere else, don't--any user can accidentally (or deliberately) prevent the backup from happening. Besides, even if blocking unmounts were not an issue, this approach would introduce brief intervals during which the backups would seem to vanish, which could be confusing to users.

Better: mount read-only most of the time

A better but still-not-quite-satisfactory choice is to remount the directory read-write in place:

mount -o remount,rw /snapshot
[ run backup process ]
mount -o remount,ro /snapshot

Now any process that happens to be in /snapshot when the backups start will not prevent them from happening. Unfortunately, this approach introduces a new problem--there is a brief window of vulnerability, while the backups are being made, during which a user process could write to the backup directory. Moreover, if any process opens a backup file for writing during that window, it will prevent the backup from being remounted read-only, and the backups will stay vulnerable indefinitely.

Tempting but doesn't seem to work: the 2.4 kernel's mount --bind

Starting with the 2.4-series Linux kernels, it has been possible to mount a filesystem simultaneously in two different places. "Aha!" you might think, as I did. "Then surely we can mount the backups read-only in /snapshot , and read-write in /root/snapshot at the same time!"

Alas, no. Say your backups are on the partition /dev/hdb1 . If you run the following commands,

mount /dev/hdb1 /root/snapshot
mount --bind -o ro /root/snapshot /snapshot

then (at least as of the 2.4.9 Linux kernel--updated, still present in the 2.4.20 kernel), mount will report /dev/hdb1 as being mounted read-write in /root/snapshot and read-only in /snapshot , just as you requested. Don't let the system mislead you!

It seems that, at least on my system, read-write vs. read-only is a property of the filesystem, not the mount point. So every time you change the mount status, it will affect the status at every point the filesystem is mounted, even though neither /etc/mtab nor /proc/mounts will indicate the change.

In the example above, the second mount call will cause both of the mounts to become read-only, and the backup process will be unable to run. Scratch this one.

Update: I have it on fairly good authority that this behavior is considered a bug in the Linux kernel, which will be fixed as soon as someone gets around to it. If you are a kernel maintainer and know more about this issue, or are willing to fix it, I'd love to hear from you!

My solution: using NFS on localhost

This is a bit more complicated, but until Linux supports mount --bind with different access permissions in different places, it seems like the best choice. Mount the partition where backups are stored somewhere accessible only by root, such as /root/snapshot . Then export it, read-only, via NFS, but only to the same machine. That's as simple as adding the following line to /etc/exports :


then start nfs and portmap from /etc/rc.d/init.d/ . Finally mount the exported directory, read-only, as /snapshot :

mount -o ro /snapshot

And verify that it all worked:

/dev/hdb1 on /root/snapshot type ext3 (rw) on /snapshot type nfs (ro,addr=

At this point, we'll have the desired effect: only root will be able to write to the backup (by accessing it through /root/snapshot ). Other users will see only the read-only /snapshot directory. For a little extra protection, you could keep mounted read-only in /root/snapshot most of the time, and only remount it read-write while backups are happening.

Damian Menscher pointed out this CERT advisory which specifically recommends against NFS exporting to localhost, though since I'm not clear on why it's a problem, I'm not sure whether exporting the backups read-only as we do here is also a problem. If you understand the rationale behind this advisory and can shed light on it, would you please contact me? Thanks!

Extensions: hourly, daily, and weekly snapshots

With a little bit of tweaking, we make multiple-level rotating snapshots. On my system, for example, I keep the last four "hourly" snapshots (which are taken every four hours) as well as the last three "daily" snapshots (which are taken at midnight every day). You might also want to keep weekly or even monthly snapshots too, depending upon your needs and your available space.

Keep an extra script for each level

This is probably the easiest way to do it. I keep one script that runs every four hours to make and rotate hourly snapshots, and another script that runs once a day rotate the daily snapshots. There is no need to use rsync for the higher-level snapshots; just cp -al from the appropriate hourly one.

Run it all with cron

To make the automatic snapshots happen, I have added the following lines to root's crontab file:

0 */4 * * * /usr/local/bin/
0 13 * * *  /usr/local/bin/

They cause to be run every four hours on the hour and to be run every day at 13:00 (that is, 1:00 PM). I have included those scripts in the appendix.

If you tire of receiving an email from the cron process every four hours with the details of what was backed up, you can tell it to send the output of to /dev/null , like so:

0 */4 * * * /usr/local/bin/ >/dev/null 2>&1

Understand, though, that this will prevent you from seeing errors if cannot run for some reason, so be careful with it. Creating a third script to check for any unusual behavior in the snapshot periodically seems like a good idea, but I haven't implemented it yet. Alternatively, it might make sense to log the output of each run, by piping it through tee , for example. mRgOBLIN wrote in to suggest a better (and obvious, in retrospect!) approach, which is to send stdout to /dev/null but keep stderr, like so:

0 */4 * * * /usr/local/bin/ >/dev/null

Presto! Now you only get mail when there's an error. :)

Appendix: my actual configuration

I know that listing my actual backup configuration here is a security risk; please be kind and don't use this information to crack my site. However, I'm not a security expert, so if you see any vulnerabilities in my setup, I'd greatly appreciate your help in fixing them. Thanks!

I actually use two scripts, one for every-four-hours (hourly) snapshots, and one for every-day (daily) snapshots. I am only including the parts of the scripts that relate to backing up /home , since those are relevant ones here.

I use the NFS-to-localhost trick of exporting /root/snapshot read-only as /snapshot , as discussed above.

The system has been running without a hitch for months.

Listing one:
# ----------------------------------------------------------------------
# mikes handy rotating-filesystem-snapshot utility
# ----------------------------------------------------------------------
# this needs to be a lot more general, but the basic idea is it makes
# rotating backup-snapshots of /home whenever called
# ----------------------------------------------------------------------

unset PATH	# suggestion from H. Milz: avoid accidental use of $PATH

# ------------- system commands used by this script --------------------



# ------------- file locations -----------------------------------------


# ------------- the script itself --------------------------------------

# make sure we're running as root
if (( `$ID -u` != 0 )); then { $ECHO "Sorry, must be root.  Exiting..."; exit; } fi

# attempt to remount the RW mount point as RW; else abort
if (( $? )); then
	$ECHO "snapshot: could not remount $SNAPSHOT_RW readwrite";

# rotating snapshots of /home (fixme: this should be more general)

# step 1: delete the oldest snapshot, if it exists:
if [ -d $SNAPSHOT_RW/home/hourly.3 ] ; then			\
$RM -rf $SNAPSHOT_RW/home/hourly.3 ;				\
fi ;

# step 2: shift the middle snapshots(s) back by one, if they exist
if [ -d $SNAPSHOT_RW/home/hourly.2 ] ; then			\
$MV $SNAPSHOT_RW/home/hourly.2 $SNAPSHOT_RW/home/hourly.3 ;	\
if [ -d $SNAPSHOT_RW/home/hourly.1 ] ; then			\
$MV $SNAPSHOT_RW/home/hourly.1 $SNAPSHOT_RW/home/hourly.2 ;	\

# step 3: make a hard-link-only (except for dirs) copy of the latest snapshot,
# if that exists
if [ -d $SNAPSHOT_RW/home/hourly.0 ] ; then			\
$CP -al $SNAPSHOT_RW/home/hourly.0 $SNAPSHOT_RW/home/hourly.1 ;	\

# step 4: rsync from the system into the latest snapshot (notice that
# rsync behaves like cp --remove-destination by default, so the destination
# is unlinked first.  If it were not so, this would copy over the other
# snapshot(s) too!
$RSYNC								\
	-va --delete --delete-excluded				\
	--exclude-from="$EXCLUDES"				\
	/home/ $SNAPSHOT_RW/home/hourly.0 ;

# step 5: update the mtime of hourly.0 to reflect the snapshot time
$TOUCH $SNAPSHOT_RW/home/hourly.0 ;

# and thats it for home.

# now remount the RW snapshot mountpoint as readonly

if (( $? )); then
	$ECHO "snapshot: could not remount $SNAPSHOT_RW readonly";
} fi;

As you might have noticed above, I have added an excludes list to the rsync call. This is just to prevent the system from backing up garbage like web browser caches, which change frequently (so they'd take up space in every snapshot) but would be no loss if they were accidentally destroyed.

Listing two:
# ----------------------------------------------------------------------
# mikes handy rotating-filesystem-snapshot utility: daily snapshots
# ----------------------------------------------------------------------
# intended to be run daily as a cron job when hourly.3 contains the
# midnight (or whenever you want) snapshot; say, 13:00 for 4-hour snapshots.
# ----------------------------------------------------------------------

unset PATH

# ------------- system commands used by this script --------------------


# ------------- file locations -----------------------------------------


# ------------- the script itself --------------------------------------

# make sure we're running as root
if (( `$ID -u` != 0 )); then { $ECHO "Sorry, must be root.  Exiting..."; exit; } fi

# attempt to remount the RW mount point as RW; else abort
if (( $? )); then
	$ECHO "snapshot: could not remount $SNAPSHOT_RW readwrite";

# step 1: delete the oldest snapshot, if it exists:
if [ -d $SNAPSHOT_RW/home/daily.2 ] ; then			\
$RM -rf $SNAPSHOT_RW/home/daily.2 ;				\
fi ;

# step 2: shift the middle snapshots(s) back by one, if they exist
if [ -d $SNAPSHOT_RW/home/daily.1 ] ; then			\
$MV $SNAPSHOT_RW/home/daily.1 $SNAPSHOT_RW/home/daily.2 ;	\
if [ -d $SNAPSHOT_RW/home/daily.0 ] ; then			\
$MV $SNAPSHOT_RW/home/daily.0 $SNAPSHOT_RW/home/daily.1;	\

# step 3: make a hard-link-only (except for dirs) copy of
# hourly.3, assuming that exists, into daily.0
if [ -d $SNAPSHOT_RW/home/hourly.3 ] ; then			\
$CP -al $SNAPSHOT_RW/home/hourly.3 $SNAPSHOT_RW/home/daily.0 ;	\

# note: do *not* update the mtime of daily.0; it will reflect
# when hourly.3 was made, which should be correct.

# now remount the RW snapshot mountpoint as readonly

if (( $? )); then
	$ECHO "snapshot: could not remount $SNAPSHOT_RW readonly";
} fi;
Sample output of ls -l /snapshot/home
total 28
drwxr-xr-x   12 root     root         4096 Mar 28 00:00 daily.0
drwxr-xr-x   12 root     root         4096 Mar 27 00:00 daily.1
drwxr-xr-x   12 root     root         4096 Mar 26 00:00 daily.2
drwxr-xr-x   12 root     root         4096 Mar 28 16:00 hourly.0
drwxr-xr-x   12 root     root         4096 Mar 28 12:00 hourly.1
drwxr-xr-x   12 root     root         4096 Mar 28 08:00 hourly.2
drwxr-xr-x   12 root     root         4096 Mar 28 04:00 hourly.3

Notice that the contents of each of the subdirectories of /snapshot/home/ is a complete image of /home at the time the snapshot was made. Despite the w in the directory access permissions, no one--not even root--can write to this directory; it's mounted read-only.

Bugs Maintaining Permissions and Owners in the snapshots

The snapshot system above does not properly maintain old ownerships/permissions; if a file's ownership or permissions are changed in place, then the new ownership/permissions will apply to older snapshots as well. This is because rsync does not unlink files prior to changing them if the only changes are ownership/permission. Thanks to J.W. Schultz for pointing this out. Using his new --link-dest option, it is now trivial to work around this problem. See the discussion in the Putting it all together section of Incremental backups with rsync , above.

mv updates timestamp bug

Apparently, a bug in some Linux kernels between 2.4.4 and 2.4.9 causes mv to update timestamps; this may result in inaccurate timestamps on the snapshot directories. Thanks to Claude Felizardo for pointing this problem out. He was able to work around the problem my replacing mv with the following script:

function my_mv() {
   touch -r $1 $REF;
   /bin/mv $1 $2;
   touch -r $REF $2;
   /bin/rm $REF;
Windows-related problems

I have recently received a few reports of what appear to be interaction issues between Windows and rsync.

One report came from a user who mounts a windows share via Samba, much as I do, and had files mysteriously being deleted from the backup even when they weren't deleted from the source. Tim Burt also used this technique, and was seeing files copied even when they hadn't changed. He determined that the problem was modification time precision; adding --modify-window=10 caused rsync to behave correctly in both cases. If you are rsync'ing from a SAMBA share, you must add --modify-window=10 or you may get inconsistent results. Update: --modify-window=1 should be sufficient. Yet another update: the problem appears to still be there. Please let me know if you use this method and files which should not be deleted are deleted.

Also, for those who use rsync directly on cygwin, there are some known problems, apparently related to cygwin signal handling. Scott Evans reports that rsync sometimes hangs on large directories. Jim Kleckner informed me of an rsync patch, discussed here and here , which seems to work around this problem. I have several reports of this working, and two reports of it not working (the hangs continue). However, one of the users who reported a negative outcome, Greg Boyington, was able to get it working using Craig Barrett's suggested sleep() approach, which is documented here .

Memory use in rsync scales linearly with the number of files being sync'd. This is a problem when syncing large file trees, especially when the server involved does not have a lot of RAM. If this limitation is more of an issue to you than network speed (for example, if you copy over a LAN), you may wish to use mirrordir instead. I haven't tried it personally, but it looks promising. Thanks to Vladimir Vuksan for this tip!

Contributed codes

Several people have been kind enough to send improved backup scripts. There are a number of good ideas here, and I hope they'll save you time when you're ready to design your own backup plan. Disclaimer: I have not necessarily tested these; make sure you check the source code and test them thoroughly before use!

References Frequently Asked Questions

[Feb 04, 2017] 20 Unix Command Line Tricks – Part I

Feb 04, 2017 |
Build directory trees in a single command

You can create directory trees one at a time using mkdir command by passing the -p option:

mkdir -p /jail/{dev,bin,sbin,etc,usr,lib,lib64}
ls -l /jail
> > > >

[Dec 26, 2016] Here is my top 5 backup tools in Linux

March 26, 2016 OSTechNix

Data is the backbone of a Company. So, performing backup on regular intervals is one of the vital role of a system administrator. Here is my favourite five backup tools that I use mostly. I won't say these are the best, but these are the backup tools which I considered first when it comes to data backup.

Let me explain some of my preferred backup tools.


BACULA is a power full backup tool . It is easy to use and efficient in recovering of loss data and damaged files in the local system and remotely. It having rich user interface( UI ) . It works on different cross platforms like windows, and Mac OS X.

Concerning about BACULA features, I can list the following:

  1. SD-SD replication.
  2. Enterprise binaries avaliable for univention.
  3. Restore performance improved for hard data files.
  4. Periodic status on running jobs in Director status report.

BACULA has the following components.


FWBACKUPS is the easiest of all backup tools in linux. It having the rich user interface, and also it is a cross platform tool.

One of the notable feature of FWBACKUPS is remote backup. We can backup data from various systems remotely.

FWBACKUPS having some features are listed below.

  1. Simple Interface – Backup and restoring the documents is simple for user.
  2. Cross – platform – It's supports different platforms like windows, and Mac OS X. It restores the data on one system and restores into another system.
  3. Remote backup – All types of files can handle remotely.
  4. Scheduled Backups – Run a backup once or periodically.
  5. Speed – Backups moves faster by copying only the changes.
  6. Organized and clean – It takes care about organized data and removal of expired one. It list the backup to restore from which list of date.


RSYNC is a widely used tool for backups in linux. It is a command line backup tool. RSYNC is used to collect data remotely and locally. It is mainly used for automated backup. We can automate backup jobs with scripts.

Some of the notable features are listed below:

  1. It can update whole directory trees and filesystems.
  2. It uses ssh, rsh or direct sockets as the transport.
  3. Supports anonymous rsync which is ideal for mirroring.
  4. We can set bandwidth limit and file size.

URBACKUP is a client/server backup system. It's efficient in client/server backup system for both windows and linux environments. File and image backups are made while the system is running without interrupting current process.

Here is the some features of this tool:

  1. whole partition can be saved as single directory.
  2. Image and file backup are made while system is running.
  3. Fast file and image transmission.
  4. Clients have the flexibility to change the settings like backup frequency. Next to no configuration.
  5. Web interface of URBACKUP is good in showing the status of the clients, current status of backup issues.


BACKUP PC is high performance, enterprise-grade backup tool. It is a high configurable and easy to install, use and maintain.

It reduces the cost of the disks and raid system. BACKUP PC is written in perl language and extracts data using Samba service.

It is robust, reliable, well documented and freely available as open source on Sourceforge .


  1. No client side software needed. The standard smb protocol is used to extract backup data.
  2. A powerful web interface provides log details to view log files, configuration, current status and allows user to initiate and cancelled backups and browse and restore files from backups.
  3. It supports mobile environment where laptops are only intermittently connected to the network and have dynamic IP address.
  4. Users will receive email remainders if their pc has not recently been backed up.
  5. Open source and freely available under GPL.

These are the top backup tools that I use mostly. What's your favourite? Let us know in the comment section below.

Thanks for stopping by.


[Nov 05, 2016] Relax-and-Recover – Freecode

Nov 05, 2016 |

Relax-and-Recover (Rear) is a bare metal disaster recovery and system migration solution, similar to AIX mksysb or HP-UX ignite. It is composed of a modular framework and ready-to-go workflows for many common situations to produce a bootable image and restore from backup using this image. It can restore to different hardware, and can therefore be used as a migration tool as well. It supports various boot media (including tape, USB, or eSATA storage, ISO, PXE, etc.), a variety of network protocols (including SFTP, FTP, HTTP, NFS, and CIFS), as well as a multitude of backup strategies (including IBM TSM, HP DataProtector, Symantec NetBackup, Bacula, and rsync). It was designed to be easy to set up, requires no maintenance, and is there to assist when disaster strikes. Recovering from disaster is made very straight-forward by a 2-step recovery process so that it can be executed by operational teams when required. When used interactively (e.g. when used for migrating systems), menus help make decisions to restore to a new (hardware) environment.

Release Notes: Integrated with duply/duplicity support. systemd support has been added. Various small fixes and improvements to tape support, Xen, PPC, Gentoo, Fedora, multi-arch, storage ... layout configuration, and serial console integration.


Release Notes: This release adds support for multipathing, adds several improvements to distribution backward compatibility, improves ext4 support, makes various bugfixes, migrates HWADDR ... after rescovery, and includes better systemd support.


Release Notes: Multi-system and multi-copy support on USB storage devices. Basic rsync backup support. More extensive exclude options. The new layout code is enabled by default. Support ... for Arch Linux. Improved multipath support. Experimental btrfs support.


Release Notes: Standardization of the command line. The default is quiet output; use the option -v for the old behavior. Boot images now have a comprehensive boot menu. Support for IPv6 ... addresses. Restoring NBU backup from a point in time is supported. Support for Fedora 15 (systemd) and RHEL6/SL6. Improved handling of HP SmartArray. Support for ext4 on RHEL5/SL5. Support for Xen paravirtualization. Integration with the local GRUB menu. Boot images can now be centralized through network transfers. Support for udev on RHEL4. Many small improvements and performance enhancements.


Release Notes: This release supports many recent distributions, including "upstart" (Ubuntu 7.10). It has more IA-64 support (RHEL5 only at the moment), better error reporting and catching, ... Debian packages (mkdeb), and improved TSM support.


[Nov 05, 2016] Relax and Recover – How Did I Do That

21 August 2014

Start a backup on the CentOS machine

Add the following lines to /etc/rear/local.conf:

BACKUP_PROG_EXCLUDE=( '/tmp/*' '/dev/shm/*' )

BACKUP_PROG_EXCLUDE=( '/tmp/*' '/dev/shm/*' )

Now make a backup

[root@centos7 ~]# rear mkbackup -v
Relax-and-Recover 1.16.1 / Git
Using log file: /var/log/rear/rear-centos7.log
mkdir: created directory '/var/lib/rear/output'
Creating disk layout
Creating root filesystem layout
TIP: To login as root via ssh you need to set up /root/.ssh/authorized_keys or SSH_ROOT_PASSWORD in your configuration file
Copying files and directories
Copying binaries and libraries
Copying kernel modules
Creating initramfs
Making ISO image
Wrote ISO image: /var/lib/rear/output/rear-centos7.iso (90M)
Copying resulting files to nfs location
Encrypting disabled
Creating tar archive '/tmp/rear.QnDt1Ehk25Vqurp/outputfs/centos7/2014-08-21-1548-F.tar.gz'
Archived 406 MiB [avg 3753 KiB/sec]OK
Archived 406 MiB in 112 seconds [avg 3720 KiB/sec]

Now look on your NFS server

You'll see all the files you'll need to perform the disaster recovery.

total 499M
drwxr-x- 2 root root 4.0K Aug 21 23:51 .
drwxr-xr-x 3 root root 4.0K Aug 21 23:48 ..
-rw--- 1 root root 407M Aug 21 23:51 2014-08-21-1548-F.tar.gz
-rw--- 1 root root 2.2M Aug 21 23:51 backup.log
-rw--- 1 root root 202 Aug 21 23:49 README
-rw--- 1 root root 90M Aug 21 23:49 rear-centos7.iso
-rw--- 1 root root 161K Aug 21 23:49 rear.log
-rw--- 1 root root 0 Aug 21 23:51 selinux.autorelabel
-rw--- 1 root root 277 Aug 21 23:49 VERSION

Author: masterdam79

You can also connect with me on Google+ View all posts by masterdam79

Author masterdam79/
Posted on 21 August 2014/

dheeraj says:

31 August 2016 at 02:26

is it possible to give list of directories or mount points while giving mkbackup to exclude from backup. Like giving a file with list of all directories that need to be excluded ??

masterdam79 says:

26 September 2016 at 21:50

Have a look at
Should be possible if you ask me.

[Nov 04, 2016] Coding Style rear-rear Wiki

Reading rear sources is an interesting exercise. It really demonstrates attempt to use "reasonable' style of shell programming and you can learn a lot.
Nov 04, 2016 |

Relax-and-Recover is written in Bash (at least bash version 3 is needed), a language that can be used in many styles. We want to make it easier for everybody to understand the Relax-and-Recover code and subsequently to contribute fixes and enhancements.

Here is a collection of coding hints that should help to get a more consistent code base.

Don't be afraid to contribute to Relax-and-Recover even if your contribution does not fully match all this coding hints. Currently large parts of the Relax-and-Recover code are not yet in compliance with this coding hints. This is an ongoing step by step process. Nevertheless try to understand the idea behind this coding hints so that you know how to break them properly (i.e. "learn the rules so you know how to break them properly").

The overall idea behind this coding hints is:

Make yourself understood

Make yourself understood to enable others to fix and enhance your code properly as needed.

From this overall idea the following coding hints are derived.

For the fun of it an extreme example what coding style should be avoided:

#!/bin/bash for i in `seq 1 2 $((2*$1-1))`;do echo $((j+=i));done


Try to find out what that code is about - it does a useful thing.

Code must be easy to read Code should be easy to understand

Do not only tell what the code does (i.e. the implementation details) but also explain what the intent behind is (i.e. why ) to make the code maintainable.

Here the initial example so that one can understand what it is about:

#!/bin/bash # output the first N square numbers # by summing up the first N odd numbers 1 3 ... 2*N-1 # where each nth partial sum is the nth square number # see # this way it is a little bit faster for big N compared to # calculating each square number on its own via multiplication N=$1 if ! [[ $N =~ ^[0-9]+$ ]] ; then echo "Input must be non-negative integer." 1>&2 exit 1 fi square_number=0 for odd_number in $( seq 1 2 $(( 2 * N - 1 )) ) ; do (( square_number += odd_number )) && echo $square_number done

Now the intent behind is clear and now others can easily decide if that code is really the best way to do it and easily improve it if needed.

Try to care about possible errors

By default bash proceeds with the next command when something failed. Do not let your code blindly proceed in case of errors because that could make it hard to find the root cause of a failure when it errors out somewhere later at an unrelated place with a weird error message which could lead to false fixes that cure only a particular symptom but not the root cause.

Maintain Backward Compatibility

Implement adaptions and enhancements in a backward compatible way so that your changes do not cause regressions for others.

Dirty hacks welcome

When there are special issues on particular systems it is more important that the Relax-and-Recover code works than having nice looking clean code that sometimes fails. In such special cases any dirty hacks that intend to make it work everywhere are welcome. But for dirty hacks the above listed coding hints become mandatory rules:

For example a dirty hack like the following is perfectly acceptable:

# FIXME: Dirty hack to make it work # on "FUBAR Linux version 666" # where COMMAND sometimes inexplicably fails # but always works after at most 3 attempts # see # Retries should have no bad effect on other systems # where the first run of COMMAND works. COMMAND || COMMAND || COMMAND || Error "COMMAND failed."

Character Encoding

Use only traditional (7-bit) ASCII charactes. In particular do not use UTF-8 encoded multi-byte characters.

Text Layout Variables Functions Relax-and-Recover functions

Use the available Relax-and-Recover functions when possible instead of re-implementing basic functionality again and again. The Relax-and-Recover functions are implemented in various lib/* files .

test, [, [[, (( Paired parenthesis See also

Admin's Choice - Solaris & Unix Discussion Forums BTIPS Backup commands - ufsdump , tar , cpio-B

Identifying the tape device
dmesg | grep st

Checking the status of the tape drive
mt -f /dev/rmt/0 status

Backup file system using ufsdump
ufsdump 0cvf /dev/rmt/0 /dev/rdsk/c0t0d0s0
ufsdump 0cvf /dev/rmt/0 /usr

To restore a dump with ufsrestore
ufsrestore rvf /dev/rmt/0

ufsrestore in interactive mode allowing selection of individual files and directories using add , ls , cd , pwd and extract commands .
ufsrestore -i /dev/rmt/0

Making a copy of a disk slice using ufsdump
ufsdump 0f - /dev/rdsk/c0t0d0s7 |(cd /mnt/backup ;ufsrestore xf -)

Backing up all files in a directory including subdirectories to a tape device (/dev/rmt/0),
tar cvf /dev/rmt/0 *

Viewing a tar backup on a tape
tar tvf /dev/rmt/0

Extracting tar backup from the tape
tar xvf /dev/rmt/0
(Restoration will go to present directory or original backup path depending on relative or absolute path names used for backup )

Backup using cpio
find . -depth -print | cpio -ovcB > /dev/rmt/0

Viewing cpio files on a tape
cpio -ivtB < /dev/rmt/0

Restoring a cpio backup
cpio -ivcB < /dev/rmt/0

Compressing a file
compress -v file_name

gzip filename
To uncompress a file
uncompress file_name.Z
gunzip filename

The Solaris Companion: Reliable and Practical Root Disk MirroringSys Admin Magazinecolumn by Peter Baer Galvin

The Best of All Worlds
The solution is to combine these two products. Through quite a bit of work, you can use Disksuite to mirror the root disks, but carve out a small partition and make that the rootdg. The effort is worth while, as this solution meets all four of the criteria:

A Tool for Cold Mirroring of Solaris System Disks

If the system disks (/, /usr, /var file systems) are on RAID and for example the raid controller (or fiber cable) fails, you have a problem, unless the RAID is fully redundant. Also, cold mirroring is simpler, and software RAID can be difficult to recover when the system disk fails.

For some servers, I prefer to put system (and certain data) files on a "normal" disk and mirror to a second disk once or twice a week ("cold mirroring"). If the boot disk dies, we simply boot from the mirror disk. This solution is easier to understand, to recover from in a disaster scenario, and system disks can be more easily added/removed/changed.

In addition, files changed by accident can be recovered since the last mirror run, and deleted files can be recovered until the disk fills up and needs to be wiped clean. More details are provided below.

Each night the offline disk is mounted and synchronized with the primary disk. The script is called from the root cron nightly. It mounts the spare disk under /newroot, copies all file systems, installs a boot block and copies over a new vfstab. This creates a fully updated bootable spare disk. The results of the script are sent to the administrator via email (sample output is mirror_output.txt).

Linux and Solaris ACLs - Backup


POSIX 1003.1-2001 defines a backup utility called pax, and along with that utility, a revised archive format that is to a large degree backwards compatible with tar's archive format. This format is extensible and can contain vendor specific extensions. Additional information that is added to this format is stored in extended headers.

The Star tape archiver uses this backup format for Access Control Lists.

Star tape archiver

The Star tape archiver by Jörg Schilling, available at, since version 1.4a07 supports backing up and restoring of POSIX Access Control Lists. For best results, it is recommended to use a recent star-1.5 version. Star is compatible with SUSv2 tar (UNIX-98 tar), understands the GNU tar archive extensions, and can generate pax archives.

Getting and building Star

Star snapshots are available at

Solaris always includes ACL support in the base OS since Solaris-2.5, but before building Star on Linux, you first need to install the ACL utilities. The ACL utilities in Linux include the ACL library, which Star depends on.

To build Star under Linux, unpack the Star archive, change into the star-1.5 directory, and invoke make. The Star package of course contains more detailed information.

Backing up and restoring with Star

Star supports all command line parameters defined for SUSv2 tar (UNIX-98 tar). There are some differences with GNU tar, for which mostly GNU tar is to blame. Archives can be created as follows. The H=exustar option tells star to create an extended pax archive. The Option -acl tells star to include ACLs in extended headers, for those files that have ACLs.

star H=exustar -acl -c path > archive.tar

Archives can be restored as shown below.

star -acl -x < archive.tar

The archive format Star uses for Access Control Lists

Since no official backup format for POSIX access control lists has been defined, Star uses the vendor defined attributes SCHILY.acl.access and SCHILY.acl.default for storing the ACL and Default ACL of a file, respectively. The access control lists are stored in the short text form as defined in POSIX 1003.1e draft standard 17. To each named user ACL entry a fourth colon separated field field containing the user identifier (UID) of the associated user is appended. To each named group entry a fourth colon separated field containing the group identifier (GID) of the associated group is appended. (POSIX 1003.1e draft standard 17 allows to add fields to ACL entries.)

This is an example of the format used (lines broken for readability, additional fields highlighted):

SCHILY.acl.access= user::rwx,user:lisa:r-x:502,group::r-x, \

SCHILY.acl.default= user::rwx,user:lisa:r-x:502,group::r-x, \

The numerical user and group identifiers are essential when restoring a system completely from a backup, as initially the name-to-identifier mappings may not be available, and then file ownership restoration would not work.

As the archive format that is used for backing up access control lists is compatible with the pax archive format, archives created that way can be restored by star or a POSIX.1-2001 compliant pax. Note that programs other than star will ignore the ACL information.

A Solaris Backup Script How-To

This paper will focus on the backup script and will detail a flexible backup script that uses built-in Solaris software tools which create a reliable local backup ...

Recommended Links

Softpanorama hot topic of the month

Softpanorama Recommended

Top articles


Restoring a Sun system using JumpStart technology -- Online Catalog Essential System Administration, 3rd Edition Chapter 11: Backup and Restore (PDF)

Torture-testing Backup and Archive Programs Things You Ought to Know But Probably Would Rather Not Appendix A Tables of Evaluations

Sys Admin Magazinev12, i11 More Truth about Tapes, Backups, and Restores

Sys Admin Magazinev12, i07 The Truth about Tapes, Backups, and Restores

Elizabeth D. Zwicky's backup test suite (mirror)

Protecting File Systems: A Survey of Backup Techniques (1998)

Welcome to the Free pax utilities site Project details for S tar by Jörg Schilling

Star is a very fast, POSIX-compliant tar archiver. It saves many files together into a single tape or disk archive, and can restore individual files from the archive. It includes command line interfaces for the "tar", "Sun-Tar", "cpio", "pax", and "gnutar" command-line syntax. It includes a FIFO for speed, a pattern matcher, multi-volume support, the ability to archive sparse files and ACLs, the ability to archive extended file flags, automatic archive format detection, automatic byte order recognition, automatic archive compression/decompression, remote archives, and special features that allow star to be used for full and incremental backups. It includes the only known platform independent "rmt" server program.

Tutorials -- Online Catalog Essential System Administration, 3rd Edition Chapter 11: Backup and Restore (PDF)


The Solaris version of tar includes extra options. The -I option allows a list of files and directories that are backed up to be put into a text file. The -X option allows an exclusion file to be specified that lists the names of files and directories that should be skipped.

The Solaris version of mt supports an asf subcommand which moves the tape to the nth file. n being the number of the file.

Backups Under Solaris

File and Archiving Commands

The standard UNIX archiving utility. Originally a Tape ARchiving program, it has developed into a general purpose package that can handle all manner of archiving with all types of destination devices, ranging from tape drives to regular files to even stdout (see Example 4-3). GNU tar has long since been patched to accept gzip compression options, such as tar czvf archive-name.tar.gz *, which recursively archives and compresses all files (except dotfiles) in a directory tree.

Some useful tar options:

  1. -c create (a new archive)
  2. --delete delete (files from the archive)
  3. -r append (files to the archive)
  4. -t list (archive contents)
  5. -u update archive
  6. -x extract (files from the archive)
  7. -z gzip the archive
It may be difficult to recover data from a corrupted gzipped tar archive. When archiving important files, make multiple backups.
Shell archiving utility. The files in a shell archive are concatenated without compression, and the resultant archive is essentially a shell script, complete with #!/bin/sh header, and containing all the necessary unarchiving commands. Shar archives still show up in Internet newsgroups, but otherwise shar has been pretty well replaced by tar/gzip. The unshar command unpacks shar archives.
Creation and manipulation utility for archives, mainly used for binary object file libraries.
This specialized archiving copy command is rarely seen any more, having been supplanted by tar/gzip. It still has its uses, such as moving a directory tree.

Example 12-21. Using cpio to move a directory tree


# Copying a directory tree using cpio.


if [ $# -ne "$ARGS" ]
  echo "Usage: `basename $0` source destination"
  exit $E_BADARGS


find "$source" -depth | cpio -admvp "$destination"
# Read the man page to decipher these cpio options.

exit 0
Example 12-22. Unpacking an rpm archive
# Unpack an 'rpm' archive

TEMPFILE=$$.cpio                         # Tempfile with "unique" name.
                                         # $$ is process ID of script.

if [ -z "$1" ] 
  echo "Usage: `basename $0` filename"
exit $E_NO_ARGS

rpm2cpio < $1 > $TEMPFILE                # Converts rpm archive into cpio archive.
cpio --make-directories -F $TEMPFILE -i  # Unpacks cpio archive.
rm -f $TEMPFILE                          # Deletes cpio archive.

exit 0
The standard GNU/UNIX compression utility, replacing the inferior and proprietary compress. The corresponding decompression command is gunzip, which is the equivalent of gzip -d.

The zcat filter decompresses a gzipped file to stdout, as possible input to a pipe or redirection. This is, in effect, a cat command that works on compressed files (including files processed with the older compress utility). The zcat command is equivalent to gzip -dc.

Caution On some commercial UNIX systems, zcat is a synonym for uncompress -c, and will not work on gzipped files.

See also Example 7-6.

An alternate compression utility, usually more efficient than gzip, especially on large files. The corresponding decompression command is bunzip2.
compress, uncompress
This is an older, proprietary compression utility found in commercial UNIX distributions. The more efficient gzip has largely replaced it. Linux distributions generally include a compress workalike for compatibility, although gunzip can unarchive files treated with compress.
Tip The znew command transforms compressed files into gzipped ones.
Yet another compression utility, a filter that works only on sorted ASCII word lists. It uses the standard invocation syntax for a filter, sq < input-file > output-file. Fast, but not nearly as efficient as gzip. The corresponding uncompression filter is unsq, invoked like sq.
Tip The output of sq may be piped to gzip for further compression.
zip, unzip
Cross-platform file archiving and compression utility compatible with DOS PKZIP. "Zipped" archives seem to be a more acceptable medium of exchange on the Internet than "tarballs".

File Information
A utility for identifying file types. The command file file-name will return a file specification for file-name, such as ascii text or data. It references the magic numbers found in /usr/share/magic, /etc/magic, or /usr/lib/magic, depending on the Linux/UNIX distribution.

The -f option causes file to run in batch mode, to read from a designated file a list of filenames to analyze. The -z option, when used on a compressed target file, forces an attempt to analyze the uncompressed file type.

bash$ file test.tar.gz
test.tar.gz: gzip compressed data, deflated, last modified: Sun Sep 16 13:34:51 2001, os: Unix

bash file -z test.tar.gz
test.tar.gz: GNU tar archive (gzip compressed data, deflated, last modified: Sun Sep 16 13:34:51 2001, os: Unix)

Example 12-23. stripping comments from C program files
# Strips out the comments (/* COMMENT */) in a C program.


if [ $# -eq "$E_NOARGS" ]
  echo "Usage: `basename $0` C-program-file" >&2 # Error message to stderr.
  exit $E_ARGERROR

# Test for correct file type.
type=`eval file $1 | awk '{ print $2, $3, $4, $5 }'`
# "file $1" echoes file type...
# then awk removes the first field of this, the filename...
# then the result is fed into the variable "type".
correct_type="ASCII C program text"

if [ "$type" != "$correct_type" ]
  echo "This script works on C program files only."

# Rather cryptic sed script:
sed '
' $1
# Easy to understand if you take several hours to learn sed fundamentals.

# Need to add one more line to the sed script to deal with
# case where line of code has a comment following it on same line.
# This is left as a non-trivial exercise for the reader.

# Also, the above code deletes lines with a "*/" or "/*",
# not a desirable result.

exit 0

# ----------------------------------------------------------------
# Code below this line will not execute because of 'exit 0' above.

# Stephane Chazelas suggests the following alternative:

usage() {
  echo "Usage: `basename $0` C-program-file" >&2
  exit 1

WEIRD=`echo -n -e '\377'`   # or WEIRD=$'\377'
[[ $# -eq 1 ]] || usage
case `file "$1"` in
  *"C program text"*) sed -e "s%/\*%${WEIRD}%g;s%\*/%${WEIRD}%g" "$1" \
     | tr '\377\n' '\n\377' \
     | sed -ne 'p;n' \
     | tr -d '\n' | tr '\377' '\n';;
  *) usage;;

# This is still fooled by things like:
# printf("/*");
# or
# /*  /* buggy embedded comment */
# To handle all special cases (comments in strings, comments in string
# where there is a \", \\" ...) the only way is to write a C parser
# (lex or yacc perhaps?).

exit 0
which command-xxx gives the full path to "command-xxx". This is useful for finding out whether a particular command or utility is installed on the system.

$bash which rm


Similar to which, above, whereis command-xxx gives the full path to "command-xxx", but also to its manpage.

$bash whereis rm

rm: /bin/rm /usr/share/man/man1/rm.1.bz2

whatis filexxx looks up "filexxx" in the whatis database. This is useful for identifying system commands and important configuration files. Consider it a simplified man command.

$bash whatis whatis

whatis               (1)  - search the whatis database for complete words

Example 12-24. Exploring /usr/X11R6/bin

# What are all those mysterious binaries in /usr/X11R6/bin?

# Try also "/bin", "/usr/bin", "/usr/local/bin", etc.

for file in $DIRECTORY/*
  whatis `basename $file`   # Echoes info about the binary.

exit 0
# You may wish to redirect output of this script, like so:
# ./ >>whatis.db
# or view it a page at a time on stdout,
# ./ | less

See also Example 10-3.

Show a detailed directory listing. The effect is similar to ls -l.

This is one of the GNU fileutils.

bash$ vdir
total 10
 -rw-r--r--    1 bozo  bozo      4034 Jul 18 22:04 data1.xrolo
 -rw-r--r--    1 bozo  bozo      4602 May 25 13:58 data1.xrolo.bak
 -rw-r--r--    1 bozo  bozo       877 Dec 17  2000 employment.xrolo

bash ls -l
total 10
 -rw-r--r--    1 bozo  bozo      4034 Jul 18 22:04 data1.xrolo
 -rw-r--r--    1 bozo  bozo      4602 May 25 13:58 data1.xrolo.bak
 -rw-r--r--    1 bozo  bozo       877 Dec 17  2000 employment.xrolo

Securely erase a file by overwriting it multiple times with random bit patterns before deleting it. This command has the same effect as Example 12-31, but does it in a more thorough and elegant manner.

This is one of the GNU fileutils.

Caution Using shred on a file may not prevent recovery of some or all of its contents using advanced forensic technology.
locate, slocate
The locate command searches for files using a database stored for just that purpose. The slocate command is the secure version of locate (which may be aliased to slocate).

$bash locate hickson


Use the strings command to find printable strings in a binary or data file. It will list sequences of printable characters found in the target file. This might be handy for a quick 'n dirty examination of a core dump or for looking at an unknown graphic image file (strings image-file | more might show something like JFIF, which would identify the file as a jpeg graphic). In a script, you would probably parse the output of strings with grep or sed. See Example 10-7 and Example 10-8.

Strips the path information from a file name, printing only the file name. The construction basename $0 lets the script know its name, that is, the name it was invoked by. This can be used for "usage" messages if, for example a script is called with missing arguments:
echo "Usage: `basename $0` arg1 arg2 ... argn"

Strips the basename from a filename, printing only the path information.
Note basename and dirname can operate on any arbitrary string. The argument does not need to refer to an existing file, or even be a filename for that matter (see Example A-6).
Example 12-25. basename and dirname


echo "Basename of /home/bozo/daily-journal.txt = `basename $a`"
echo "Dirname of /home/bozo/daily-journal.txt = `dirname $a`"
echo "My own home is `basename ~/`."         # Also works with just ~.
echo "The home of my home is `dirname ~/`."  # Also works with just ~.

exit 0
Utility for splitting a file into smaller chunks. Usually used for splitting up large files in order to back them up on floppies or preparatory to e-mailing or uploading them.
sum, cksum, md5sum
These are utilities for generating checksums. A checksum is a number mathematically calculated from the contents of a file, for the purpose of checking its integrity. A script might refer to a list of checksums for security purposes, such as ensuring that the contents of key system files have not been altered or corrupted. The md5sum command is the most appropriate of these in security applications.

Encoding and Encryption
This utility encodes binary files into ASCII characters, making them suitable for transmission in the body of an e-mail message or in a newsgroup posting.
This reverses the encoding, decoding uuencoded files back into the original binaries.

Example 12-26. uudecoding encoded files


lines=35        # Allow 35 lines for the header (very generous).

for File in *   # Test all the files in the current working directory...
  search1=`head -$lines $File | grep begin | wc -w`
  search2=`tail -$lines $File | grep end | wc -w`
  #  Uuencoded files have a "begin" near the beginning,
  #+ and an "end" near the end.
  if [ "$search1" -gt 0 ]
    if [ "$search2" -gt 0 ]
      echo "uudecoding - $File -"
      uudecode $File

#  Note that running this script upon itself fools it
#+ into thinking it is a uuencoded file,
#+ because it contains both "begin" and "end".

# Exercise:
# Modify this script to check for a newsgroup header.

exit 0
Tip The fold -s command may be useful (possibly in a pipe) to process long uudecoded text messages downloaded from Usenet newsgroups.
At one time, this was the standard UNIX file encryption utility. [1] Politically motivated government regulations prohibiting the export of encryption software resulted in the disappearance of crypt from much of the UNIX world, and it is still missing from most Linux distributions. Fortunately, programmers have come up with a number of decent alternatives to it, among them the author's very own cruft (see Example A-4).

Utility for building and compiling binary packages. This can also be used for any set of operations that is triggered by incremental changes in source files.

The make command checks a Makefile, a list of file dependencies and operations to be carried out.

Special purpose file copying command, similar to cp, but capable of setting permissions and attributes of the copied files. This command seems tailormade for installing software packages, and as such it shows up frequently in Makefiles (in the make install : section). It could likewise find use in installation scripts.
more, less
Pagers that display a text file or stream to stdout, one screenful at a time. These may be used to filter the output of a script.


[1] This is a symmetric block cipher, used to encrypt files on a single system or local network, as opposed to the "public key" cipher class, of which pgp is a well-known example.


FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusivly for research and educational purposes.   If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner. 

ABUSE: IPs or network segments from which we detect a stream of probes might be blocked for no less then 90 days. Multiple types of probes increase this period.  


Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers :   Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism  : The Iron Law of Oligarchy : Libertarian Philosophy


War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda  : SE quotes : Language Design and Programming Quotes : Random IT-related quotesSomerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose BierceBernard Shaw : Mark Twain Quotes


Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 :  Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method  : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law


Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds  : Larry Wall  : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOSProgramming Languages History : PL/1 : Simula 67 : C : History of GCC developmentScripting Languages : Perl history   : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history

Classic books:

The Peter Principle : Parkinson Law : 1984 : The Mythical Man-MonthHow to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Hater’s Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite

Most popular humor pages:

Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor

The Last but not Least

Copyright © 1996-2016 by Dr. Nikolai Bezroukov. was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.

The site uses AdSense so you need to be aware of Google privacy policy. You you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...

You can use PayPal to make a contribution, supporting development of this site and speed up access. In case is down you can use the at


The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last updated: July 18, 2017