ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


FreeBSD Basics: Backing Up Files with Tar

by Dru Lavigne
05/23/2002

In my last article, I introduced the concept of archivers; today I would like to demonstrate the usage of the tar archiver.

Since we'll be backing up and restoring files, I recommend that you create a test user account to practice with until you are comfortable using the tar utility. On my system, I became the superuser and used the adduser command to create a test account named test:

su
Password:
adduser

I then followed the prompts to make a user called test.

I then wanted to quickly add a lot of subdirectories and files to this test user's home directory. Since I had the ports collection installed on my system, I copied over one of its subdirectories:

cp -R /usr/ports/www ~test/

I then changed the ownership of these files so they belonged to the test user:

chown -R test ~test/www

I now had a lot of files in a test directory to practice with. I then logged in as the test user and checked out the contents of my home directory:

ls -l
total 16
drwxr-xr-x  375 test  wheel  9728 May 11 09:53 www/

du -h | tail -2
 28M	./www
 28M	.

It looks like I have 28M worth of data to work with in my test directory.

In theory, tar can be as easy to use as this command:

tar c .

where the c means "create an archive" and the "." means "of the current directory." However, if you try this, you will probably get the same error message I did:

tar c .
tar: can't open /dev/sa0 : Permission denied

Aha, you may think; I'll try as the superuser:

su
Password:
tar c .
tar: can't open /dev/sa0 : Device not configured

Remember last week when I talked about tape devices? By default, the tar utility assumes that you want to back up to your first SCSI tape drive (/dev/sa0), which is great if you happen to have one attached to your PC. If you don't, all is not lost. In Unix, a tape device is simply a file, so it is very easy to tell tar to create a backup to another file, whether that file is a different type of tape device, a floppy, another hard drive, another PC on the network, or an actual file somewhere on your system.

I'll start simple, by telling tar to create (c) a backup of my current directory (.) to a file I'll call backup.tar. Since this is not the default backup location, I'll use the f switch to indicate the name of the file I'd like the backup sent to:

tar cf backup.tar .

When I ran this command, my prompt disappeared for a moment and I heard my hard drive churning away. When my prompt reappeared, I had a new file in my home directory named backup.tar. If you don't want to just wait in silent anticipation, use the v switch and tar will tell you what it is doing while it is doing it. I'll remove that backup and try again with the v switch:

rm backup.tar
tar cvf backup.tar .

You'll understand the difference when you try this for yourself. Now, let's see what type of file tar created:

file backup.tar
backup.tar: GNU tar archive

This is not an ASCII text file, so I won't be able to view its contents with a pager or an editor. However, tar understands this file and I can ask it to read it for me using the t switch:

tar t backup.tar
tar: can't open /dev/sa0 : Device not configured

Oops, I forgot that tar expects to read that SCSI tape device unless I tell it to look somewhere else. I'll try again, this time including the f switch:

tar tf backup.tar

This time, a whole bunch of files and directories fly by very quickly; it looks like I've successfully made a backup. If I wanted to verify the file list, I'd send the output to a pager so I could read it one page at a time:

tar tf backup.tar |more
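
If you would like to try the create-and-list cycle without touching real data, here is a minimal sketch. It assumes a scratch directory made with mktemp; the www/apache2 directory and its Makefile are made-up stand-ins for the test user's files. Note that this sketch writes the archive outside the directory being backed up, so tar is never asked to archive its own output file.

```shell
# Scratch directory standing in for the test user's home; all names are made up.
work=$(mktemp -d)
mkdir -p "$work/home/www/apache2"
echo "demo" > "$work/home/www/apache2/Makefile"

# c = create, v = verbose, f = write to the named file instead of the default device.
# The verbose output is saved to a log file rather than scrolling past.
(cd "$work/home" && tar cvf "$work/backup.tar" . > "$work/create.log" 2>&1)

# t = list the contents; f again names the archive to read.
listing=$(tar tf "$work/backup.tar")
echo "$listing"
```

The listing should include the Makefile beneath www/apache2, and create.log holds whatever the v switch reported while the archive was being built.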

It is also possible to create a compressed backup by including either of the z or Z switches when using tar. Let's take a look at the size of that backup we just created:

ls -l backup.tar
-rw-r--r--  1 root  wheel  25722880 May 11 16:41 backup.tar

I'll now remove that backup, tell tar to create a compressed backup using the gzip utility, then view the difference in size and type:

rm backup.tar
tar cvzf backup.tar.gz .

ls -l backup.tar.gz
-rw-r--r--  1 root  wheel  5899840 May 11 16:45 backup.tar.gz

file backup.tar.gz
backup.tar.gz: gzip compressed data, deflated, last modified: Sat May 11 16:45:47 2002, os: Unix

And I'll repeat the above, except this time tell tar to compress using the compress utility instead:

rm backup.tar.gz
tar cvZf backup.tar.Z .

ls -l backup.tar.Z
-rw-r--r--  1 root  wheel  9444468 May 11 16:50 backup.tar.Z

file backup.tar.Z
backup.tar.Z:   compress'd data 16 bits

To list the files in a compressed archive, don't forget to include the z (or Z) switch. For example, if I try this:

tar tf backup.tar.Z

I'll get this strange error:

tar: Hmm, this doesn't look like a tar archive.
tar: Skipping to next file header...
cZ\333\300\021\207\335v\333J\235\212\335H\335<\270\377\203\025\323}\333\220\016
\215\335h*d\335?\320\223\333\225\335\206\333\224\335\234\020\007\334\324]m\312D
s.\017\214\256\374\251H\320\016\252\031\332YE\316\304\360\301\003\242\362\301\2
35\327\241\260\261\030\377\t3\256S\320H\t\327\270\204\302\246\335\030\207/\242(
\251
tar: Skipping to next file header...
tar: only read 3188 bytes from archive backup.tar.Z

Since this file was created with the Z switch, I have to remember to include the Z switch whenever I work with this file.

tar tZf backup.tar.Z

The above command will give me the listing of the contents. You'll note that when I created my backups, I gave the archives I created with the z switch the extension of tar.gz and the files I created with the Z switch the extension of tar.Z. I can call my archive whatever I want; I just used that convention to remind me that I'm dealing with a tar archive file and what type of compression I used when I created that file. It is always a good idea to use the file utility on an archive to verify whether or not it has been compressed, and if so, whether it was compressed with the z or the Z switch.
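
To see the effect of compression on a small scale, here is a rough sketch that builds the same archive with and without the z switch and compares the sizes. The data directory and sample.txt are made up for illustration, and the file is deliberately repetitive so that it compresses well; your own data will shrink by a different amount.

```shell
work=$(mktemp -d)
mkdir "$work/data"

# Highly repetitive text compresses well, which makes the difference obvious.
i=0
while [ "$i" -lt 2000 ]; do
    echo "the same line of text, repeated many times"
    i=$((i + 1))
done > "$work/data/sample.txt"

# Same contents twice: once as a plain archive, once gzipped with the z switch.
(cd "$work" && tar cf plain.tar data && tar czf packed.tar.gz data)

# wc -c counts bytes; tr strips the padding some systems add to the number.
plain_size=$(wc -c < "$work/plain.tar" | tr -d ' ')
packed_size=$(wc -c < "$work/packed.tar.gz" | tr -d ' ')
echo "uncompressed: $plain_size bytes, gzipped: $packed_size bytes"

# The z switch is given again when listing the compressed archive.
listing=$(tar tzf "$work/packed.tar.gz")
echo "$listing"
```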

Also, when listing the contents of an archive, you can include the v, or verbose, switch. Here is an example of the difference in the output, first without the v switch:

tar tzf backup.tar.gz | tail -2
www/mod_tsunami/pkg-plist
www/Makefile

Then with the v switch:

tar tzvf backup.tar.gz | tail -2
-rw-r--r-- test/wheel      116 May 11 09:53 2002 www/mod_tsunami/pkg-plist
-rw-r--r-- test/wheel     9713 May 11 09:53 2002 www/Makefile

Notice that long listing of the files in the backup and compare that to a long listing of the backup file itself:

ls -l backup.tar.gz
-rw-r--r--  1 root  wheel  15098016 May 11 17:31 backup.tar.gz

The backup was created by the superuser at 17:31, yet the files in the backup still belong to the test user, and their timestamps show that they were last modified at 9:53 (when I set up the test directory). This is what I was talking about in the last article when I said that archivers preserve the permissions and ownership of the files that are backed up.
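
Here is a small sketch of the same observation, assuming a scratch directory and a made-up www/Makefile. I give the file restrictive permissions and an old modification time, then check that the verbose listing reports the file's own details rather than those of the freshly created archive.

```shell
work=$(mktemp -d)
mkdir "$work/www"
echo "demo" > "$work/www/Makefile"
chmod 600 "$work/www/Makefile"

# Pretend the file was last modified on May 11, 2002 at 09:53.
touch -t 200205110953 "$work/www/Makefile"

(cd "$work" && tar czf backup.tar.gz www)

# The archive is brand new, but the verbose listing still reports the
# member's own mode, owner, and modification time.
long_listing=$(tar tzvf "$work/backup.tar.gz")
echo "$long_listing"
```

The exact column layout of the verbose listing varies between tar implementations, but the permission string and the 2002 date should both appear on the Makefile's line.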

It is also possible to tell tar to back up multiple directories:

tar cvzf partial.tar.gz www/apache2 www/chimera www/zope

The above command will create (c) a gzipped (z) file (f) named partial.tar.gz by archiving the contents of the directories apache2, chimera, and zope. Remember, tell tar which file you want to send the backup to first; everything after that name will be what tar will back up for you.
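
As a quick check that only the named directories end up in the archive, here is a sketch using four made-up port directories, of which three are backed up. The mozilla directory is my own addition so there is something to leave out.

```shell
work=$(mktemp -d)
mkdir -p "$work/www/apache2" "$work/www/chimera" "$work/www/zope" "$work/www/mozilla"
for d in apache2 chimera zope mozilla; do
    echo "port: $d" > "$work/www/$d/Makefile"
done

# The first name after the switches is the archive; every name after it is backed up.
(cd "$work" && tar czf partial.tar.gz www/apache2 www/chimera www/zope)

listing=$(tar tzf "$work/partial.tar.gz")
echo "$listing"
```

The listing should show the three named directories and nothing from www/mozilla.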

If you are still saving up for a tape device and want to do a poor man's backup using floppies, this command will back up everything in your current directory:

tar cvMf /dev/fd0 .

Don't forget to put a floppy disk in your floppy drive, and make sure you are in the directory you want to back up. Note that instead of using the f switch to give the name of a backup, I used it to specify the name of my floppy device (/dev/fd0).

Why did I also include the M, or multi-volume, switch? Since I was backing up a directory with 28M of data, I knew it wouldn't all fit on one floppy. Since I used the M switch, when tar had filled up the first floppy, it displayed this message:

Prepare volume #2 for /dev/fd0 and hit return:

It is a good idea to always include the M switch when backing up to floppies, just in case the data won't fit on one floppy. Also, if you want to save on the number of floppies you'll need, include either of the z or the Z switches so that the data will also be compressed as it is backed up.

An interesting note about backing up to floppies: you don't mount the floppy first, and once it contains the backup, you won't be able to mount the floppy. Yet the tar utility understands the data on the floppy, so this command will work:

tar tvf /dev/fd0

You should run this command after you create your backup, so you can verify the files in the backup.

However, if I try this command:

mount /dev/fd0 /floppy

I'll receive this error:

mount: /dev/fd0 on /floppy: incorrect super block

One last thing about creating archives with tar: tar was designed to back up everything in the specified directory. This means that every single file and subdirectory that exists beneath the specified directory will be backed up. It is possible to specify which files you don't want backed up using the X switch.

Let's say I want to back up everything in the www directory except for the apache2 and zope subdirectories. In order to use the X switch, I have to create a file containing the names of the files I wish to exclude. I've found that if you try to create this file using a text editor, it doesn't always work. However, if you create the file using echo, it does. So I'll make a file called exclude:

echo apache2 > exclude
echo zope >> exclude

Here, I used the echo command to redirect (>) the word apache2 to a new file called exclude. I then asked it to append (>>) the word zope to that same file. If I had forgotten to use two >'s, I would have overwritten the word apache2 with the word zope.

Now that I have a file to use with the X switch, I can make that backup:

tar cvfX backup.tar exclude www

This is the first backup I've demonstrated where the order of the switches is important. I need to tell tar that the f switch belongs with the word backup.tar and the X switch belongs with the word exclude. So if I decide to place the f switch before the X switch, I need to have the word backup.tar before the word exclude. This command will also work as the right switch is still associated with the right word:

tar cvXf exclude backup.tar www

But this command would not work the way I want it to:

tar cvfX exclude backup.tar www
tar: can't open backup.tar : No such file or directory

Here you'll note that the X switch told tar to look for a file called backup.tar to tell it which files to exclude, which isn't what I meant to tell tar.

Let's return to the command that did work. To test that it didn't back up the subdirectory called apache2, I used grep to search through tar's listing:

tar tf backup.tar | grep apache2

Since I just received my prompt back, I know my exclude file worked. It is interesting to note that since apache2 was really a subdirectory of www, all of the files in the apache2 subdirectory were also excluded from the backup. I then tested to see if the zope subdirectory was also excluded in the backup:

tar tf backup.tar | grep zope
www/zope-zpt/
www/zope-zpt/Makefile
www/zope-zpt/distinfo
www/zope-zpt/pkg-comment
<output snipped>

This time I got some information back, as there were other subdirectories that started with the term "zope," but the subdirectory that was just called zope was excluded from the backup.
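
The whole exclude experiment can be reproduced in a scratch directory. This is only a sketch: the four directories are made up to mirror the names above, and I have written each switch separately with a leading dash (-f backup.tar -X exclude) so that every switch sits right next to its own argument and there is no ordering to get wrong.

```shell
work=$(mktemp -d)
mkdir -p "$work/www/apache2" "$work/www/zope" "$work/www/zope-zpt" "$work/www/chimera"
for d in apache2 zope zope-zpt chimera; do
    echo "port: $d" > "$work/www/$d/Makefile"
done

# One name per line, built with echo: > creates the file, >> appends to it.
echo apache2 > "$work/exclude"
echo zope >> "$work/exclude"

# -f names the archive and -X names the exclude file.
(cd "$work" && tar -c -f backup.tar -X exclude www)

listing=$(tar tf "$work/backup.tar")
echo "$listing"
```

The listing should contain www/chimera and www/zope-zpt but nothing from www/apache2 or www/zope, which matches the behavior described above: the name zope excludes the directory called zope, not every directory whose name merely begins with it.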

Now that we know how to make backups, let's see how we can restore data from a backup. Remember from last week the difference between a relative and an absolute pathname, as this has an impact when you are restoring data. Relative pathnames are considered a good thing in a backup. Fortunately, the tar utility that comes with your FreeBSD system strips the leading slash, so it will always use a relative pathname -- unless you specifically override this default by using the P switch.

It's always a good idea to do a listing of the data in an archive before you try to restore it, especially if you receive a tar archive from someone else. You want to make sure that the listed files do not begin with "/", as that indicates an absolute pathname. I'll check the first few lines in my backup:

tar tf backup.tar | head
www/
www/mod_trigger/
www/mod_trigger/Makefile
www/mod_trigger/distinfo
www/mod_trigger/pkg-comment
www/mod_trigger/pkg-descr
www/mod_trigger/pkg-plist
www/Mosaic/
www/Mosaic/files/
www/Mosaic/files/patch-ai

None of these files begin with a "/", so I'll be able to restore this backup anywhere I would like. I'll practice a restore by making a directory I'll call testing, and then I'll restore the entire backup to that directory:

mkdir testing
cd testing
tar xvf ~test/backup.tar 

You'll note that I cd'ed into the directory to contain the restored files, then told tar to restore or extract the entire backup.tar file using the x switch. Once the restore was complete, I did a listing of the testing directory:

ls
www

I then did a listing of that new www directory and saw that I had successfully restored the entire www directory structure, including all of its subdirectories and files.
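
The check-then-restore habit can be sketched as a few lines of shell. The assumptions here are a scratch directory and a made-up www/chimera/Makefile; the grep pattern ^/ simply looks for any listed name that begins with a slash.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/www/chimera" "$work/testing"
echo "all:" > "$work/src/www/chimera/Makefile"
(cd "$work/src" && tar cf "$work/backup.tar" www)

# Refuse to extract if any name in the listing starts with "/".
if tar tf "$work/backup.tar" | grep -q '^/'; then
    echo "archive contains absolute pathnames; not extracting" >&2
else
    # x = extract; the files land beneath the current directory.
    (cd "$work/testing" && tar xf "$work/backup.tar")
fi

restored=$(cat "$work/testing/www/chimera/Makefile")
echo "$restored"
```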

It's also possible to just restore a specific file from the archive. Let's say I only need to restore one file from the www/chimera directory. First, I'll need to know the name of the file, so I'll get a listing from tar and use grep to search for the files in the chimera subdirectory:

tar tf backup.tar | grep chimera
www/chimera/
www/chimera/files/
www/chimera/files/patch-aa
www/chimera/scripts/
www/chimera/scripts/configure
www/chimera/pkg-comment
www/chimera/Makefile
<snip>

I'd like to just restore the file www/chimera/Makefile, and I'd like to restore it to the home directory of the user named genisis. First, I'll cd to the directory to which I want that file restored, and then I'll tell tar just to restore that one file:

cd ~genisis
tar xvf ~test/backup.tar www/chimera/Makefile

You'll note some interesting things if you try this at home. When I did a listing of genisis' home directory, I didn't see a file called Makefile, but I did see a directory called www. This directory contained a subdirectory called chimera, which contained a file called Makefile. Remember, when you make an archive, you are including a directory structure, and when you restore from an archive, you recreate that directory structure.
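
Here is a minimal sketch of that behavior, again with made-up files in a scratch directory. Only one member is requested, and find is used afterward to show exactly what was restored and where.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/www/chimera/files" "$work/dest"
echo "all:" > "$work/src/www/chimera/Makefile"
echo "patch" > "$work/src/www/chimera/files/patch-aa"
(cd "$work/src" && tar cf "$work/backup.tar" www)

# Name the member exactly as the listing prints it.
(cd "$work/dest" && tar xf "$work/backup.tar" www/chimera/Makefile)

# Only the requested file comes back, inside its original directory path.
restored=$(cd "$work/dest" && find . -type f)
echo "$restored"
```

The only file found should be ./www/chimera/Makefile; the patch-aa file stays in the archive.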

You'll also note that the original ownership, permissions, and file modification time were restored with that file:

ls -l ~genisis/www/chimera/Makefile
-rw-r--r--  1 test  wheel  406 May 11 09:52 www/chimera/Makefile
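
You can confirm this for yourself with a sketch like the following, which uses made-up files in a scratch directory. Only the superuser can give a file away to another owner, so this sketch compares just the permissions and the modification time, which an ordinary user can check as well.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/www/chimera" "$work/dest"
orig="$work/src/www/chimera/Makefile"
copy="$work/dest/www/chimera/Makefile"

echo "all:" > "$orig"
chmod 600 "$orig"
# Give the original an old modification time so a fresh copy would stand out.
touch -t 200205110952 "$orig"

(cd "$work/src" && tar cf "$work/backup.tar" www)
(cd "$work/dest" && tar xf "$work/backup.tar" www/chimera/Makefile)

# The first ten characters of ls -l are the file type and permission bits.
orig_mode=$(ls -l "$orig" | cut -c1-10)
new_mode=$(ls -l "$copy" | cut -c1-10)
echo "original: $orig_mode  restored: $new_mode"
ls -l "$copy"
```

The restored file should show the same permission string as the original and the May 2002 date, not the time at which the restore was run.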

That should get you started with using the tar utility. In next week's article, I'll continue with some of the interesting options that can be used with tar, and then I'll introduce the cpio archiver.

Dru Lavigne is a network and systems administrator, IT instructor, author and international speaker. She has over a decade of experience administering and teaching Netware, Microsoft, Cisco, Checkpoint, SCO, Solaris, Linux, and BSD systems. A prolific author, she pens the popular FreeBSD Basics column for O'Reilly and is author of BSD Hacks and The Best of FreeBSD Basics.




Copyright © 2009 O'Reilly Media, Inc.