ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


Essential System Administration, 3rd Edition

Top Five Open Source Packages for System Administrators

by Æleen Frisch, author of Essential System Administration, 3rd Edition
09/24/2002

Every system administrator can make their job easier by using the many excellent open source utilities and packages available. This is true whether or not the operating systems on the computers you administer are themselves open source. This is the first installment of a five-part series. In each installment I'll discuss one package from my current list of the most useful and widely applicable administrative tools. This week we'll start the countdown with number five. Check back here in the coming weeks for the next four installments.

#5: Amanda

Amanda stands for Advanced Maryland Automated Network Disk Archiver. It was developed at the University of Maryland. James da Silva was the initial author. Amanda is a network-based enterprise backup utility that includes features previously available only in expensive commercial packages. Amanda is not the equal of the best commercial backup software, but it can be useful for a variety of computing environments.

Amanda takes advantage of native backup software, including dump, GNU tar, and Samba's smbtar utility (for backing up Windows clients). Amanda provides the infrastructure to deploy these tools effectively across a network of systems that need to be backed up. It also provides the record keeping and other information-management capabilities, which are necessary for the package to be easy to use.

As you'd expect, Amanda supports common tape drives and other backup devices (including stackers and jukeboxes). It can take advantage of hardware compression features or compress archives prior to writing them to other media when the former is not available. Software compression may be performed either on the client system, where the data to be backed up lives, or on the backup server.

Amanda can also perform full and incremental backups. In fact, Amanda will automatically select an incremental level based on its specified configuration parameters (more on this later).

Amanda has other nice features:

How Amanda Works

Amanda allows backups from a network of clients to be sent to a single, designated backup server. It initiates the local backup operations according to its scheduling and other parameters. The resulting archives are then saved to tape or other media. Amanda can also use holding disks as intermediate storage for backup archives in order to maximize tape-write performance. Holding disks also ensure that data is backed up in spite of tape errors, because the backup set can be written to backup media at a later time.

Amanda uses a combination of full and incremental backups to save all of the data for which it is responsible, using the smallest possible daily backup set. It uses the following information to do so. (For simplicity, we'll assume that one backup run is performed every day):

Amanda's overall strategy is twofold: to complete a full backup of the data within each cycle and to be sure that all changed data has been backed up between full dumps. The traditional method of doing this is to perform the full backup, say, once a week, followed by incremental backups on the other days of the week. Amanda operates differently.

During each run (at night), Amanda performs a full backup of part of the data: specifically, the fraction required to back up the entire data set in the course of one complete backup cycle. For example, if the cycle is seven days long, then one-seventh of the data is fully backed up each day. We could call this a "partial" full backup: a full backup of a specific part of the total data. In addition, Amanda performs an incremental backup of all data in the other parts that has changed since each part's own partial full backup.

The following figure will make this clearer:

This figure illustrates an example Amanda backup cycle lasting four days in which 15 percent of the data changes from day to day. The box at the top of the figure stands for the complete set of data for which Amanda is responsible; we have divided it into four segments, each representing the part of the data that receives its full backup on the same night.

The contents of each nightly backup are shown at the bottom of the figure. The first three days represent a startup period. On the first night, the first quarter of the data is fully backed up (the purple block). On the second night, the second quarter (blue) is fully backed up, and the 15 percent of the data from the previous night that has changed during day two is also saved (purple).

On the third night, the third quarter of the total data is fully backed up (red), as well as:

By day four, the normal schedule is in force. Each night, one quarter of the total data is backed up in full, and incrementals are performed for each of the other quarters as appropriate to the time that has passed since their last full backup.

Brain Teaser:
What is the average percentage of the yellow section that is backed up in each day of the four-day cycle, assuming the 15 percent change is uniformly distributed?

This example is still a simplification in that it uses first-level incremental backups. In actual practice, Amanda uses multiple levels of incremental backups to minimize backup storage requirements.
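
The steady-state schedule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Amanda's actual planner: it uses first-level incrementals only, as in the simplified example, and it assumes that each day's 15 percent change falls on a segment independently of earlier days, so some changes overlap. The function name and its parameters are hypothetical.

```python
def nightly_backup_fractions(cycle_days=4, daily_change=0.15):
    """Fraction of each segment written on one steady-state night.

    The list is indexed by the number of days since that segment's
    last full backup (0 means the segment gets its full tonight).
    """
    fractions = []
    for days_since_full in range(cycle_days):
        if days_since_full == 0:
            # Tonight's segment receives its "partial" full backup.
            fractions.append(1.0)
        else:
            # First-level incremental: everything changed since the full.
            # Assumption: daily changes are independent, so they may overlap.
            unchanged = (1.0 - daily_change) ** days_since_full
            fractions.append(1.0 - unchanged)
    return fractions


fractions = nightly_backup_fractions()
# Every segment is the same size, so the nightly total is the plain mean.
nightly_total = sum(fractions) / len(fractions)

for age, fraction in enumerate(fractions):
    print(f"{age} day(s) since full: {fraction:.1%} of the segment")
print(f"nightly backup size: {nightly_total:.1%} of all data")
```

If overlapping changes are counted differently (for example, 15 percent of entirely new data each day), the incremental figures grow somewhat faster, so treat the output as a rough guide to how the nightly backup set compares with a traditional full backup.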

In order to restore files from an Amanda backup, you may need one complete cycle of media.

Configuring and Using Amanda

Configuring and using Amanda is not difficult. Client configuration is particularly simple:

  1. Create an Amanda user and group. Ensure that the data to be backed up is readable by this user and group. If you're using dump as the backup software, make sure that /etc/dumpdates is writeable by them as well.

  2. Install the client software.

  3. Add Amanda-related entries to /etc/services and /etc/inetd.conf. These define the Amanda service (UDP port 10080), whose requests are handled by the amandad daemon.

  4. Select and configure an authentication scheme. The default is to use an .amandahosts file; it works similarly to an .rhosts file, but applies only to Amanda and so it carries significantly less associated risk.
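
After completing the client steps, one quick sanity check is to confirm that the services database really maps the Amanda service to UDP port 10080. The sketch below uses Python's standard socket.getservbyname call; the helper name amanda_service_registered and its lookup parameter are hypothetical, added only so the check can be exercised without touching the real /etc/services.

```python
import socket

AMANDA_UDP_PORT = 10080  # port number given in the text


def amanda_service_registered(lookup=socket.getservbyname):
    """Return True if the services database maps amanda/udp to port 10080."""
    try:
        return lookup("amanda", "udp") == AMANDA_UDP_PORT
    except OSError:
        # getservbyname raises OSError when the service name is unknown.
        return False


if __name__ == "__main__":
    status = "present" if amanda_service_registered() else "missing or wrong"
    print(f"amanda/udp entry: {status}")
```

This checks only the /etc/services entry; it does not verify that inetd is actually listening or that amandad is installed.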

There's a bit more involved in setting up the Amanda server system:

  1. Install the software, and create the Amanda user and group.

  2. Add Amanda-related entries to /etc/services and /etc/inetd.conf. The Amanda service is again defined, as well as the amandaidx and amidxtape services (corresponding to the amindexd and amidxtaped daemons, respectively).

  3. Set up the configuration files.

    The main file is amanda.conf. It specifies the following:

    Here is an example dump type definition:

    define dumptype normal {
        comment "Ordinary backup"
        holdingdisk yes
        index yes
        program "DUMP"
        priority medium
        starttime 2000
    }

    This dump type uses a holding disk, creates an index of the backup set contents for interactive restoration, and uses the dump program to perform the actual backup. It runs at medium priority compared to other backups, and its starttime setting of 2000 keeps it from starting before 8:00 P.M. Amanda provides several predefined dump types in the example amanda.conf file, which can be used or customized as desired.

    The actual data to be backed up is defined in the disklist configuration file, using the generic dump types defined in amanda.conf. Here are two sample entries:

    # host file system dumptype spindle
    hamlet /chem stable -1
    ophelia /home normal -1

    The columns in this file hold the hostname, the file system (specified by a device name within /dev, a full special file name, or a mount point), the dump type, and a spindle parameter. The latter controls which backups can be done at the same time on a host. A value of -1 says to ignore this parameter. Other values identify a spindle (physical disk) within a host; Amanda will not run two backups with the same spindle value on the same host at the same time.

  4. Prepare media for use with Amanda using the amlabel utility.

  5. Set up a cron job to run the amdump command on whatever schedule is appropriate for your site.
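
The four-column disklist format shown in step 3 is simple enough to read with a short script, which can be handy for auditing which file systems share a spindle value on each host. The following is a minimal sketch, not part of Amanda: the function names are hypothetical, it handles only the plain four-column form from the sample entries, and it assumes that an omitted spindle field means -1.

```python
from collections import defaultdict

# The sample entries from the text.
SAMPLE_DISKLIST = """\
# host file system dumptype spindle
hamlet /chem stable -1
ophelia /home normal -1
"""


def parse_disklist(text):
    """Parse simple four-column disklist lines into dictionaries."""
    entries = []
    for raw_line in text.splitlines():
        # Drop comments and surrounding whitespace, then skip empty lines.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        host, disk, dumptype = fields[:3]
        # Assumption: a missing spindle column is treated as -1.
        spindle = int(fields[3]) if len(fields) > 3 else -1
        entries.append(
            {"host": host, "disk": disk, "dumptype": dumptype, "spindle": spindle}
        )
    return entries


def group_by_spindle(entries):
    """Map each (host, spindle) pair to the file systems that share it."""
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry["host"], entry["spindle"])].append(entry["disk"])
    return dict(groups)


if __name__ == "__main__":
    for (host, spindle), disks in group_by_spindle(
        parse_disklist(SAMPLE_DISKLIST)
    ).items():
        print(f"{host} spindle {spindle}: {', '.join(disks)}")
```

Grouping by host and spindle value mirrors the unit on which the spindle parameter operates, so the output shows at a glance which entries are affected by it.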

Amanda also provides many other utilities for validating its configuration files, for periodic cleanup activities, and for performance tuning. It also provides a variety of useful activity logs and reports, as well as the amrecover utility for restoring files from its backup sets.

More Information About Amanda

That's all for now. I'll be posting the next item in the list every week or so.

Number Four: LDAP
The countdown continues with LDAP, a protocol that supports a directory service.

Number Three: GRUB
The countdown continues with GRUB, the GRand Unified Bootloader.

Number Two: Nagios
The countdown continues with Nagios, a feature-rich network monitoring package.

Æleen Frisch has been a system administrator for over 20 years, tending a plethora of VMS, Unix, Macintosh, and Windows systems. If you liked this article and would like to receive the free ESA3 newsletter, you can sign up at http://www.aeleen.com/esa3_news.htm.


O'Reilly & Associates recently released (August 2002) Essential System Administration, 3rd Edition.


Copyright © 2009 O'Reilly Media, Inc.