Published on ONLamp.com (http://www.onlamp.com/)


Implementing Hardware RAID on FreeBSD

by Dan Langille
12/09/2004

RAID has been around for over 15 years. Why use RAID? For me, the reasons are redundancy and reliability. I don't like disk failures. With RAID, a disk failure will not take down my system; it keeps running after a disk dies. When a disk does fail, I can find another drive, add it to the array, and be ready for the next failure.

Hardware fails. Disks fail. It is better to design the system around expected failure than it is to buy better disks.

I have two 3Ware 7006-2 cards and an Adaptec 2400A card. One of the 3Ware cards runs the Windows XP system, which I'm using as I type this article.[1] The other is in polo, my main FreeBSD box. My goal is to add the Adaptec 2400A to polo, create a RAID-5 array, and migrate the data from the two 80GB drives to four 80GB drives.

I have written previously about swapping boot disks. In the five years since then, I've learned a great deal more about FreeBSD. In this article, I will boot the system from a FreeSBIE CD and copy from the original filesystem to the new one. I still do it this way because I know that nothing will touch either filesystem during the copy, so I'll end up with an exact duplicate.

Just before I started this project, I upgraded all of the firmware and BIOS on the Adaptec card.

[1] The article actually resides on my FreeBSD box, which hosts my various development environments (FreeBSD Diary, FreshPorts, and BSDCan). I'm typing on my XP box, and the FreeBSD box supplies a Samba share.

A Bit About FreeSBIE

I first heard of FreeSBIE in March 2004. It has since come to my rescue many times. When my laptop hard disk died just before a conference, FreeSBIE let me keep using the laptop while out of town. I also take the CD with me when considering hardware for purchase: booting the FreeSBIE CD gives me a feel for what will and will not work.

FreeSBIE contains a wealth of applications, including Gimp, XFree86, Evolution, Gaim, XMMS, esound, XFce, xcdroast, Samba, Python, mpg123, Midnight Commander, and DVD+RW applications.

I think you'll find it's a great little tool to add to your collection.

Installing the Hardware

I will be using four Seagate ST380011A drives. These are 80GB, 7200rpm, Ultra ATA/100 IDE drives. Yes, IDE. I'm doing RAID on IDE. At one time, that would have been unheard of; RAID was always SCSI. IDE performance and price have improved dramatically over the past few years. I do not know when the Adaptec 2400A first came out, but the documentation I have has a copyright date of 2001. I'd say things have come a long way. IDE RAID is perfectly feasible for many situations. Investigate. IDE might suit your needs, too.

Figure 1. XP Pro.

It is not essential, but it is a good idea to have identical drives in your RAID array. All things being equal, you will have better results that way. I bought my drives from OEM Express, my not-so-local store.

The 2400A is a full-length PCI card. It won't fit into a small case. I would not recommend attempting to connect or disconnect the cables while you have the card installed. I fear the card might break.

In my particular situation, I took a risk and set up the disks on a cardboard sheet on top of my case. I don't recommend doing this, but it worked for me. You'll have to decide what is best for you.

Creating and Building the Array

RAID comes in many forms: some levels mirror disks, others stripe data across them. Pick the one that suits you best. For most applications, RAID-1 (mirroring) or RAID-5 (a striped array with rotating parity) makes the most sense. RAID-1 gives you the capacity of one drive out of two, while RAID-5 across n drives gives the capacity of n-1 drives, with one drive's worth of space consumed by parity. I already use RAID-1 on two machines, and I'm about to introduce RAID-5.

The 2400A comes with SMOR (Storage Manager on ROM), a BIOS-based setup utility for configuring the Adaptec RAID controller. In short, you create a disk array and then allow it to build. The build can take many hours; I left mine to complete overnight.

I had a problem with my controller setup: the controller saw only three of my four drives. I suspected a cable. After swapping two cables, the problem moved with the cable. I was going out anyway, so I bought new IDE cables. The problem persisted. OK, perhaps it was the IDE controller; I feared a bad connector. Then someone mentioned the master/slave jumpers. Doh! I had removed the jumper during some earlier testing with a single drive. After replacing the jumper, the controller detected all four drives. I now have four identical drives linked to a controller by four new identical cables.

After creating the array, I exited SMOR. The system then rebooted and went into FreeBSD, where I saw:

da0 at asr0 bus 0 target 0 lun 0
da0: <ADAPTEC RAID-5 3A0L> Fixed Direct Access SCSI-2 device
da0: Tagged Queueing Enabled
da0: 228957MB (468903936 512 byte sectors: 255H 63S/T 29187C)

There you go. You can even use the RAID Calculator to check the expected size. RAID-5 across four drives yields the capacity of three, so I expected 3 x 80GB = 240GB, and that is exactly what I got: 468,903,936 sectors of 512 bytes each is just over 240 x 10^9 bytes, which the kernel reports as 228,957 binary megabytes. The apparent shrinkage from 240GB to 228GB is the usual decimal-versus-binary unit difference, not drives losing capacity to formatting.
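If you want to check the arithmetic yourself, bc(1) from the base system will do. The first figure is the array size in bytes; the second converts it to the binary megabytes the kernel reports:

echo '468903936 * 512' | bc
echo '468903936 * 512 / 1048576' | bc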

At this point, I could hear disk chatter from the four drives: the RAID array was building. I've heard disk chatter before, but never from four drives simultaneously. It was unique. After at least four hours, the build was only 85 percent complete, so I went to bed and left it to run overnight. The next morning, the disk chatter had vanished.

Got Drive?

It was now the morning after. What was I going to do with my new drive (and yes, you should think of all four drives as one)? Partition it, slice it up, create mount points, and copy. Sounds easy. Sure it is. It's all documented in the FreeBSD Handbook. I'm also going to do some testing to make sure I know how to rebuild the array before I need to rebuild it for real.

If you do follow the instructions in the Handbook, I suggest not specifying the real mount points when using the Disk Label Editor. /stand/sysinstall will attempt to mount the drive after labeling it. That can be handy when installing, but it can Really Mess Things Up on a live system. Specify /mnt, /mnt/usr, /mnt/var, and so on, instead of /, /usr, and /var. This should avoid any problems.

Warning: Take extreme care here. Do not work on the wrong drive. That, too, can Really Mess Things Up.

You will need to partition the disk, label it, and then create a new filesystem by running newfs.
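For the command-line inclined, the job looks roughly like this. This is only a sketch, assuming the FreeBSD 5.x tools (on 4.x, the label editor is disklabel rather than bsdlabel); the sysinstall screens below accomplish the same thing:

fdisk -BI da0         # one FreeBSD slice spanning the disk, with boot code
bsdlabel -B -w da0s1  # write a standard label and boot blocks to the slice
bsdlabel -e da0s1     # edit the label: carve out the a, b, e, f, g, and h partitions
newfs -U /dev/da0s1a  # new filesystem with soft updates; repeat for each
                      # partition except swap (da0s1b)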

Here are my fdisk results:

Disk name:      da0                                    FDISK Partition Editor
DISK Geometry:  29187 cyls/255 heads/63 sectors = 468889155 sectors (228949MB)

Offset       Size(ST)        End     Name  PType       Desc  Subtype    Flags

         0         63         62        -      6     unused        0
        63  468889092  468889154    da0s1      3    freebsd      165    C
 468889155      14781  468903935        -      6     unused        0






The following commands are supported (in upper or lower case):

A = Use Entire Disk   G = set Drive Geometry   C = Create Slice   F = `DD' mode
D = Delete Slice      Z = Toggle Size Units    S = Set Bootable   | = Wizard m.
T = Change Type       U = Undo All Changes     W = Write Changes


Use F1 or ? to get more help, arrow keys to select.

Here is my disk label information:


                        FreeBSD Disklabel Editor

Disk: da0       Partition name: da0s1   Free: 0 blocks (0MB)

Part      Mount          Size Newfs   Part      Mount          Size Newfs
----      -----          ---- -----   ----      -----          ---- -----
da0s1a    <none>        500MB *
da0s1b    swap          750MB SWAP
da0s1e    <none>       1500MB *
da0s1f    <none>        750MB *
da0s1g    <none>      30000MB *
da0s1h    <none>     195449MB *





The following commands are valid here (upper or lower case):
C = Create        D = Delete   M = Mount pt.   W = Write
N = Newfs Opts    Q = Finish   S = Toggle SoftUpdates
T = Toggle Newfs  U = Undo     A = Auto Defaults    R = Delete+Merge

Use F1 or ? to get more help, arrow keys to select.

Testing Redundancy

Testing is more important than you think. By testing the array now, you know what to expect when a problem occurs. If you don't know what happens when things go wrong, how will you know how to react?

My testing plan involves populating the disk array, removing a drive from the array to simulate a failure, and seeing what happens. Then I will recover from the "failure" and watch the array rebuild.

I started off by performing the mounts listed below:

mount /dev/da0s1a /mnt
mount /dev/da0s1e /mnt/var
mount /dev/da0s1f /mnt/tmp
mount /dev/da0s1g /mnt/usr
mount /dev/da0s1h /mnt/usr/home

I went into each directory and created a file. The result looks something like this:

# find /mnt
/mnt
/mnt/tmp
/mnt/tmp/ThisIsTmp
/mnt/usr
/mnt/usr/home
/mnt/usr/home/ThisIsUsrHome
/mnt/usr/ThisIsUsr
/mnt/var
/mnt/var/ThisIsVar
/mnt/ThisIsRoot
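Those marker files came from nothing fancier than touch; the equivalent commands, one per filesystem, are:

touch /mnt/ThisIsRoot
touch /mnt/var/ThisIsVar
touch /mnt/tmp/ThisIsTmp
touch /mnt/usr/ThisIsUsr
touch /mnt/usr/home/ThisIsUsrHome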

I shut down the system, disconnected the power from one drive, and restarted. All looked well: every file I had before was still there, and I could use the filesystem just as before.

I then added a file:

touch /mnt/AddedWithoutOneDrive

and shut down the box. With the box off, I reconnected power to the drive and booted up again. During the startup process, I went into SMOR and noted that it had marked the drive in question as degraded. I started the rebuild.

It was then that I realized I did not have to wait for it: the rebuild can occur on the fly, so to speak, in the background while you use the array. I rebooted into FreeBSD and left the rebuild running. Should another drive fail during the rebuild, though, you will be left with a useless array.

Drive Order

One concern I had early on with multiple disks was whether the drive order mattered. If I need to remove the drives, do I have to remember to which cables they were connected? If I make a mistake, will I lose my data?

I'm happy to report that order does not matter, at least not in my testing with my Adaptec 2400A. Perhaps order matters with other cards; I don't know. I swapped drives around from one channel to another, and the controller knew what to do with them. I suspect the controller labels the drives in some manner in order to keep track of them.

Failure is Not an Option

The following point is very important. (By RAID here, I mean those RAID configurations that permit a disk to fail; not all RAID configurations allow this.)

As you noticed in the previous section, when I removed a disk from the array (simulating a disk failure), the filesystem continued to function. I could read files, add files, and so on. However, another disk failure at that point would have rendered my array useless and unrebuildable.

Therefore, it is very important that you detect and act upon a failure as soon as possible. You could use a hot-swap or hot-standby option. I have chosen a cold standby: I will keep a spare 80GB drive, the same model as the others, just to be safe. It will be ready when the first drive fails. Yes, one will fail. It is just a matter of time.

In a future article, I will discuss monitoring options and show you the script I use.
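To give you a taste now, here is a minimal sketch. It assumes the raidutil utility from the sysutils/asr-utils port, which can query arrays on the asr driver; check the raidutil man page for the exact flags before relying on this:

# list the logical drives and their status (Optimal, Degraded, ...)
raidutil -L logical

# a cron-able one-liner: mail root unless the array reports Optimal
raidutil -L logical | grep -q Optimal || \
    echo "RAID array is not Optimal" | mail -s "RAID warning on polo" root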

Populating the Array

In this section, I will show you how I duplicated everything from the existing filesystem into the new array. I'll use FreeSBIE and good old tar.

I let the array completely rebuild after the testing. Then I booted from the FreeSBIE CD. As root, I did the following:

  1. Created mount points for the old system.
  2. Created mount points for the new system.
  3. Mounted both drives.
  4. Copied everything from the old to the new.

I started off in my home directory, and started creating the mount points:

cd
mkdir duplicate
cd duplicate

mkdir OLD
mkdir OLD/var
mkdir OLD/tmp
mkdir OLD/usr
mkdir OLD/usr/home

mkdir NEW
mkdir NEW/var
mkdir NEW/tmp
mkdir NEW/usr
mkdir NEW/usr/home

Then I mounted the old filesystem read-only, to prevent any possibility of accidental destruction.

mount -r /dev/da1s1a OLD
mount -r /dev/da1s1e OLD/var
mount -r /dev/da1s1f OLD/tmp
mount -r /dev/da1s1g OLD/usr
mount -r /dev/da1s1h OLD/usr/home

Next, I mounted the new filesystem.

mount /dev/da0s1a NEW
mount /dev/da0s1e NEW/var
mount /dev/da0s1f NEW/tmp
mount /dev/da0s1g NEW/usr
mount /dev/da0s1h NEW/usr/home

Now I have the old filesystem mounted under OLD and the new filesystem mounted under NEW. This is good. Now it is just a matter of copying everything from one system to the other.

First, we will copy the root filesystem:

cd ~/duplicate/OLD
tar cf - -C . --one-file-system . | tar xpvf - -C ~/duplicate/NEW

You will note that I am using a method suggested by Dean in response to my 1999 article "Swapping Boot Drives Around." The first tar writes an archive to standard output (f -) without crossing into other mounted filesystems (--one-file-system); the second tar extracts it into the new tree (-C), preserving file permissions (p).

Now I will run the same steps for each mount point, taking care to move into the correct directory (mount point) and to specify the correct destination:

cd ~/duplicate/OLD/var
tar cf - -C . --one-file-system . | tar xpvf - -C ~/duplicate/NEW/var

cd ~/duplicate/OLD/tmp
tar cf - -C . --one-file-system . | tar xpvf - -C ~/duplicate/NEW/tmp

cd ~/duplicate/OLD/usr
tar cf - -C . --one-file-system . | tar xpvf - -C ~/duplicate/NEW/usr

cd ~/duplicate/OLD/usr/home
tar cf - -C . --one-file-system . | tar xpvf - -C ~/duplicate/NEW/usr/home
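As a rough sanity check before switching over, compare file counts on the two trees. The numbers should be close; a .snap or lost+found directory on the new filesystems accounts for small differences:

find ~/duplicate/OLD | wc -l
find ~/duplicate/NEW | wc -l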

Before you Boot, Adjust /etc/fstab

Before booting from the new array, modify NEW/etc/fstab so that the system mounts the new filesystems, not the old ones. (While booted from FreeSBIE, that file is ~/duplicate/NEW/etc/fstab, not /etc/fstab.) Here are the entries from my system.

# for 3Ware
#/dev/twed0s1a          /               ufs     rw              1       1
#/dev/twed0s1b          none            swap    sw              0       0
#/dev/twed0s1e          /var            ufs     rw              2       2
#/dev/twed0s1f          /tmp            ufs     rw              2       2
#/dev/twed0s1g          /usr            ufs     rw              2       2
#/dev/twed0s1h          /usr/home       ufs     rw              2       2

# For Adaptec 2400A
/dev/da0s1a             /               ufs     rw              1       1
/dev/da0s1b             none            swap    sw              0       0
/dev/da0s1e             /var            ufs     rw              2       2
/dev/da0s1f             /tmp            ufs     rw              2       2
/dev/da0s1g             /usr            ufs     rw              2       2
/dev/da0s1h             /usr/home       ufs     rw              2       2

As you can see, I copied and pasted the old entries, commented them out, and altered the new entries to refer to the new device (da0) instead of the old one (twed0).

Before you Boot, Adjust Permissions on /tmp

After rebooting the first time, I found this problem:

postgres[238]: [1-1] FATAL: could not create lock file "/tmp/.s.PGSQL.5432.lock": 
     Permission denied

Ouch. Well, that is easy to solve. The permissions were incorrect. This is what /tmp should look like:

$ ls -ld /tmp
drwxrwxrwt 3 root wheel 512 Sep 8 12:38 /tmp

However, this is what I found:

$ ls -ld /tmp
drwxr-xr-x 3 root wheel 512 Sep 8 12:38 /tmp

You can prevent this problem from occurring on your first boot by correcting the permissions in advance. While still booted from FreeSBIE, run this against the new filesystem (~/duplicate/NEW/tmp); after booting, run it against /tmp itself:

chmod go+w /tmp
chmod +t /tmp

(The single command chmod 1777 /tmp has the same effect.)

Now you should be ready to boot from the RAID device.

Booting

You should now be able to boot into your new system. Make sure you change the boot order in the BIOS so your RAID device is first.

After booting, I found this:

[dan@polo:~] $ df
Filesystem                1K-blocks    Used     Avail Capacity  Mounted on
/dev/da0s1a                  503966   46956    416694    10%    /
/dev/da0s1e                 1511934  390086   1000894    28%    /var
/dev/da0s1f                  755902    1162    694268     0%    /tmp
/dev/da0s1g                30241372 5497576  22324488    20%    /usr
/dev/da0s1h               197010560 9137806 172111910     5%    /usr/home
procfs                            4       4         0   100%    /proc

If you've been using FreeBSD for a while, then you'll know that the system automatically sends you a security report via email every night. After cloning a drive, you should be prepared for many new entries in this report. Here is one small extract from the report I received:

< 285   -r-sr-xr-x 1 root wheel 251444 Mar 29 17:14:04 2004 /bin/rcp
...
> 21409 -r-sr-xr-x 1 root wheel 251444 Mar 29 17:14:04 2004 /bin/rcp

The above information indicates that the inode number for the file has changed. This makes sense, as every file now lives on a new filesystem. The first number on the line is the inode number. From man ls:


     -i      For each file, print the file's file serial number (inode num-
             ber).
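For example, on the new array, asking ls for the inode number should echo the second line of the extract above:

$ ls -i /bin/rcp
21409 /bin/rcp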

With advance notice, at least this won't concern you as much as it did me when I first saw it.

Got RAID?

I've said it before and you'll hear it again. RAID will not solve all of your problems. It does remove some headaches. You must monitor it to achieve its full benefits. In my next article, I'll show you how I created a NetSaint plugin to monitor and report upon my RAID array. By using NetSaint and those scripts, you should have plenty of time to replace a dead drive before an array falls apart. That alone should save you hours of time.

Happy RAIDing.

Dan Langille runs a consulting group in Ottawa, Canada, and lives in a house ruled by felines.



Copyright © 2009 O'Reilly Media, Inc.