Who Has Which Files
One morning when there just didn't seem to be enough caffeine in the world, I decided to avoid dealing with people and just clean up all the little jobs on my to-do list. One task was to put a CD-ROM into a particular machine and copy files from it to the hard drive. Seems simple enough, doesn't it? Except:
# umount /cdrom/
umount: unmount of /cdrom failed: Device busy
#
This error shows up when a CD-ROM is mounted and in use, and you try to unmount it. But I'm sitting right in front of the system, and the CD-ROM light isn't on, and the motor isn't humming. The CD-ROM might be mounted, but it certainly isn't actually in use.
At this point, I have a few options. I could reboot the machine,
annoying all the users. While annoying users can be fun, it can also
generate a lot of work. I could run around asking everyone if they
are using the CD-ROM, but that would mean I'd have to stir myself out
of the comfy chair and actually speak to people I'd rather not talk
to. I could forcibly unmount it, but I have no idea how badly that
would affect the person who mounted it. Or, I could try to figure out
why the system thinks the CD-ROM is busy, and just approach the one
person responsible. Since that involves the least contact with human
beings, I chose that route. You can learn who is using what files
with fstat(1).
According to the man page, fstat(1) "identifies active files." This
might not seem like much, but in UNIX everything is a file. More
recent operating systems (such as Plan 9) take this idea to its
logical extreme, but even in UNIX, pipes and network connections are
largely treated as files. If you can examine files that are in use,
you can see just about everything that happens on the system. fstat(1)
takes a snapshot of the system at a particular moment. As programs
continuously open and close files, pipes, and network connections, the
output of fstat changes from second to second.
If you go to a command prompt and type "fstat," you'll see a list of
all the active files on the system. This list can be very long, as
each process probably has several files open. My laptop, running an
assortment of desktop programs, has about 400 open files. A friend's
small Web server has about 9,000 open files, while some heavily used
Web servers have about 30,000. Because fstat(1) only takes a snapshot
of the open files, running it several times in quick succession will
give you different results.
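If you'd rather have a number than a feeling, you can count the entries in a snapshot. The function below is a minimal sketch: count_open_files is a name I made up for illustration, and it assumes that a header line, if your fstat prints one, begins with the word USER.

```shell
# count_open_files: count the entries in fstat output read from stdin.
# Assumption: a header line, if present, starts with the word USER.
count_open_files() {
    awk 'NR == 1 && $1 == "USER" { next }   # skip the header, if any
         NF > 0 { n++ }                     # count every non-blank line
         END { print n + 0 }'               # print 0 when nothing was read
}
```

On a live system you would run "fstat | count_open_files" a few times in a row and watch the number wobble.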
Here's a snippet of fstat output from my laptop:
....
mwlucas ssh 2820 3* internet stream tcp c2ef2814
mwlucas rxvt 2819 root / 2 drwxr-xr-x 512 r
mwlucas rxvt 2819 wd /usr 846337 drwxr-xr-x 2560 r
mwlucas rxvt 2819 text /usr 802549 -rws--x--x 89092 r
mwlucas rxvt 2819 2 /dev 60 crw------- ttyv0 rw
mwlucas rxvt 2819 3* local stream c2ebdbd0 <-> c2ebd870
mwlucas rxvt 2819 4 /dev 104 crw-rw-rw- ptyp0 rw
mwlucas mozilla-bin 2725 root / 2 drwxr-xr-x 512 r
mwlucas mozilla-bin 2725 wd /usr 808118 drwxr-xr-x 1536 r
....
So, this looks like a lot of information. What does it mean?
The first column is the username that has the file open. The second is the name of the program that has the file open. The program name alone isn't that useful, but the third column gives the PID of the process.
The fourth column is where things get interesting. This could contain
a number, a number marked with an asterisk, or a keyword. A plain
number is the process-internal file descriptor. When a process opens
a file, the kernel hands it a small number to keep track of that file.
fstat(1) lines that have a plain number in the fourth field represent
ordinary files or device nodes that the program has open.
If the fourth field is a number with an asterisk (such as the first entry in our sample output above), the line represents an open socket. These can be UNIX domain sockets, network sockets, or pipes. If the line represents a socket, the rest of the line has a varying format depending on which sort of socket it is. We aren't going to worry about sockets right now. fstat isn't that useful for investigating open network connections on FreeBSD, but on OpenBSD fstat gives the IP address and port number of an open connection. Other operating systems vary; check your preferred UNIX and see what yours does.
If the fourth field is the word "text," that does not mean that the file is a text file. Instead, it means that this is "executable text" or a program. (Only computer scientists would ever think that text means computer text.) This indicates that the process has an executable program open.
A "wd" marks the working directory of the process, the directory that
its commands run from. You could have a command prompt sitting idle in
a directory, and that directory would be held open by the shell behind
that command prompt.
The fourth field has other possible values (the "root" entries in the sample above mark a process's root directory, for example), but these are by far the most common.
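The rules for reading the fourth field are simple enough to hand to awk. This is only a sketch: classify_fd and the labels it prints are my own invention, not anything fstat provides, and anything it doesn't recognize is lumped together as "other."

```shell
# classify_fd: read fstat output on stdin and print the PID, the fourth
# field, and a label describing what that field means.
classify_fd() {
    awk '
        $1 == "USER"       { next }                            # header, if present
        $4 ~ /^[0-9]+[*]$/ { print $3, $4, "socket-or-pipe";    next }
        $4 ~ /^[0-9]+$/    { print $3, $4, "file-descriptor";   next }
        $4 == "text"       { print $3, $4, "executable";        next }
        $4 == "wd"         { print $3, $4, "working-directory"; next }
        NF >= 4            { print $3, $4, "other" }
    '
}
```

Feed it a snapshot with "fstat | classify_fd" to see at a glance which entries are files and which are sockets or pipes.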
The fifth field gives the mount point where the file resides, and the sixth is the inode of the file. We can use these to find the actual file later.
The seventh field is the mode of the open file. It appears as standard UNIX filesystem permissions.
The eighth field varies depending on whether the file in question is a regular file or a device node. If it's a regular file, the eighth field contains the size of the file in bytes. If it's a device node, this field contains the device name.
Finally, we have the read/write status of this file. If the file is
open for reading, you'll see an "r." If it's open for writing, you'll
see a "w." An "rw," as you might guess, shows that the file is
open for both reading and writing.
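Put the columns together and you can ask narrower questions, such as "which plain file descriptors are open for writing?" A hedged sketch follows; open_for_writing is a hypothetical helper, and it assumes the nine-column layout shown in the sample output, with the read/write flag last.

```shell
# open_for_writing: from fstat output on stdin, print PID, descriptor,
# mount point, inode, and read/write flag for every plain (non-socket)
# file descriptor whose last column contains a "w".
# Assumption: file lines have exactly nine whitespace-separated columns.
open_for_writing() {
    awk '$4 ~ /^[0-9]+$/ && NF == 9 && $9 ~ /w/ { print $3, $4, $5, $6, $9 }'
}
```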
This is pretty powerful, but how can you possibly use it? You
cannot sort through 400 lines of output, let alone 30,000 lines.
You could filter the output with grep(1), but you may not know exactly
what you're looking for. fstat(1) includes three powerful filtering
flags. You can use any one of them at a time.
-f filters by mount point. If you're interested in files open in the
filesystem where /usr/home/mwlucas lives, you can set that with -f.
Note that fstat will not restrict its search to that one directory;
instead, it figures out which filesystem the directory lives on and
reports every open file there. "fstat -f /usr/home/mwlucas" will give
us all the open files on /usr, the filesystem that holds my home
directory.
-u filters by username. I could run "fstat -u mwlucas" to see all the
files I have open. Or I could use the -p flag and specify a process
ID to see only files opened by that process.
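Because the live output keeps changing, it's sometimes handy to save one snapshot to a file and slice it up afterward. The two functions below mimic the results of -u and -p by matching the first and third columns of saved output. They are a sketch of my own, not how fstat implements its flags, and the names by_user and by_pid are made up.

```shell
# by_user NAME: keep only the lines of fstat output (stdin) whose first
# column is NAME. Mimics the result of "fstat -u NAME" on saved output.
by_user() {
    awk -v u="$1" '$1 == u'
}

# by_pid PID: keep only the lines whose third column is PID.
# Mimics the result of "fstat -p PID" on saved output.
by_pid() {
    awk -v p="$1" '$3 == p'
}
```

For example, after "fstat > snapshot.txt" you could run "by_user mwlucas < snapshot.txt" as often as you like and always see the same moment in time.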
So let's go back to my original problem. I have a CD-ROM drive that
claims to be busy. What's accessing it? My CD-ROM drive is mounted on
/cdrom, so I use fstat's -f flag.
# fstat -f /cdrom
chris tcsh 2834 wd /cdrom 141312 dr-xr-xr-x 6144 r
#
There is one file open on this disk. That's enough to keep me
from unmounting the disk. It's open by user "chris". The interesting
thing is the fourth column, with the "wd" entry. This means that the
open file is a working directory: Chris has a tcsh command prompt
sitting somewhere on this filesystem. And there is no other activity
on the filesystem.
Remember, fstat provides a snapshot of the system's file activity. I
can run this command several times in quick succession to see if
I just caught Chris at an idle moment. If the disk is accessed, fstat
will show other files open. In this particular case, I could also
look at the light on the front of the system to see if it's being used
at all. But that's far too easy. In any event, the outcome is the
same. I pick up the phone, call Chris, and tell him to get his danged
idle command prompt out of /cdrom if he's not going to use it.
fstat might have shown extensive work being done on that CD-ROM. Had
that been the case, Chris' username would have shown up with multiple
open files on that filesystem, and I could have coordinated with him
to get both our jobs done in a timely manner.
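The difference between "somebody parked a shell here" and "somebody is working here" can be read straight from the fourth column. Here is a minimal sketch that makes the call for you. The name mount_activity and its three verdicts are my own, and it treats anything other than a "wd" or "root" entry as real activity, which is a simplification.

```shell
# mount_activity: read the output of "fstat -f MOUNTPOINT" on stdin and
# print one verdict: "active", "idle-directories-only", or "unused".
mount_activity() {
    awk '
        $1 == "USER" || NF == 0    { next }             # header or blank line
        $4 == "wd" || $4 == "root" { parked++; next }   # just a directory held open
                                   { active++ }         # anything else counts as work
        END {
            if (active > 0)      print "active"
            else if (parked > 0) print "idle-directories-only"
            else                 print "unused"
        }
    '
}
```

Running "fstat -f /cdrom | mount_activity" a few times in a row guards against catching someone between file accesses.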
If Chris is simply not available--say he's gone home for the night
and left his xterm locked--I can just kill his command prompt. He
might be annoyed in the morning, but that's OK. Depending on your
work environment, you might not want to kill another user's shell. I
feel perfectly comfortable killing anything of Chris', so I'll
proceed. The third field of the fstat output is the process ID of the
shell sitting somewhere under /cdrom, so I send it a hangup signal.
# kill -1 2834
# umount /cdrom
#
I could also forcibly unmount the CD-ROM using umount -f. UNIX
provides many different ways to solve this issue. Pick your favorite.
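Whichever way you choose, it helps to know exactly who and what is holding the filesystem first. The sketch below lists each holder once. The name holders is hypothetical, and the function deliberately prints PIDs rather than killing anything, so the decision stays with you.

```shell
# holders: read the output of "fstat -f MOUNTPOINT" on stdin and print
# "user pid command" once for each process that has something open there.
holders() {
    awk '$1 != "USER" && NF >= 4 && !seen[$3]++ { print $1, $3, $2 }'
}
```

For the CD-ROM above, "fstat -f /cdrom | holders" would give one line naming chris, PID 2834, and tcsh.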
One common annoyance with fstat is that you get the inode of the file
being accessed, not the file name. That's not really a problem. You
can find the file name that belongs to an inode with find(1) and its
-inum test. You can also give find the -x flag to restrict your search
to a single filesystem.
For example, my laptop runs cvsupd. If I want to know where this program writes its log files, I can shuffle through the startup scripts, configuration files, and man pages to find it. Or, I can simply look to see which files it has open.
# ps -ax | grep cvsupd
199 ?? Is 0:00.00 cvsupd -e -C 100 -l @daemon -b /usr/local/etc/cvsup -
#
So, cvsupd has PID 199.
# fstat -p 199
cvsup cvsupd 199 root / 2 drwxr-xr-x 512 r
cvsup cvsupd 199 wd /var 40 drwxrwxrwt 512 r
cvsup cvsupd 199 text /usr 1541084 -rwxr-xr-x 891596 r
cvsup cvsupd 199 0 /dev 10 crw-rw-rw- null rw
cvsup cvsupd 199 1 /var 1759 -rw-rw-r-- 0 w
cvsup cvsupd 199 2 /var 1759 -rw-rw-r-- 0 w
cvsup cvsupd 199 3* internet stream tcp c2ef1100
cvsup cvsupd 199 4* pipe c2e07000 <-> c2e06f20 0 rw
cvsup cvsupd 199 5* pipe c2e06f20 <-> c2e07000 0 rw
cvsup cvsupd 199 6* local dgram c2ebde10 <-> c2ebe000
#
We're looking for a log file, which will be identified by a plain
number in the fourth column. The lines for file descriptors 0, 1, and
2 are all files. Looking at only these three lines, check the fifth
field. Descriptor 0 is a device (under /dev), so we aren't interested
in that entry. That leaves us with descriptors 1 and 2, which both
have something open under /var. The sixth field is the inode of the
open file. For both lines, this is inode 1759.
# find -x /var -inum 1759
/var/tmp/cvsupd.out
#
I never would have guessed to look there. Fortunately, with fstat,
you don't have to guess.
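If you do this often, the eyeballing can be scripted. The sketch below turns fstat output into the matching find commands, one per distinct mount point and inode, skipping sockets, pipes, and anything under /dev. The name inode_lookups is made up, and the function only prints the commands rather than running them, so you can look them over first.

```shell
# inode_lookups: read "fstat -p PID" output on stdin and print one
# "find -x MOUNT -inum INODE" command for each distinct file held open
# on a plain file descriptor, ignoring device nodes under /dev.
inode_lookups() {
    awk '
        $4 ~ /^[0-9]+$/ && $5 != "/dev" && !seen[$5 " " $6]++ {
            printf "find -x %s -inum %s\n", $5, $6
        }
    '
}
```

With the cvsupd output above, "fstat -p 199 | inode_lookups" prints the same find command I typed by hand.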
Copyright © 2009 O'Reilly Media, Inc.