Keeping Your Life in Subversion
by Joey Hess
I keep my life in a Subversion repository. For the past five years, I've checked every file I've created and worked on, every email I've sent or received, and every config file I've tweaked into revision control. Five years ago, when I started doing this using CVS, people thought I was nuts to use revision control in this way. Today it's still not a common practice, but thanks to my earlier article "CVS homedir" (Linux Journal, issue 101), I know I'm not alone. In this article I will describe how my new home directory setup is working now that I've switched from CVS to Subversion.
Subversion is a revision-control system. Like the earlier and much cruftier CVS, its purpose is to manage chunks of code, such as free software programs with multiple developers, or in-house software projects involving several employees. Unlike CVS, Subversion handles directories and file renaming reasonably, which is more than sufficient reason to switch to it if you're already using CVS. It also fixes most of CVS's other misfeatures. Subversion still has its warts, though, such as an inability to store symbolic links and some file permissions, and its need for twice as much disk space as you'd expect thanks to the copies of everything in those .svn directories. These problems can be quite annoying when you're keeping your whole home directory in svn. Why bother?
Benefits of Revision-Controlled Directories
I see three main benefits of keeping my entire home directory in svn:
- home directory replication
- history
- distributed backups
The first reason originally drove me to using revision control for my
whole home directory. It's still the greatest benefit today. I have many
accounts on Unix machines scattered around my house, the country, and the
planet, and I have an abiding desire for every single one of these disparate
accounts to work and look exactly the same. I don't care if the machine I'm
logging in to is in Japan or the Netherlands, or a California colocation
center, or my home office; I don't care if it's a PC clone, or a Mac, or an
S/390 virtual machine; if it's not set up the same as all the others, if I
cannot concentrate on the important differences instead of being distracted by
the unimportant ones, then I will be less productive. The final
ingredient for configuration insanity is that I constantly tweak my setup. As
soon as I make an improvement, I want it to be available on every one of my
accounts, everywhere. Without Subversion, keeping all these accounts in sync
would be well-nigh impossible. With Subversion it's as easy as typing
svn up now and then.
It seems that the next big change in how we use computers might be the
introduction of filesystems that store every old version of every file. With
the explosion in size of cheap hard disks, there seems to be no reason not to
keep a complete record of your computing life--and several research projects
are working on it. Meanwhile, I've done just that for five years, using first
CVS and now svn. It's amazing to check out my home directory as it looked on
New Year's Day 1999 and play around in it. It's neat to be able to look at the
entire revision history of my .procmailrc and watch as I moved mail
around, dealt with a growing spam problem, and joined and left many mailing
lists. It's handy to be able to run
svn diff on my kernel config
file to see how
make xconfig changed it. I can recover files that
I've deleted, or delete files because they're not relevant right now, and know
I've not really lost them at all. Amazingly, my Subversion repository is only 4GB in size even with all this historical data.
I have not lost a file since 1999, and I don't intend to ever again. Take
one crucial file, like my resume or sent-mail archive. I have a copy of that
file on my desktop computer in the
.svn directory. There's another
copy in my home directory on my laptop, and yet another copy in the Subversion
repository on my server thousands of miles away. People tell me that the best
backups take no effort--so you actually do them--and are widely scattered
among many machines and a lot of area so a local disaster won't knock them out; additionally, they are tested on a regular basis to make sure the backup works. I'm
doing all of these things, as a mere side effect of keeping it all in
Subversion. To complete the picture, I only have to take very careful backups
of my Subversion repository itself. The automated distributed backups via svn
keep me sleeping quietly at night. I know that no matter what I do, my life
will still be there, safe and secure in svn.
At this point I should fess up to my dirty little secret: not everything is
in svn after all. My full home directory with all the trimmings often runs to
dozens of gigabytes. Much of that is collections of music files and
documentation, which I have not yet dared to check into svn and which I rsync
between computers. As disk sizes continue to grow, it looks more and more
likely that I will take the plunge soon and check these large file libraries
into svn too. Then too I have the occasional file, such as a disk image for a
virtual machine, that is too large and too much bother to check into svn. I don't
keep incoming mailboxes in svn, because that would lead to a merging nightmare; instead I use
offlineimap to synchronize them between several
machines. A cron job does check in the mail archives. A few other missing
corners are my web browser cache, which I would love to have a history of,
and my temporary directory, which I'd rather not.
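A cron-driven check-in of a mail archive directory could look roughly like the sketch below. This is an assumption about how such a job might be written, not the actual script; the function name, the commit message, and the path in the usage note are hypothetical.

```shell
# Sketch only: commit whatever is new or changed under one directory.
# The function name and commit message are invented for illustration.
commit_mail_archive() {
    archive_dir=$1
    # --force lets "svn add" recurse into an already versioned directory
    # and schedule only the files it does not know about yet.
    svn add --force --quiet "$archive_dir" || return 1
    svn commit --quiet -m "automatic mail archive checkin" "$archive_dir"
}
```

A crontab entry would then call a small script that runs something like `commit_mail_archive "$HOME/mail"`. Running it unattended assumes that authentication to the repository works without a prompt.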
I have made some progress recently in moving more into svn. I've
managed to check the /etc directories of several machines into svn.
While this is of questionable value as a way to replicate those machines, and
it doesn't include some files such as /etc/shadow, it's useful to be
able to check old versions of config files. I've also come up with a way to
check crontabs into svn. This is a great improvement, because I can edit and
view any machine's
cron jobs from anywhere and have all the history and backup
benefits of svn. I'm sure that my use of svn will only increase as I find ways
to use it in the odd little corners that remain. Yesterday I even found myself
checking baby photos into svn for my family's web site.
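One possible way to keep crontabs under revision control, sketched under the assumption that each machine's crontab lives in its own file inside the working copy. The function names and the file layout in the usage note are hypothetical; the article does not say how its own setup does this.

```shell
# Sketch only: move a crontab between cron and a versioned file.
# The function names are invented for illustration.

# Save this machine's live crontab into a file in the working copy.
save_crontab() {
    crontab -l > "$1"
}

# Install a file from the working copy as this machine's live crontab.
load_crontab() {
    crontab "$1"
}
```

With a layout such as `~/.crontabs/<hostname>`, a machine would run `load_crontab "$HOME/.crontabs/$(uname -n)"` after each update, so an edit committed from anywhere reaches cron on the machine it belongs to.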
I speak of my svn repository, but I actually have several repositories.
First, the public one holds most of the nonprivate parts of my home directory
and lots of software projects. You can even browse the contents of this
directory on the Web at
svn.kitenet.net, or check it out from
svn://svn.kitenet.net/joey/. Next I have a
private repository that holds such things as my email archives, and I have
several other small, special-purpose repositories. I also work on other projects
themselves kept in svn on other servers. A full checkout of my home directory
will include parts from all of these repositories; the
svn:externals feature of svn lets me knit them all together into a
whole that I can check out or update with a single command.
I've always managed my home directory with an iron hand. Keeping files in revision control has only exacerbated this tendency. Let's look at the top level:
joey@dragon:~> ls
Maildir/  bin/  doc/  html/  lib/  mail/  src/  tmp/
That really is everything, except for 100-plus dot files. Most people use their home directory as a cluttered scratch space for files they're working on. Subversion works better for this than CVS does, because it lets you easily rename and move files and directories. My tightly controlled home directory is partly personal preference and partly a leftover from my days as a CVS user. Keeping a home directory in Subversion does encourage some neatness, as svn will complain about files that it does not know about. This encourages keeping things organized, or at least out of the way in a temporary directory.
Because my home directory is publicly available online, I have to take care to keep private files private. One tricky area is private dot files. These need to be in my home directory, but I can't keep them in the public repository. To manage this, I keep all the private dot files in ~/.hide, which I store in an entirely different, private Subversion repository.
To make things work, symlinks in my home directory point to the private dot files in ~/.hide. An svnfix program creates these symlinks, fixes some permissions and other symlinks, and even updates my crontab from svn. I have to remember to run this program from time to time, or put a call to it in my crontab, because there is no way to add a client-side hook in Subversion, or in CVS for that matter.
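The symlinking step can be sketched as follows. This is not the real svnfix program, whose contents are not shown in the article; the function name is hypothetical, and the sketch covers only the linking and a simple permissions fix.

```shell
# Sketch only: link every dot file in a private directory into a target
# directory. The function name is invented for illustration.
link_private_dotfiles() {
    hide_dir=$1
    target_dir=$2
    for f in "$hide_dir"/.[!.]*; do
        [ -e "$f" ] || continue            # the glob matched nothing
        name=${f##*/}
        [ "$name" = ".svn" ] && continue   # skip Subversion's own metadata
        chmod go-rwx "$f"                  # keep private files private
        ln -sfn "$f" "$target_dir/$name"   # create or replace the symlink
    done
}
```

Calling `link_private_dotfiles "$HOME/.hide" "$HOME"` after each update keeps the links current. The `-n` option to ln is not in POSIX but is accepted by the GNU and BSD versions.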
My ~/.hide directory is just one of several Subversion repositories
that Subversion's useful
svn:externals feature pulls into my home
directory. My ~/src subdirectory, which holds various code projects
I'm working on, is an even better example, as some of its contents come from
repositories shared with others.
joey@dragon:~> ls src
Words2Nums/   debconf/      filters/     packages/        sleepd/
alien/        debhelper/    flashybrid/  pdmenu/          tasksel/
apt-src/      debian-cd/    kernel/      sarge/           ticker/
base-config/  debian-edu/   misc/        secure-testing/  unreleased/
d-i/          dpkg-repack/  mooix/       skolelinux/      wmbattery/
joey@dragon:~> svn propget svn:externals src
mooix svn+ssh://svn.mooix.net/home/svn/mooix/trunk
debhelper svn+ssh://kitenet.net/home/svn/debhelper/trunk
tasksel svn+ssh://svn.debian.org/svn/tasksel/trunk
d-i svn+ssh://svn.debian.org/svn/d-i/trunk
base-config svn+ssh://svn.debian.org/svn/base-config/trunk
debconf svn+ssh://svn.debian.org/svn/debconf/trunk/src/debconf
secure-testing svn+ssh://svn.debian.org/svn/secure-testing
After I use the
svn propedit command to add external
repositories, svn pulls them in and they become subdirectories in my home
directory. They mostly behave as if they were part of the same larger
repository. This is a great feature, with uses beyond including directories
from other repositories. On many of the machines I use, I don't need my entire
home directory checkout, and so my home directory is more minimal.
joey@elephant:~> ls
bin/  tmp/
I use this machine for occasional development. I don't fully trust the
machine, so I don't want to put private files there. I have a branch of my
home directory that (using
svn:externals) includes only the basics
and is perfectly usable for everything I normally do on that machine. Using
svn:externals like this to pull in optional directories keeps the
part of my home directory that I have to branch (and merge) small.
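The svn:externals property can also be extended without opening an editor. The sketch below is one way to do that, assuming the "name URL" line format shown in the propget output earlier; the function name and the URL in the usage note are hypothetical.

```shell
# Sketch only: append one "name URL" line to a directory's svn:externals
# property. The function name is invented for illustration.
add_external() {
    dir=$1
    name=$2
    url=$3
    current=$(svn propget svn:externals "$dir")
    if [ -n "$current" ]; then
        # Keep the existing entries and add the new one on its own line.
        value=$(printf '%s\n%s %s' "$current" "$name" "$url")
    else
        value="$name $url"
    fi
    svn propset svn:externals "$value" "$dir"
}
```

For example, `add_external src sleepd svn+ssh://example.invalid/svn/sleepd/trunk` would add a sleepd entry to src. The external is only fetched on the next `svn up`, and other checkouts see the change once the property is committed.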
When I want to check out my home directory to a new account, I run one of these commands:
% svn co svn+ssh://email@example.com/svn/joey/trunk/home-base .
% svn co svn+ssh://firstname.lastname@example.org/svn/joey/trunk/home-full .
The first is the minimal version of my home directory; the other is the whole thing. The dots at the end of the command lines make svn check it out directly into my home directory.
I switched from CVS to svn over a painful couple of months in the winter of 2003. CVS had many misfeatures that made keeping a home directory in it annoying, and I'm glad that I don't have to worry anymore about picking file and directory names (svn can easily rename them; CVS couldn't); that svn can handle binary files well and efficiently (unlike CVS); that svn is quite a bit faster at updating large home directories than CVS is; that managing branches is so much easier with svn that I actually have some branches of my home directory; and that those annoying "CVS" directories that once cluttered up every corner of my home directory have gone away. The transition from CVS to svn would be easier today, as the conversion software has improved, but such a large conversion between revision-control systems is bound to be a slow and painstaking process.
It's interesting to consider that the longevity of my home directory's history has no ties to the useful lifetime of a given revision-control system, or even the lifetime of a given computer platform. Converting repositories of past revision-control systems seems likely to be something new systems will continue to support. If one day I switch to arch, or a distant future relative of arch and svn, I fully expect to take my history with me when I do.
Before I go, I want to thank the hundreds of readers who responded to my original article about keeping my life in CVS. Thanks for your encouragement, for your ideas, and for letting me know I wasn't as crazy as I thought. Yes, as you can see, I finally have switched to Subversion! Now I'm off to commit this file....
Joey Hess has worked for nine years as a Debian developer.
Return to ONLamp.com.