 Published on ONLamp.com (http://www.onlamp.com/)


Keeping Your Life in Subversion

by Joey Hess
01/06/2005

I keep my life in a Subversion repository. For the past five years, I've checked every file I've created and worked on, every email I've sent or received, and every config file I've tweaked into revision control. Five years ago, when I started doing this using CVS, people thought I was nuts to use revision control in this way. Today it's still not a common practice, but thanks to my earlier article "CVS homedir" (Linux Journal, issue 101), I know I'm not alone. In this article I will describe how my new home directory setup is working now that I've switched from CVS to Subversion.

Subversion is a revision-control system. Like the earlier and much cruftier CVS, its purpose is to manage chunks of code, such as free software programs with multiple developers, or in-house software projects involving several employees. Unlike CVS, Subversion handles directories and file renaming reasonably, which is more than sufficient reason to switch to it if you're already using CVS. It also fixes most of CVS's other misfeatures. Subversion still has its warts, though, such as an inability to store symbolic links and some file permissions, and its need for twice as much disk space as you'd expect thanks to the copies of everything in those .svn directories. These problems can be quite annoying when you're keeping your whole home directory in svn. Why bother?

Benefits of Revision-Controlled Directories

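I see three main benefits of keeping my entire home directory in svn: every one of my accounts stays in sync, every file keeps its full history, and backups are distributed across machines as a side effect.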

The first reason originally drove me to using revision control for my whole home directory. It's still the greatest benefit today. I have many accounts on Unix machines scattered around my house, the country, and the planet, and I have an abiding desire for every single one of these disparate accounts to work and look exactly the same. I don't care if the machine I'm logging in to is in Japan or the Netherlands, or a California colocation center, or my home office; I don't care if it's a PC clone, or a Mac, or an S/390 virtual machine; if it's not set up the same as all the others, if I cannot concentrate on the important differences instead of being distracted by the unimportant ones, then I will be less productive. The final ingredient for configuration insanity is that I constantly tweak my setup. As soon as I make an improvement, I want it to be available on every one of my accounts, everywhere. Without Subversion, keeping all these accounts in sync would be well-nigh impossible. With Subversion it's as easy as typing svn up now and then.

It seems that the next big change in how we use computers might be the introduction of filesystems that store every old version of every file. With the explosion in size of cheap hard disks, there seems to be no reason not to keep a complete record of your computing life--and several research projects are working on it. Meanwhile, I've done just that for five years, using first CVS and now svn. It's amazing to check out my home directory as it looked on New Year's Day 1999 and play around in it. It's neat to be able to look at the entire revision history of my .procmailrc and watch as I moved mail around, dealt with a growing spam problem, and joined and left many mailing lists. It's handy to be able to run svn diff on my kernel config file to see how make xconfig changed it. I can recover files that I've deleted, or delete files because they're not relevant right now, and know I've not really lost them at all. Amazingly, my Subversion repository is only 4GB in size even with all this historical data.
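The history browsing described above maps onto a few standard svn subcommands. What follows is a minimal sketch, assuming a working copy and a reachable repository; the function names are hypothetical wrappers for illustration, not scripts from this setup.

```shell
# Show the full revision history of one file, e.g. .procmailrc.
file_history() {
    svn log "$1"
}

# Show uncommitted changes to a file, e.g. after make xconfig
# has rewritten a kernel config file.
uncommitted_changes() {
    svn diff "$1"
}

# Check out a tree as it looked on a given date.
# $1 = repository URL, $2 = date as YYYY-MM-DD, $3 = target directory
checkout_as_of() {
    svn checkout -r "{$2}" "$1" "$3"
}
```

For example, checkout_as_of with a repository URL, the date 1999-01-01, and a scratch directory would produce a copy of the tree as of that New Year's Day, provided the repository history reaches back that far.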

I have not lost a file since 1999, and I don't intend to ever again. Take one crucial file, like my resume or sent-mail archive. I have a copy of that file on my desktop computer in the .svn directory. There's another copy in my home directory on my laptop, and yet another copy in the Subversion repository on my server thousands of miles away. People tell me that the best backups take no effort (so you actually do them), are scattered widely across many machines and a large area so that a local disaster won't knock them out, and are tested on a regular basis to make sure they work. I'm doing all of these things, as a mere side effect of keeping it all in Subversion. To complete the picture, I only have to take very careful backups of my Subversion repository itself. The automated distributed backups via svn keep me sleeping quietly at night. I know that no matter what I do, my life will still be there, safe and secure in svn.

At this point I should fess up to my dirty little secret: not everything is in svn after all. My full home directory with all the trimmings often runs to dozens of gigabytes. Much of that is collections of music files and documentation, which I have not yet dared to check into svn and which I rsync between computers. As disk sizes continue to grow, it looks more and more likely that I will take the plunge soon and check these large file libraries into svn too. I also have the occasional file, such as a disk image for a virtual machine, that is too large and too much bother to check into svn. I don't keep incoming mailboxes in svn, because that would lead to a merging nightmare; instead I use offlineimap to synchronize them between several computers. A cron job does check in the mail archives. A few other missing corners are my web browser cache, which I would love to have a history of, and my temporary directory, which I'd rather not.

I have made some progress recently in moving more into svn. I've managed to check the /etc directories of several machines into svn. While this is of questionable value as a way to replicate those machines, and it doesn't include some files such as /etc/shadow, it's useful to be able to check old versions of config files. I've also come up with a way to check crontabs into svn. This is a great improvement, because I can edit and view any machine's cron jobs from anywhere and have all the history and backup benefits of svn. I'm sure that my use of svn will only increase as I find ways to use it in the odd little corners that remain. Yesterday I even found myself checking baby photos into svn for my family's web site.
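The article does not say how the crontabs get into svn, so the following is only one plausible sketch. It assumes a hypothetical versioned directory (~/.crontabs here) holding one crontab file per host, named after the hostname; the directory and function names are inventions for illustration.

```shell
# Hypothetical layout: one crontab file per host in a versioned directory.
CRONTAB_DIR="${CRONTAB_DIR:-$HOME/.crontabs}"

# Save the live crontab into the working copy so it can be committed.
save_crontab() {
    crontab -l > "$CRONTAB_DIR/$(hostname)"
}

# Install the versioned crontab for this host, if one exists.
install_crontab() {
    file="$CRONTAB_DIR/$(hostname)"
    if [ -f "$file" ]; then
        crontab "$file"
    fi
}
```

With a layout like this, editing another machine's cron jobs means editing its file in any checkout and committing; the machine picks the change up the next time it updates and runs install_crontab.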

Directory Organization

I speak of my svn repository, but I actually have several repositories. First, the public one holds most of the nonprivate parts of my home directory and lots of software projects. You can even browse the contents of this repository on the Web at svn.kitenet.net, or check it out anonymously from svn://svn.kitenet.net/joey/. Next I have a private repository that holds such things as my email archives, and I have several other small, special-purpose repositories. I also work on other projects that are themselves kept in svn on other servers. A full checkout of my home directory will include parts from all of these repositories; the svn:externals feature of svn lets me knit them all together into a whole that I can check out or update with a single command.

I've always managed my home directory with an iron hand. Keeping files in revision control has only exacerbated this tendency. Let's look at the top level:

joey@dragon:~> ls
Maildir/  bin/  doc/  html/  lib/  mail/  src/  tmp/

That really is everything, except for 100-plus dot files. Most people use their home directory as a cluttered scratch space for files they're working on. Subversion works better for this than CVS does, because it lets you easily rename and move files and directories. My tightly controlled home directory is partly personal preference and partly a leftover from my days as a CVS user. Keeping a home directory in Subversion does encourage some neatness, as svn will complain about files that it does not know about, so stray files end up either organized or at least out of the way in a temporary directory.
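The complaint about unknown files shows up in svn status, which marks unversioned items with a ? in the first column. A small sketch for listing only those items; the function name is hypothetical.

```shell
# List items in a working copy that svn does not know about.
# $1 = directory to inspect (defaults to the current directory)
list_unversioned() {
    # Keep only lines whose status column is "?", then strip that column.
    svn status "${1:-.}" | sed -n 's/^?[[:space:]]*//p'
}
```

Each path it prints is a candidate to add, to ignore, or to move into a temporary directory.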

Because my home directory is publicly available online, I have to take care to keep private files private. One tricky area is private dot files. These need to be in my home directory, but I can't keep them in the public repository. To manage this, I keep all the private dot files in ~/.hide, which I store in an entirely different, private Subversion repository.

To make things work, my home directory contains symlinks that point to the private dot files in ~/.hide. I have a svnfix program that creates these symlinks, fixes some permissions and other symlinks, and even updates my crontab from svn. I have to remember to run this program from time to time, or put a call to it in my crontab, because there is no way to add a client-side hook in Subversion, or CVS for that matter.
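The symlinking step can be sketched in a few lines of shell. This is not the actual svnfix program, only an illustration of the idea; the function name and its two-argument interface are assumptions.

```shell
# Link every dot file in the hide directory into the home directory.
# $1 = directory holding the private dot files (e.g. ~/.hide)
# $2 = directory to create the links in (e.g. ~)
link_hidden_dotfiles() {
    hide_dir=$1
    home_dir=$2
    for path in "$hide_dir"/.[!.]*; do
        [ -e "$path" ] || continue        # the glob matched nothing
        name=$(basename "$path")
        [ "$name" = ".svn" ] && continue  # skip Subversion metadata
        # -s: symbolic link, -f: replace an existing entry,
        # -n: do not follow an existing link to a directory
        ln -sfn "$path" "$home_dir/$name"
    done
}
```

Note that -f replaces whatever already has that name in the target directory, so a real version would want to check for regular files before overwriting them. A call such as link_hidden_dotfiles ~/.hide ~ could run by hand or from a cron job.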

External Directories

My ~/.hide directory is just one of several Subversion repositories that Subversion's useful svn:externals feature pulls into my home directory. My ~/src subdirectory, which holds various code projects I'm working on, is an even better example, as some of its contents come from repositories shared with others.

joey@dragon:~> ls src 
Words2Nums/   debconf/      filters/     packages/        sleepd/
alien/        debhelper/    flashybrid/  pdmenu/          tasksel/
apt-src/      debian-cd/    kernel/      sarge/           ticker/
base-config/  debian-edu/   misc/        secure-testing/  unreleased/
d-i/          dpkg-repack/  mooix/       skolelinux/      wmbattery/
joey@dragon:~> svn propget svn:externals src
mooix           svn+ssh://svn.mooix.net/home/svn/mooix/trunk
debhelper       svn+ssh://kitenet.net/home/svn/debhelper/trunk
tasksel         svn+ssh://svn.debian.org/svn/tasksel/trunk
d-i             svn+ssh://svn.debian.org/svn/d-i/trunk
base-config     svn+ssh://svn.debian.org/svn/base-config/trunk
debconf         svn+ssh://svn.debian.org/svn/debconf/trunk/src/debconf
secure-testing  svn+ssh://svn.debian.org/svn/secure-testing

After I use the svn propedit command to add external repositories, svn pulls them in and they become subdirectories in my home directory. They mostly behave as if they were part of the same larger repository. This is a great feature, with uses beyond including directories from other repositories. On many of the machines I use, I don't need my entire home directory checkout, so my home directory there is more minimal.

joey@elephant:~> ls
bin/  tmp/

I use this machine for occasional development. I don't fully trust the machine, so I don't want to put private files there. I have a branch of my home directory that (using svn:externals) includes only the basics and is perfectly usable for everything I normally do on that machine. Using svn:externals like this to pull in optional directories keeps the part of my home directory that I have to branch (and merge) small.
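As an alternative to editing the property interactively with svn propedit, externals definitions can be kept in a file and applied with svn propset. A minimal sketch, assuming a definitions file with one "subdirectory URL" pair per line, in the same layout that svn propget prints for src; the function names are hypothetical.

```shell
# Attach externals definitions to a versioned directory and fetch them.
# $1 = versioned directory (e.g. src)
# $2 = file holding one "subdirectory URL" pair per line
set_externals() {
    svn propset svn:externals -F "$2" "$1" &&
    svn update "$1"
}

# Print the definitions currently attached to a directory.
show_externals() {
    svn propget svn:externals "$1"
}
```

The property change is local until it is committed, so other checkouts only see the new externals after an svn commit of that directory.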

New Machines

When I want to check out my home directory to a new account, I run one of these commands:

% svn co svn+ssh://joey@svn.kitenet.net/svn/joey/trunk/home-base .
% svn co svn+ssh://joey@svn.kitenet.net/svn/joey/trunk/home-full .

The first is the minimal version of my home directory; the other is the whole thing. The dot at the end of each command line makes svn check out into the current directory, so running the command from my home directory checks everything out directly into it.

Conclusions

I switched from CVS to svn over a painful couple of months in the winter of 2003. CVS had many misfeatures that made keeping a home directory in it annoying, and I'm glad that I don't have to worry anymore about picking file and directory names (svn can easily rename them; CVS couldn't); that svn can handle binary files well and efficiently (unlike CVS); that svn is quite a bit faster at updating large home directories than CVS is; that managing branches is so much easier with svn that I actually have some branches of my home directory; and that those annoying "CVS" directories that once cluttered up every corner of my home directory have gone away. The transition from CVS to svn would be easier today, as the conversion software has improved, but such a large conversion between revision-control systems is bound to be a slow and painstaking process.

It's interesting to consider that the longevity of my home directory's history has no ties to the useful lifetime of a given revision-control system, or even the lifetime of a given computer platform. Converting repositories of past revision-control systems seems likely to be something new systems will continue to support. If one day I switch to arch, or a distant future relative of arch and svn, I fully expect to take my history with me when I do.

Before I go, I want to thank the hundreds of readers who responded to my original article about keeping my life in CVS. Thanks for your encouragement, for your ideas, and for letting me know I wasn't as crazy as I thought. Yes, as you can see, I finally have switched to Subversion! Now I'm off to commit this file....

Joey Hess has worked for nine years as a Debian developer.



Copyright © 2009 O'Reilly Media, Inc.