ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


Converting from CVS to Subversion with cvs2svn

by Brian W. Fitzpatrick, author of Version Control with Subversion
10/03/2005

For some people, the conversion from CVS to Subversion is as simple as exporting their CVS repository and importing their data into a new Subversion repository. But if you're a digital packrat like me, you're going to want to take every last byte from your CVS repository when you move to Subversion. Thanks to cvs2svn, you can easily migrate all of your historical data out of your CVS repository. This article will walk you through the technical process of converting your CVS repository to Subversion--from deciding how much data to take with you, to prepping your data, to reviewing the most common options that you'll use in your conversion.

Prepping Your CVS Repository for Conversion

Before you start converting, you may need to do a little housekeeping on your CVS repository. First and foremost, make a copy of your CVS repository and work only with the copy--I can't stress this enough. A lot of the cleanup work we're going to do here can be done after you've converted, but I prefer to do the work before converting as it makes for a "cleaner" Subversion repository.
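
As a minimal sketch of that first step, the following copies a repository and checks that the copy matches the original. The paths are hypothetical, and a throwaway directory stands in for a real CVS repository so that the commands can be run anywhere:

```shell
#!/bin/sh
# Hypothetical layout: a scratch directory stands in for a real CVS repository.
work=$(mktemp -d)
mkdir -p "$work/cvsrepos/CVSROOT" "$work/cvsrepos/project"
printf 'head 1.1;\n' > "$work/cvsrepos/project/README,v"

# Copy recursively, preserving modes and timestamps (-p), so that the
# executable bits on the ,v files survive in the copy.
cp -Rp "$work/cvsrepos" "$work/cvsrepos-copy"

# Sanity check: the copy should not differ from the original.
if diff -r "$work/cvsrepos" "$work/cvsrepos-copy" >/dev/null; then
    copy_status="identical"
else
    copy_status="different"
fi
echo "copy is $copy_status"
```

From here on, every cleanup step and the conversion itself would point at the copy rather than the original.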

First, arrange your CVS repository the way that you want it to be laid out in your Subversion repository. You may want to move some projects around or even delete some old cruft entirely. Remember that all of your projects will be placed under the trunk directory in your Subversion repository (there's a way to give each project its own trunk/tags/branches directory, but that's beyond the scope of this article).

Now make certain that the executable bit is set on any files that should be executable (so that cvs2svn will set the svn:executable property on those files). Verify that binary (non-text) files have the -kb flag set. By default, cvs2svn will enable end-of-line (EOL) translation and keyword expansion on CVS files that do not have -kb set, so unless you're disabling EOL translation and automatic keyword detection (more on those later), you're going to want to get this in order.
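
Both checks can be roughed out with find and grep. This is only a sketch under a few assumptions: it relies on the RCS file format recording -kb as an "expand @b@;" line near the top of each ,v file, it guesses at binary files purely by name, and the demo files below are hypothetical stand-ins for a copy of your repository:

```shell
#!/bin/sh
# Demo tree (hypothetical file names) standing in for a copy of a CVS repository.
repo=$(mktemp -d)
mkdir -p "$repo/project"
printf 'head\t1.1;\naccess;\nsymbols;\nlocks; strict;\ncomment\t@# @;\nexpand\t@b@;\n' > "$repo/project/logo.png,v"
printf 'head\t1.1;\naccess;\nsymbols;\nlocks; strict;\ncomment\t@# @;\n' > "$repo/project/photo.png,v"
printf 'head\t1.1;\naccess;\nsymbols;\nlocks; strict;\ncomment\t@# @;\n' > "$repo/project/build.sh,v"
chmod a-x "$repo"/project/*,v
chmod u+x "$repo/project/build.sh,v"

# List the RCS files whose owner-executable bit is set.
list_executable() {
    find "$1" -type f -name '*,v' -perm -u+x | sort
}

# List the RCS files that match a name pattern (for example '*.png,v') but
# have no "expand @b@;" line, so they were probably never marked with -kb.
list_missing_kb() {
    find "$1" -type f -name "$2" -exec grep -L '^expand[[:space:]]*@b@;' {} + | sort
}

echo "executable:"
list_executable "$repo"
echo "binary by name, but no -kb:"
list_missing_kb "$repo" '*.png,v'
```

Anything in the second list is a candidate for cvs admin -kb (run from a working copy) before you convert.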

Deciding How Much Data to Take with You

If you're reading this article, I'm assuming that you've already decided that you want to convert at least some of your historical data from CVS to Subversion. With cvs2svn, you can convert anything from just your main line of development (i.e., no tags or branches) to every revision in every line of development, as well as all of the tags.

If you're a minimalist who wants historical data but doesn't necessarily care about all the tags and branches in your CVS repository, you can use the --trunk-only switch. This saves disk space in your Subversion repository and results in a much faster conversion, but at the expense of losing some of your historical data.

If you want some of your tags and branches converted, but not all, you can use the --exclude switch to instruct cvs2svn to exclude the tags and branches that you don't want. --exclude takes a regular expression, so if, for example, you have hundreds of uniform build tags that you don't want to convert, you can exclude them all with a simple regular expression.
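
Before committing to a pattern, it can help to preview what it would catch. The sketch below filters a hypothetical list of symbol names with grep. It assumes that cvs2svn requires the expression to match the entire symbol name, which grep -x imitates; grep's regular expression dialect is not identical to the one cvs2svn uses, so treat the result as a rough preview only:

```shell
#!/bin/sh
# Print the names read from stdin that the pattern matches in full.
# Assumption: cvs2svn matches the pattern against the whole symbol name.
preview_exclude() {
    grep -E -x -e "$1"
}

# Hypothetical tag and branch names.
symbols='BUILD_0001
BUILD_0002
RELEASE_1_0
stable-branch'

matched=$(printf '%s\n' "$symbols" | preview_exclude 'BUILD_[0-9]+')
printf 'would exclude:\n%s\n' "$matched"

# The corresponding conversion (not run here) would be along the lines of:
#   cvs2svn --exclude='BUILD_[0-9]+' -s NEW_SVNREPOS CVSREPOS
```

With these names, only the two BUILD tags are selected; the release tag and the branch would still be converted.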

Of course, a full conversion requires no special options to cvs2svn--the default behavior is to convert trunk, tags, and all branches from CVS.

How cvs2svn Works

cvs2svn makes a series of eight separate passes over your CVS metadata, sorting and gathering disjointed sets of CVS revision groups into Subversion commits. The first pass grabs all the revision metadata from the repository, and the last pass pulls the data out of the RCS files in the CVS repository and loads it into a Subversion repository (or, if you pass --dump-only, into a Subversion dumpfile). The middle passes do . . . well, they mostly do magic.

One other thing to note is that cvs2svn will create a lot of large temporary files in your --tmpdir, so make sure that you have lots of space. The default value for --tmpdir is the current working directory.
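
A quick way to find out whether a candidate directory has room is to ask df. In this sketch the threshold and the choice of directory are hypothetical, so substitute values that suit the size of your repository:

```shell
#!/bin/sh
# Report the free space, in kilobytes, on the filesystem holding a directory.
# df -P prints one line per filesystem; column 4 is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical threshold and location.
need_kb=1024
tmpdir=${TMPDIR:-/tmp}

avail_kb=$(free_kb "$tmpdir")
if [ "$avail_kb" -ge "$need_kb" ]; then
    echo "$tmpdir has ${avail_kb} KB free"
    # Not run here:
    #   cvs2svn --tmpdir="$tmpdir" -s NEW_SVNREPOS CVSREPOS
else
    echo "$tmpdir has only ${avail_kb} KB free" >&2
fi
```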

The output of cvs2svn is somewhat verbose, but believe it or not, you can get even more verbose output by using the -v option for what I like to call "pontifical" output. In the following example I use the -q option merely to save space. This is what the output of a successful cvs2svn run looks like:

$ ./cvs2svn -q --dump-only main-cvsrepos
----- pass 1 -----
Examining all CVS ',v' files...
Done
----- pass 2 -----
Checking for blocked exclusions...
Checking for forced tags with commits...
Checking for tag/branch mismatches...
Re-synchronizing CVS revision timestamps...
Done
----- pass 3 -----
Sorting CVS revisions...
Done
----- pass 4 -----
Copying CVS revision data from flat file to database...
Finding last CVS revisions for all symbolic names...
Done
----- pass 5 -----
Mapping CVS revisions to Subversion commits...
Done
----- pass 6 -----
Sorting symbolic name source revisions...
Done
----- pass 7 -----
Determining offsets for all symbolic names...
Done.
----- pass 8 -----
Starting Subversion Dumpfile.
Done.

cvs2svn Statistics:
------------------
Total CVS Files:                29
Total CVS Revisions:            99
Total Unique Tags:               5
Total Unique Branches:           6
CVS Repos Size in KB:           23
Total SVN Commits:              50
First Revision Date:    Fri Jun 18 00:46:07 1993
Last Revision Date:     Tue Jun 10 15:19:48 2003
------------------
Timings:
------------------
pass 1:     0 seconds
pass 2:     0 seconds
pass 3:     0 seconds
pass 4:     0 seconds
pass 5:     0 seconds
pass 6:     0 seconds
pass 7:     0 seconds
pass 8:     4 seconds
total:      6 seconds

cvs2svn tells you what step it's currently working on, and once it's done, it gives you some interesting statistics about your CVS repository, the resulting Subversion repository, and the time it took to do the conversion.

I'd discuss installation, but it's easiest to just download cvs2svn and run it out of the directory in which you unpacked it. You'll also need GNU sort, plus either the RCS 'co' utility or CVS itself (cvs2svn defaults to 'co' since it's dramatically faster than CVS for extracting data).
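
If you're not sure what's installed, a short check along these lines will tell you. It is only a sketch: it assumes that GNU sort identifies itself in its --version output, and it looks only at whatever comes first on your PATH:

```shell
#!/bin/sh
# Succeed if the named command is on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Succeed if the first "sort" on PATH identifies itself as GNU sort.
is_gnu_sort() {
    sort --version 2>/dev/null | grep -q 'GNU'
}

if is_gnu_sort; then echo "GNU sort: found"; else echo "GNU sort: not found"; fi
if have co; then echo "RCS co: found"; else echo "RCS co: not found"; fi
if have cvs; then echo "cvs: found"; else echo "cvs: not found"; fi

# Only one of the two extraction tools is needed.
if ! have co && ! have cvs; then
    echo "need either RCS co or cvs to extract file contents" >&2
fi
```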

A Few Examples

To create a new Subversion repository by converting an existing CVS repository, run the script like this:

$ cvs2svn -s NEW_SVNREPOS CVSREPOS

To create a new Subversion repository containing only trunk commits, omitting all branches and tags from the CVS repository, run it like this:

$ cvs2svn --trunk-only -s NEW_SVNREPOS CVSREPOS

To create a Subversion dumpfile (suitable for svnadmin load) from a CVS repository, run it like this:

$ cvs2svn --dump-only --dumpfile DUMPFILE CVSREPOS
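
Loading the resulting dumpfile is then a job for svnadmin. Here is a minimal sketch of that step; load_dump is a hypothetical helper name, and the function refuses to do anything unless the dumpfile is readable and svnadmin is installed:

```shell
#!/bin/sh
# Create a new repository and load a dumpfile into it.
# Usage: load_dump NEW_SVNREPOS DUMPFILE
load_dump() {
    if [ ! -r "$2" ]; then
        echo "cannot read dumpfile: $2" >&2
        return 1
    fi
    if ! command -v svnadmin >/dev/null 2>&1; then
        echo "svnadmin not found on PATH" >&2
        return 1
    fi
    # svnadmin load reads the dump stream from standard input.
    svnadmin create "$1" && svnadmin load -q "$1" < "$2"
}

# Example call (not run here):
#   load_dump NEW_SVNREPOS DUMPFILE
```

Going through a dumpfile costs an extra step, but it lets you inspect or keep the intermediate result before anything is loaded.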

As it works, cvs2svn will create many temporary files in the current directory. This is normal. If the entire conversion is successful, however, those tempfiles will be automatically removed. If the conversion is not successful, or if you specify the --skip-cleanup option, cvs2svn will leave the temporary files behind for possible debugging.

If you need further guidance, first read the online documentation and the FAQ, and if you still need some help, drop by #cvs2svn on irc.freenode.net and you'll usually find a few cvs2svn developers to help you on your way. Good luck!

Author's note: Thanks to C. Michael Pilato and Michael Brouwer for reading drafts of this article.

Brian W. Fitzpatrick is a member of the Apache Software Foundation and currently works for Google. He has been involved with Subversion in one way or another since its inception in early 2000.


Copyright © 2009 O'Reilly Media, Inc.