

Collaborative Document Editing with svk

by Chia-liang Kao

Say you have a document that needs to be presented in two languages and you are the translator. While the translation is in progress, someone revises the original master document. This means you now might be working with an outdated paragraph or one no longer present in the master version.

This article maps this problem onto parallel development, which version control systems solve with the branch and merge model. You will also see how svk helps you maintain translated documents easily.

The Problem

Translating a document takes time and is seldom finished in a single session. Unfortunately, an original document sometimes undergoes revisions before the translator is finished. A translator needs to track the original document closely to avoid working on outdated paragraphs, or even on paragraphs that have been removed completely. Even after the translation is complete, if many parts of the original document undergo revision, the translator will have to track numerous changes and can easily get lost if the adjustments are not all finished within a single work session.

Do It By Hand

Suppose you want to translate a document into Chinese. You make a copy of the original version, start working on it, and get through the first ten paragraphs. Then you discover that the author made some changes in the original document. Some of them are in the parts that you have translated and some are not. You examine the changes: in the chunks you haven't yet translated, you update by copying and replacing; for those parts you've translated already, you have to read and understand the changes in the original document and adjust your translation accordingly.
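As a rough sketch of the examining step above, the standard diff tool can show what the author changed, assuming you kept a copy of the version you started from. The file names and sample sentences here are hypothetical, chosen only for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical copy of the original as it was when translation started.
printf '%s\n' 'Translating articles is hard work.' '' \
    'You deserve a better tool.' > article-en.old.txt

# Hypothetical revised original received from the author.
printf '%s\n' 'Translating articles is really hard work.' '' \
    'You deserve a better tool.' > article-en.new.txt

# diff exits with status 1 when the files differ, which is not an error here,
# so capture the output instead of checking the exit status.
changes=$(diff -u article-en.old.txt article-en.new.txt)
printf '%s\n' "$changes"
```

Lines starting with - come from the old version and lines starting with + from the new one. Deciding whether each change touches a translated paragraph is still up to you.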

For smaller documents that don't change frequently and take only a few work sessions to finish, this is not a big deal. For larger documents where the translation takes longer, odds are greater that someone will update the original document. Repeating the process is boring and wastes the translator's time. Of course, you should use a tool whenever something is repetitive and boring.

Version Control Systems

Version control systems are essential productivity tools for software development. They're often useful for configuration file management too. What do they have to do with translations?

Most version control systems support parallel development on branches, in which two or more teams can work on things separately, merging them together later. Modern version control systems also support merge tracking across branches. Given two branches, A and B, you can simply give the instruction to merge everything from A to B, without explicitly specifying which changes to merge. That will merge all the changes made to A since the last time you merged it into B.

After merging, if you haven't touched the corresponding parts that have changed in A, the system will update your version. Otherwise, if there are conflicts, it will prompt you to resolve them.

In the context of translation, merging lets B, the translation, catch up with the latest modifications to A, the original. A conflict occurs when someone has updated the original text of a paragraph you have already translated, which tells you exactly where the translation needs updating.
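The merge behavior described above can be tried without any version control system. The following sketch uses diff3 from GNU diffutils as a stand-in for the merge step. It is not how svk performs merges internally, and the file names and the "[zh]" pseudo-translation are assumptions made for illustration.

```shell
# Illustration only: diff3 stands in for a version control system's merge.
work=$(mktemp -d)
cd "$work" || exit 1

# A as it was at the last merge (the text the translator started from).
printf '%s\n' 'Paragraph one.' '' 'Paragraph two.' '' \
    'Paragraph three.' > en-old.txt
# B: translation in progress; only paragraph one is translated so far.
printf '%s\n' '[zh] Paragraph one.' '' 'Paragraph two.' '' \
    'Paragraph three.' > zh.txt
# A revised in a paragraph that is not translated yet.
printf '%s\n' 'Paragraph one.' '' 'Paragraph two.' '' \
    'Paragraph three, revised.' > en-new-a.txt
# A revised in a paragraph that is already translated.
printf '%s\n' 'Paragraph one, revised.' '' 'Paragraph two.' '' \
    'Paragraph three.' > en-new-b.txt

# Case 1: the change lands in untouched text, so B is simply updated.
diff3 -m zh.txt en-old.txt en-new-a.txt > merged-a.txt
status_a=$?

# Case 2: the change lands in translated text, so diff3 reports a conflict.
diff3 -m zh.txt en-old.txt en-new-b.txt > merged-b.txt
status_b=$?

echo "untranslated paragraph changed: exit status $status_a"
cat merged-a.txt
echo "translated paragraph changed: exit status $status_b"
cat merged-b.txt
```

diff3 exits with status 0 for a clean merge and 1 when it finds conflicts. In the second case the merged file contains conflict markers around paragraph one, which is the signal that the translation of that paragraph needs another look.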

svk is a new version control system that makes it easy to maintain translated versions of documents. After all, it's pointless to use software whose overhead costs you more time than it saves.

The rest of this article introduces basic svk usage that will be sufficient for the translation task introduced here. For other features and further information, consult the svk web site.

Installing svk

svk requires Subversion and its Perl bindings, which many Linux and BSD distributions provide as prebuilt packages. After installing them, you can install svk like other CPAN modules:

% perl Makefile.PL
% make all test
# make install

Next, you need to configure the depot, the storage location of files and modifications. The default depot is //. Inside, the depot looks like a normal filesystem, so you will also want to organize things into directories:

% svk depotmap --init
% svk mkdir -m 'this is for articles' //articles

To add items to the depot and modify them, you must first check out //articles to your ordinary filesystem:

% svk checkout //articles
% cd articles

Suppose article-en.txt is the file to translate, whether written by you or downloaded from somewhere else. In both cases, putting the article into the depot for version control allows the translated version to more easily track its base. To add files to the depot:

% svk add article-en.txt
% svk commit -m 'initial version' article-en.txt

When the original author changes the document, simply overwrite article-en.txt with the new version and run svk commit again.

In the example, we will assume the sample text below is now in article-en.txt:

Manageable Document Translation with svk

Translating articles is really hard work, and you deserve a better
tool.  The tool should be able to tell you which part of the original
text has been updated so you can adjust the translation accordingly.

If the file you are to translate is already under version control somewhere else with Subversion, CVS, or Perforce, svk can mirror it and update to the latest version for you automatically. Consult the svk home page for more information.

There are other basic svk commands you might find useful, such as status and log.
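A minimal sketch of how those two commands might be used from inside the checked-out articles directory. The svk_ready helper is a hypothetical guard added here so the snippet does nothing harmful when svk or the article is missing; it is not part of svk.

```shell
# Hypothetical helper: succeeds only when svk is on PATH and the article
# from the example is present in the current directory.
svk_ready() {
  command -v svk >/dev/null 2>&1 && [ -f article-en.txt ]
}

if svk_ready; then
  svk status               # files that are added, modified, or unversioned
  svk log article-en.txt   # commit messages recorded for the original
else
  echo "svk or article-en.txt not found; nothing to show"
fi
```

Running svk status before each commit is a cheap way to confirm that only the files you meant to change will be committed.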

Working on the Translation Branch

Create a branch of article-en.txt by performing a copy:

% svk copy article-en.txt article-zh.txt
A + article-zh.txt

This will create the file article-zh.txt. svk knows it is related to the current version of article-en.txt. You can now work on the translation in the new file. Commit changes often to snapshot your work in progress:

% svk commit -m 'translate first paragraph' article-zh.txt

Larger projects such as books will involve many files instead of just one. In that case, organize them in a directory and copy the whole directory for branching.
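For a multi-file project, the branch might be created as sketched below. The directory names book-en and book-zh and the branch_book function are hypothetical, and the sketch assumes svk copy accepts a directory in the checkout the same way it accepts the single file shown earlier.

```shell
# Hypothetical helper: branch a whole directory of originals for translation.
# book-en holds the originals; book-zh becomes the translation branch.
branch_book() {
  svk copy book-en book-zh &&
  svk commit -m 'branch book-en for translation' book-zh
}

# Only run when svk is installed and the source directory exists.
if command -v svk >/dev/null 2>&1 && [ -d book-en ]; then
  branch_book
else
  echo "svk or book-en not found; skipping the branch step"
fi
```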
