Python DevCenter


Python News BitTorrent Style

by Stephen Figgins

Where bandwidth is a bottleneck, spreading the load of downloading a file can speed up effective transfer rates. With Bram Cohen's BitTorrent, every person downloading a file contributes a bit of bandwidth by exchanging portions of the file they want with other downloaders. As you download one portion, you upload another. This creates a simple tit-for-tat accounting system: for each little bit you provide, you get a little.
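To make the tit-for-tat idea concrete, here is a toy sketch of that kind of accounting. It is not BitTorrent's actual algorithm; the `Ledger` class, its method names, and the peer names are all hypothetical, and it assumes a peer simply favors whoever has sent it the most data so far.

```python
from collections import defaultdict


class Ledger:
    """Toy tit-for-tat bookkeeping (hypothetical, not BitTorrent's code)."""

    def __init__(self):
        # peer name -> total bytes that peer has sent us
        self.received = defaultdict(int)

    def record_received(self, peer, nbytes):
        """Note that a peer gave us some data."""
        self.received[peer] += nbytes

    def peers_to_serve(self, slots):
        """Pick the peers who have given us the most, up to `slots` of them.

        Ties are broken by name so the result is deterministic.
        """
        ranked = sorted(self.received, key=lambda p: (-self.received[p], p))
        return ranked[:slots]


ledger = Ledger()
ledger.record_received("alice", 300)
ledger.record_received("bob", 100)
ledger.record_received("carol", 200)
print(ledger.peers_to_serve(2))
```

With two upload slots, the sketch serves the two peers who contributed the most, which is the "give a little, get a little" bargain in miniature.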

From the end user's perspective it's not very different from any other download. You click a link to a torrent file, and your browser passes it to the BitTorrent program, which takes it from there. The simple user interface reports your progress and how fast you are both downloading and uploading information. "That's the goal: to make it just work," says Bram Cohen. "There is a lot of technical magic going on under the hood, but as an end user experience, I'm a firm believer that the interface should be that 'it just works.'"

That technical magic is implemented in Python, and it's open source. I love learning how a good magician does their tricks. Unfortunately, while it's written in very clear Python, the magic isn't easy to grasp. BitTorrent is composed of many heavily encapsulated pieces. Cohen explains, "when it starts up [BitTorrent] instantiates them and hooks them up to each other. As a result, when looking at the BitTorrent code base there is no particularly coherent concept of 'start here.'" Cohen says the key to understanding BitTorrent is to first understand what the code is doing, that is, to understand the high-level concept. He suggests reading the paper, Incentives Build Robustness in BitTorrent (PDF), which he presented at the P2P and Economics conference.

Aside from understanding the magic, you might find Cohen's style itself worthy of study. Cohen emphasizes his use of encapsulation. "It allows you to change one module without having to change all the other modules. With well thought out encapsulation boundaries I can leave the interface the same, while the code behind it is thoroughly rewritten." This and other principles of Cohen's style are described in an essay he wrote for Advogato, How To Write Maintainable Code.
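A small sketch may help show what such an encapsulation boundary buys you. The two classes and the `missing_pieces` helper below are hypothetical illustrations, not taken from BitTorrent's source: both classes expose the same three methods, so the calling code keeps working when one implementation is swapped for the other.

```python
class ListPieceStore:
    """First implementation: remembers piece indices in a plain list."""

    def __init__(self):
        self._pieces = []

    def add(self, index):
        if index not in self._pieces:
            self._pieces.append(index)

    def has(self, index):
        return index in self._pieces

    def count(self):
        return len(self._pieces)


class BitfieldPieceStore:
    """Rewritten implementation: same interface, backed by an integer bitfield."""

    def __init__(self):
        self._bits = 0

    def add(self, index):
        self._bits |= 1 << index

    def has(self, index):
        return bool((self._bits >> index) & 1)

    def count(self):
        return bin(self._bits).count("1")


def missing_pieces(store, total):
    """Caller code: depends only on the has() method, not on the internals."""
    return [i for i in range(total) if not store.has(i)]


for store in (ListPieceStore(), BitfieldPieceStore()):
    store.add(0)
    store.add(2)
    store.add(2)  # adding the same piece twice changes nothing
    print(type(store).__name__, missing_pieces(store, 4), store.count())
```

Because `missing_pieces` never touches `_pieces` or `_bits`, the code behind the interface can be rewritten freely, which is the property Cohen describes.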

While Cohen concedes that everyone promotes modularization in software design, he finds that few people actually practice it. In contrast, he says he uses it to an extreme degree. He says, "[It] enables me to write lots of unit tests. As a result of which there have been very few bugs [..] despite my not doing system testing."

A less common approach, one that makes BitTorrent harder to grasp but also worthy of study, is Cohen's use of idempotence. A process is idempotent when applying it more than once causes no further changes. Cohen says he uses a design pattern he calls "Fix Everything," a function that can react to a number of changes without tracking exactly what it might change. He explains, "you note the event that happened, then call the fix everything function which is written in this very idempotent manner, and just cleans up whatever might be going on and recalculates it all from scratch." While idempotence makes some difficult calculations easier, it makes things a little convoluted. It's not always clear what a call is going to change, if anything. You don't need to know in advance. You are free to call the function, just to be on the safe side.
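The pattern is easier to see in a small example. What follows is a minimal sketch of the "Fix Everything" idea as Cohen describes it, not code from BitTorrent itself; the `Downloader` class, its attributes, and the rule for choosing requests are all assumptions made for illustration. Each event handler only records what happened, then calls one function that rebuilds the derived state from scratch.

```python
class Downloader:
    """Hypothetical illustration of the pattern; not BitTorrent's classes."""

    def __init__(self, max_active=2):
        self.max_active = max_active  # how many peers we request from at once
        self.peers = {}               # peer name -> set of piece indices it has
        self.have = set()             # piece indices we already hold
        self.requests = {}            # derived state: peer name -> piece index

    # Event handlers: note the event, then let fix_everything() sort it out.
    def peer_joined(self, name, pieces):
        self.peers[name] = set(pieces)
        self.fix_everything()

    def peer_left(self, name):
        self.peers.pop(name, None)
        self.fix_everything()

    def piece_arrived(self, index):
        self.have.add(index)
        self.fix_everything()

    def fix_everything(self):
        """Recompute the request table from the current state alone.

        Nothing here depends on which event just happened, so calling it
        again without a state change produces the same table (idempotence).
        """
        requests = {}
        claimed = set()
        for name in sorted(self.peers):
            if len(requests) >= self.max_active:
                break
            wanted = sorted(self.peers[name] - self.have - claimed)
            if wanted:
                requests[name] = wanted[0]
                claimed.add(wanted[0])
        self.requests = requests


d = Downloader()
d.peer_joined("a", [0, 1])
d.peer_joined("b", [1, 2])
d.piece_arrived(0)
d.fix_everything()  # an extra call "just to be on the safe side" is harmless
print(d.requests)
```

The cost is the one the article notes: reading `piece_arrived` alone does not tell you what will change. The benefit is that no handler can leave the request table in a stale state.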

Cohen says his work is a bit unorthodox. "The BitTorrent code base is in direct violation of many self-appointed gurus' notions of how software should be written, but there is method to the madness." His departures from convention are all aimed at making the code easy to maintain. "Once code is debugged and performs well, then the only real measure of how good it is is how maintainable the code base is."

Though every bit as novel, the protocol used by BitTorrent is less confusing than its implementation. It's now sufficiently well defined that other programmers are beginning to implement BitTorrent peers in their own languages. There are efforts underway to produce Java, Perl, and C++ implementations of BitTorrent peers, and there is an effort to create an extensible C implementation of the BitTorrent protocol, libtorrent.

Cohen's plan for BitTorrent is to enjoy the maintainability of his code. He intends to refactor important modules to make the canonical Python implementation of BitTorrent faster and to enhance the interface to provide additional statistics.

Stephen Figgins administrates Linux servers for Sunflower Broadband, a cable company.
