When a project hits a certain point in its life cycle, the unpleasant issue of backward compatibility begins to rear its ugly head. All of a sudden the changes introduced in a new release of the software have a dark side to them; they hold hidden possibilities that will break something one of your users depends on. This is true in both open and closed source projects, but in the open source world it seems that the community has spent less time worrying about it than in the closed source world.
That is understandable; the closed source world has paying customers who complain loudly when they upgrade to a new version of a piece of software and something breaks, but the same issues occur in open source software as well. For the world to take open source programs seriously, you must deal with the issue of backward compatibility.
In order to better prepare you, the average open source hacker, for dealing with this problem, I'd like to share some of the experiences we've had with backward compatibility in the Subversion project. With luck, you'll be able to apply some of the lessons we've learned to your own projects. Everyone--developers, redistributors of your project, and, most of all, your users--will benefit in the long run.
Before talking about backward compatibility, I want to explain the context of the examples, because without the context surrounding the project you can't make any real decisions about backward compatibility. What might be appropriate for one project can be totally wrong for another. This article draws examples from the Subversion project, a version control system designed to replace CVS, the current de facto standard in open source version control systems.
Subversion provides a client-server architecture that uses several separate network protocols that provide access to the Subversion server. The core Subversion libraries are written in C, with a command-line client providing the primary user interface. Additionally, bindings are provided that let you make use of the C libraries in various other languages. In several cases, non-Subversion clients speak the same protocol Subversion does, so there is a need to provide backward compatibility with users of the Subversion libraries and with programs that simply implement the protocols themselves.
Subversion has declared itself to be at a 1.0 level of quality and has a large user community. The project is now in a position where any break in backward compatibility will likely bring with it significant pain for users. Compounding this, a popular use of Subversion is to store source code, the most precious asset a software developer has, so anything that makes it impossible for users to access that data is an even more critical problem than it would be in an average project.
Backward compatibility is really a series of promises to your users as to what they can expect when they upgrade to a new version of your software. Those promises often fall into three categories.
First is the promise that users can move both forward and backward in versions of a software package without incompatibilities. This is the strongest kind of promise and is very hard to maintain. As a result, many projects use it only for small upgrades. Subversion provides this kind of compatibility promise for patch releases: for example, the change between version 1.0.0 and 1.0.1. That means I can install version 1.0.0 of Subversion, use it for a while, upgrade to version 1.0.1, use it for a while, and then revert to 1.0.0, all with no ill effects.
Next is the promise that users can move forward to new versions of the software, though they may not be able to move backward again once they do. Subversion makes this promise for minor version changes, so people can move from any of the 1.0.x series of releases to any of the 1.1.x series of releases without any problems. However, they cannot necessarily move back. It requires some discipline to provide this kind of backward compatibility, but it's infinitely easier than the previous kind.
Third is the case in which there is absolutely no promise whatsoever. For obvious reasons, you want to avoid this kind of situation, because it means that the process of moving to a new version of the software is harder. Any difficulty in moving to a new version makes it more likely that people will just continue to use the old version. There's little point in releasing a new version of your project if nobody will move to it. Subversion reserves this kind of promise (or lack of promise, really) for major version number changes. For example, the move from Subversion version 1.x.x to 2.x.x will most likely have no promise of backward compatibility between the two versions.
Once you've thought about the level of compatibility you want to promise to your users, the next step is to think about the actual places your project needs to worry about specific kinds of compatibility problems.
The first kind of compatibility most people think about is API compatibility. This means that subsequent versions of a library provide the same API that previous versions do, so programs written against the previous version will still be able to compile and run with the new version. In addition to actually leaving the same functions around, this also implies that those functions all do the same thing in the newer version that they did in the older ones. Of course, there is some flexibility here, as at some point you have to be able to say, "This behavior is a bug, so we changed it"; otherwise, there is little point in releasing new versions of your software.
Second is Application Binary Interface, or ABI, compatibility. This means that backward compatibility is preserved at the level of the binary object code produced when you compile the library. There is usually some overlap between API and ABI compatibility, but there are important differences. To maintain ABI compatibility, all you have to do is ensure that your program exports all of the same symbols. This means all the same functions and globally accessible objects need to be there, so that programs linked against the prior version will still be able to run with the new version. It's possible to maintain ABI compatibility while breaking API compatibility. In C code, for example, you can leave the symbols defined in the C files but remove their declarations from the public headers, so new code that tries to access the symbols will fail to compile, while old code that users compiled against the previous version will continue to run.
If your program communicates over a network, it has to deal with a third form of compatibility, client-server protocol compatibility. This means that a client using the version of the network protocol provided in the older releases will continue to function when faced with a newer server, and that newer client programs will continue to work with an older server. Of course, in some cases this is impossible. A new feature might require support on both the client and server side of the network. In general, though, existing functionality should continue to work with various combinations of client and server code.
Finally, if your program stores data somewhere, be it in a database or in files on disk or wherever, there is data format compatibility. Newer versions of the code need to be able to work with data files written out by older versions, and vice versa. Ideally you should also be able to build some forward compatibility into data formats. If your file-handling routines can ignore and preserve unrecognized fields, then new functionality can modify data formats in ways that do not break older versions. This is one of the most critical kinds of compatibility, simply because users become very upset when they install a new version of a program and suddenly cannot access their old data.
If your project provides code libraries that other programs make use of, you will almost certainly have to worry about API compatibility. If you work in a language that produces object code other programs link against, you will either have to worry about ABI compatibility or require your users to recompile their code for every upgrade. If you make use of a network, you probably need to consider client-server protocol compatibility; and if you store any data at all, you will have to worry about data format compatibility. A nontrivial project like Subversion has to worry about all four of these issues.
With the various types of compatibility in mind, it's now time to delve into the actual techniques for maintaining them. This is not an exhaustive list by any means. Not every technique mentioned may apply to your particular project, but they're all worth learning about.
The cardinal rule of maintaining API and ABI compatibility is that you must never remove anything from your interface. As soon as you expose a new function or data structure to your users in a public release, you have committed yourself to supporting it until your compatibility rules allow you to change or remove it, probably in your next major revision.
That means that if you want to make modifications to an API, you need to retain the existing version and add a new one alongside it. The old version remains intact, both in the public interface (such as in the header files for a C or C++ program) to ensure source-level API compatibility and in the binary object files to ensure link-level ABI compatibility.
Subversion has used this technique several times, specifically when we've needed
to add new arguments to existing functions. For example, in Subversion 1.1.0
the svn export command gained the --native-eol
argument, which allows you to specify your platform-specific end-of-line
character sequence. This allows you to simulate the effect of exporting a
project on a platform that has a different end-of-line sequence than the
platform you are running Subversion on. Under the hood, this required creating
a new svn_client_export2 function, which is identical to the
previously existing svn_client_export function but with an
additional argument for specifying the native end-of-line style. The existing
svn_client_export function continues to exist but now simply
calls svn_client_export2 with a NULL end-of-line
style. This allows the project to add new functionality while still enabling
third-party code that makes use of our APIs to continue to compile and run
without changes.
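The pattern can be sketched in a few lines of C. The names demo_export and demo_export2, their simplified signatures, and the status codes are hypothetical stand-ins for illustration; they are not Subversion's actual declarations.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum { DEMO_OK = 0, DEMO_BAD_EOL = 1 };

/* New entry point: same job as demo_export, plus an end-of-line style.
   A NULL native_eol means "use the platform default". */
int demo_export2(const char *from, const char *to, const char *native_eol)
{
    (void)from;
    (void)to;
    if (native_eol != NULL
        && strcmp(native_eol, "LF") != 0
        && strcmp(native_eol, "CR") != 0
        && strcmp(native_eol, "CRLF") != 0)
        return DEMO_BAD_EOL;
    /* ... the real export work would happen here ... */
    return DEMO_OK;
}

/* Old entry point: kept in the header and in the binary so existing
   callers still compile and link. It simply forwards to the new one. */
int demo_export(const char *from, const char *to)
{
    return demo_export2(from, to, NULL);
}
```

Code written against the old function keeps calling demo_export(from, to) unchanged, while new code can call demo_export2(from, to, "CRLF").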
Perhaps the simplest way to avoid problems when making changes to an interface is to not expose that interface to the users. This is difficult to do when you're talking about a function, because to make use of the API the caller really needs to know its name and signature. When you're talking about data structures, it often becomes a viable option.
In a C program, exposing the definition of a structure to your client means much the same thing as exposing the declaration of a function, but with one additional trick. A structure's definition includes not only its name and its contents, but also its size. If you want to maintain ABI compatibility, you cannot add or remove fields from the structure because a client might depend on that size.
For example, if you define a struct that contains two integers and put that definition in a public header file, there's nothing to keep your clients from declaring an instance of that struct on the stack in their own code. Suppose that you later add a third integer to the structure and a client drops in a new version of your library without recompiling the code. It's likely that the client's code will pass a pointer to the structure it declared on the stack into your code, and then you're in trouble. Your code was compiled with the knowledge that the structure is three integers large, but the client allocated only two integers' worth of space for it. Any of your code that tries to access the third integer in the structure will read or write whatever memory happens to follow the client's structure, which can only lead to problems.
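To make the size mismatch concrete, here is a minimal sketch that places the two versions of such a structure side by side. The type and function names are hypothetical; in a real project both versions would share one name and live in different releases of the same header.

```c
#include <stddef.h>

/* Version 1 of a hypothetical public structure: two integers. */
struct point_v1 { int x; int y; };

/* Version 2, as a newer library release would define it. */
struct point_v2 { int x; int y; int z; };

/* Bytes the new library may touch beyond what an old client allocated. */
size_t bytes_past_old_allocation(void)
{
    return sizeof(struct point_v2) - sizeof(struct point_v1);
}

/* Offset at which the new library expects to find the added field. */
size_t offset_of_new_field(void)
{
    return offsetof(struct point_v2, z);
}
```

On typical platforms the new field starts exactly where the old structure ends, so any access to it through a pointer to a client-allocated version 1 instance lands outside that instance.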
Other programming languages often have more elegant ways to deal with these issues. For example, Java and C++ support various means for restricting access to the internals of an object, and many languages require that all access to objects go through pointers, so the size issue is less of a problem. In raw C you really have to deal with it yourself, though, as the language provides little help.
You have a few ways to avoid these kinds of problems in C. First, you can simply not give the client access to the definition of the structure. Instead, forward-declare the structure so that the client can pass around pointers to it, leaving the actual definition inside your library. All access to the structure must go through functions that you define in your library, allowing you to change them should the internals of the structure ever change. This is the opaque pointer technique, and it's probably the best way to solve these sorts of issues.
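A minimal sketch of the opaque pointer technique follows. It is shown as one file for brevity; the comments mark which part would live in a public header and which part would stay private. The counter_t type and its functions are hypothetical names invented for this example.

```c
#include <stdlib.h>

/* ---- What would go in the public header (hypothetical counter.h) ---- */

/* Forward declaration only: clients see the name, not the layout or size. */
typedef struct counter_t counter_t;

counter_t *counter_create(void);
void counter_increment(counter_t *c);
int counter_value(const counter_t *c);
void counter_destroy(counter_t *c);

/* ---- What would stay private in the library (hypothetical counter.c) ---- */

struct counter_t {
    int value;
    /* Fields can be added here later without affecting compiled clients. */
};

counter_t *counter_create(void)
{
    /* The library allocates, so it always knows the current size. */
    return calloc(1, sizeof(counter_t));
}

void counter_increment(counter_t *c)
{
    c->value++;
}

int counter_value(const counter_t *c)
{
    return c->value;
}

void counter_destroy(counter_t *c)
{
    free(c);
}
```

Because client code only ever sees the forward declaration, it cannot declare a counter_t on the stack or reach into its fields; it has to go through counter_create and the accessor functions.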
Within the Subversion libraries, several places make use of opaque pointers.
One of the most prolific is the working copy library's access baton object,
svn_wc_adm_access_t. The access baton's forward declaration is in
svn_wc.h. The client creates them by way of the svn_wc_adm_open
function (or, in versions of Subversion later than 1.0.x, the
svn_wc_adm_open2 function, another example of providing a new
function in a backward-compatible manner). All the public functions in
libsvn_wc simply accept opaque pointers to the structure.
Internally, the only place that actually has access to the definition of the
structure is the file subversion/libsvn_wc/lock.c. Note that the same
technique that keeps client code from poking around in the internals of the
access baton also keeps the rest of the Subversion libraries from doing so,
thus making it easier to modify the structure without modifying other parts of
libsvn_wc.
If you absolutely need to leave the definition of a structure in public header files, but you still want to preserve the ability to change the structure later, there is a way to do it. Before doing so, keep in mind that it's much more fragile than just using an opaque pointer in the first place, mainly because it requires that your clients "do the right thing" without using any technical means to enforce that they do so. The trick is simply to provide a function that allocates the structure for you and to document that the only valid way to obtain access to an instance of the structure is to use that function. Because only your library will ever allocate the structure, you can be sure that there will always be enough memory to hold the entire thing, even if you increase its size later.
This technique has a few more gotchas. First, you can add only new fields to the structure, and you must add them to the end of the structure because clients compiled using previous versions of the definition will assume the offsets of old fields within the structure are the same as they were in previous versions. Second, when adding fields to the structure you must ensure that code using the fields can deal with the new fields' not being initialized; otherwise, providing compatibility for old clients that know nothing about the fields is futile. Finally, as I already explained, there's nothing that enforces the use of the constructor functions, so it's still possible for clients to shoot themselves in the foot with this technique. If that bothers you, an opaque pointer is almost certainly a better solution.
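The rules above might look like this in practice. This is only a sketch: demo_context_t, its fields, and the default timeout of 30 are hypothetical and are not taken from Subversion's client context.

```c
#include <stdlib.h>

/* Public structure (hypothetical). New fields may only be appended. */
typedef struct demo_context_t {
    int verbose;           /* present since the first release */
    const char *username;  /* present since the first release */
    int timeout_seconds;   /* appended later; 0 means "not set" */
} demo_context_t;

/* The only documented way to obtain a demo_context_t. Because the
   library allocates and zero-fills it, fields added in later releases
   always exist and start out in a recognizable "unset" state. */
demo_context_t *demo_context_create(void)
{
    return calloc(1, sizeof(demo_context_t));
}

/* Library code treats the zero value of a newer field as "use the
   default", so old clients that never set it keep working. */
int demo_context_timeout(const demo_context_t *ctx)
{
    return ctx->timeout_seconds > 0 ? ctx->timeout_seconds : 30;
}
```

Nothing here stops a client from writing "demo_context_t ctx;" on its own stack, which is exactly the weakness described above.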
For an example of how to provide a constructor function for a publicly
defined structure, see the svn_client_create_context function in
svn_client.h. Please be cautious with this technique, though, as it
really is dangerous to rely on your users to do the right thing. In retrospect
(and I can say this because I wrote the client context code in Subversion and
made it a publicly defined structure in the first place), I think going with an
opaque pointer would have been worth the extra effort of creating the necessary
accessor functions. That's because I have encountered users of this code who jump right past the documentation that states that you must use
svn_client_create_context and allocate their own instance on the
stack, defeating the purpose of the constructor function and setting themselves
up for pain if and when we finally add new elements to the client context
structure.
The best thing you can do to ensure that you maintain protocol and data format compatibility is to plan ahead by designing your protocols and data formats so that you can add things to them in the future without disturbing prior versions of the code. This means that you need to be able to add new elements to your files or data streams, that old code must be able to ignore elements it doesn't understand, and that new code must be able to deal with the absence of the new elements.
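As a rough illustration of both halves of that rule, the sketch below parses a made-up "key=value" line format. The format, the demo_record_t type, and the field names are assumptions chosen for brevity; Subversion's own formats are different.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    char name[64];
    int revision;
    int limit;  /* added in a later version; 0 means "no limit" */
} demo_record_t;

/* Parse newline-separated "key=value" lines into *out. */
void demo_parse(const char *text, demo_record_t *out)
{
    char line[128];

    /* Start from defaults so that absent fields have a defined value. */
    memset(out, 0, sizeof(*out));

    while (*text != '\0') {
        /* Copy one line (truncated if too long) into a local buffer. */
        size_t len = strcspn(text, "\n");
        size_t copy = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
        memcpy(line, text, copy);
        line[copy] = '\0';
        text += len;
        if (*text == '\n')
            text++;

        if (sscanf(line, "name=%63s", out->name) == 1)
            continue;
        if (sscanf(line, "revision=%d", &out->revision) == 1)
            continue;
        if (sscanf(line, "limit=%d", &out->limit) == 1)
            continue;
        /* Unknown key: skip it rather than failing. */
    }
}
```

A reader built this way accepts input written by a newer version that adds keys it has never heard of, and it also accepts input from an older version that lacks the limit key.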
The canonical place that Subversion uses this technique is within the XML
data formats used in the working copy libraries (for example, the
.svn/entries files) and the DAV-based network protocol used by
libsvn_ra_dav and mod_dav_svn. I'm not the biggest
fan of XML, but it does make it pretty simple to create formats and protocols
that can be extended later with a minimum of problems.
Specifically, the use of XML in libsvn_ra_dav has simplified
the process of adding parameters to functions in the repository access API. For
example, when I added the --limit parameter to the svn
log command, I had to transmit that parameter to the server so it could
pass it on to the libsvn_repos-level log functions. Because the
functions in question simply send a report to the server in XML form, all that was required was to add a new element containing the parameter. New servers simply
look for the new element, and if it isn't there, they assume it wasn't sent,
preserving compatibility with old clients. Old servers ignore the new element
because they don't understand it, and the client code simply recognizes the case
by noticing when it has received more than the requested number of log entries
and ignoring the rest, allowing the new parameter to work even with a server
that does not understand it.
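The client-side half of that arrangement can be sketched as follows. The callback type and function names are hypothetical and much simpler than the real repository access API; the point is only that the client enforces the limit itself in case the server did not.

```c
#include <stddef.h>

/* Callback invoked once per log entry (hypothetical signature). */
typedef void (*demo_log_receiver_t)(long revision, void *baton);

/* Example receiver: counts entries; baton points to a size_t. */
void demo_count_receiver(long revision, void *baton)
{
    (void)revision;
    ++*(size_t *)baton;
}

/* Pass on at most `limit` of the `received` entries (0 means no limit).
   An old server that ignored the limit may have sent more than was
   asked for; the surplus is silently dropped here. Returns the number
   of entries delivered. */
size_t demo_deliver_log(const long *revisions, size_t received,
                        size_t limit, demo_log_receiver_t receiver,
                        void *baton)
{
    size_t wanted = (limit != 0 && limit < received) ? limit : received;
    size_t i;

    for (i = 0; i < wanted; i++)
        receiver(revisions[i], baton);
    return wanted;
}
```

With a new server the surplus never arrives, so the truncation is a no-op; with an old server it is what makes the parameter appear to work.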
Of course, you don't need to use XML in order to ensure forward and backward
compatibility in your protocols. Subversion's libsvn_ra_svn and
svnserve have a custom protocol that uses many of the same tricks
you might use in an XML-based protocol. The svn:// protocol sends
data across the network encoded in tuples: lists of items that are known to
contain certain items. The functions for reading tuples off the wire ignore
extra entries in the tuple, so you can add new parameters and old servers will
ignore them, just like we did in the DAV-based format.
Additionally, the svn:// protocol includes in its initial
handshake a minimum and maximum protocol version and list of capabilities
supported by the server and client. Thus, both the client and server have a chance
to adjust to the exact version of the protocol being spoken at the other end of
the connection. This allowed the addition of pipelining to the protocol shortly
before the release of Subversion 1.0 while ensuring that old clients continued
to work. See the subversion/libsvn_ra_svn/protocol file for more
details on how the svn:// protocol works.
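The general idea of negotiating a version range and a capability set can be sketched as below. This is not the svn:// wire format: the function names and the bit-flag representation of capabilities are assumptions made for the example.

```c
/* Hypothetical capability flags for this sketch. */
#define DEMO_CAP_PIPELINING  0x1u
#define DEMO_CAP_COMPRESSION 0x2u

/* Returns the highest protocol version both sides support, or -1 if
   the two [min, max] ranges do not overlap. */
int demo_negotiate_version(int client_min, int client_max,
                           int server_min, int server_max)
{
    int low = client_min > server_min ? client_min : server_min;
    int high = client_max < server_max ? client_max : server_max;

    return low <= high ? high : -1;
}

/* A feature is used only if both ends advertise it. */
unsigned demo_common_capabilities(unsigned client_caps,
                                  unsigned server_caps)
{
    return client_caps & server_caps;
}
```

An old peer that knows nothing about a new capability simply never advertises it, so the new feature stays switched off for that connection.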
For forward-compatible but not backward-compatible changes, what's most important is to provide a smooth update path. There are two main ways of doing this, both of which Subversion has used at various times.
Long before Subversion hit 1.0, the developers made the decision to change
the format used when storing timestamps in the working copy code. The change
occurred slowly, over the course of a few releases. First came support
for reading the new format, so the code that parses timestamps would try the
new format, and if that failed it would try the old format before finally
returning an error if that failed. Then, after that code had been out in the
wild for a while, libsvn_wc changed so that it wrote out
timestamps in the new format. The pre-1.0 policy for upgrades was to ensure
compatibility only within a single version. Because the support for reading the
new format went in a version before the introduction of support for writing the
new format, the project retained that compatibility. Support for the old date
format exists to this day in Subversion's timestamp-parsing code, but nothing
has written out dates in that format in quite some time.
What's important to keep in mind here is that the slow introduction of the change allowed users to revert from the new version (which produced the new format timestamps) to the previous version (which knew how to read them) on the off chance that they encountered some sort of problem with their upgrade.
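The "try the new format, then fall back to the old one" parser can be sketched like this. Both timestamp layouts here are invented for the example and are not the formats Subversion actually used.

```c
#include <stdio.h>

typedef struct {
    int year, month, day, hour, minute, second;
} demo_time_t;

/* Returns 0 on success, or -1 if neither format matches. On failure
   the contents of *t are unspecified. */
int demo_parse_timestamp(const char *s, demo_time_t *t)
{
    /* Try the new format first, e.g. "2004-01-02T03:04:05Z". */
    if (sscanf(s, "%4d-%2d-%2dT%2d:%2d:%2d",
               &t->year, &t->month, &t->day,
               &t->hour, &t->minute, &t->second) == 6)
        return 0;

    /* Fall back to the old format, e.g. "2004/01/02 03:04:05". */
    if (sscanf(s, "%4d/%2d/%2d %2d:%2d:%2d",
               &t->year, &t->month, &t->day,
               &t->hour, &t->minute, &t->second) == 6)
        return 0;

    return -1;
}
```

Shipping a reader like this one release before switching the writer over is what keeps a downgrade by one version safe.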
The addition of UUIDs to Subversion repositories is another example of how
to change an on-disk format in a backward-compatible way. Originally
Subversion repositories did not have any unique identifier; features like
svn switch --relocate were dangerous because you couldn't ensure
that both URLs referred to the same repository. To solve the problem, each
repository now has a universally unique ID stored in a new database table
(because at the time, the only filesystem back end that existed was the Berkeley
DB one). To ensure that new code worked with repositories created prior to the
addition of this feature, the lack of this table simply caused the function
that returns the repository's UUID to create the table itself, seamlessly
upgrading the repository without the user ever being aware of it.
The important item to note here is that if you can possibly make the upgrade automatic from the point of view of the user, then you should do so. Avoiding manual steps can only be a good thing.
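A minimal sketch of this upgrade-on-first-use idea follows, with an in-memory structure standing in for the repository's database. Everything here is hypothetical, including the placeholder ID generator, which is not a real UUID algorithm.

```c
#include <stdio.h>
#include <string.h>

/* In-memory stand-in for a repository's stored metadata. */
typedef struct {
    int has_uuid;   /* 0 for a repository created before the feature */
    char uuid[37];
} demo_repo_t;

/* Placeholder generator; a real implementation would produce a proper
   universally unique identifier. */
static void demo_generate_id(char *buf, size_t size)
{
    static unsigned counter;

    snprintf(buf, size, "demo-id-%u", ++counter);
}

/* Returns the repository's ID, creating and storing one the first
   time it is requested for an old repository. */
const char *demo_repo_uuid(demo_repo_t *repo)
{
    if (!repo->has_uuid) {
        demo_generate_id(repo->uuid, sizeof(repo->uuid));
        repo->has_uuid = 1;
    }
    return repo->uuid;
}
```

The caller never needs to know whether the repository predates the feature; the first request performs the upgrade and every later request returns the same stored value.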
One place where it's easy to forget about compatibility problems is in your project's dependencies. Any libraries you link against or external programs you use will each have their own compatibility issues, just as you will. It's important to be aware of those issues when deciding to make use of a third-party product. In Subversion we've had at least three separate dependencies that cause compatibility problems. Some are internal to Subversion, and some poke through to users from time to time.
The most important kind of dependency you need to worry about is one that shows up in your public API. This can happen when you use data types defined in the library as arguments to your library's functions, such as with the Apache Portable Runtime (APR) in Subversion. Any non-backward-compatible change that occurs in the library you depend on will instantly affect you as soon as your users try to upgrade to a new version of the dependency.
When Subversion first hit 1.0, the only released version of APR was from the 0.9.x series of releases. Because Subversion uses APR in almost every part of its public interface, this means that to maintain ABI compatibility, all releases of Subversion within its 1.x branch only officially support the use of APR 0.9.x releases. While Subversion does happen to work with APR 1.0.x, official builds use 0.9.x.
The primary reason Subversion can't use APR 1.0.x is that the size of the
data type apr_off_t has changed from off_t (often 32
bits long on a 32-bit system) to long (often 64 bits long on a 32-bit system). This change was necessary for interoperating with programs (Perl,
for example) that redefine the size of an off_t via the
_FILE_OFFSET_BITS define. Because apr_off_t shows up
in the public Subversion API, this change makes versions of Subversion compiled
with APR 1.0.x instantly incompatible with versions compiled with APR 0.9.x.
Additionally, APR uses a set of compatibility rules that allow it to drop and
change parts of its public API between major versions, so any of those kinds of
changes will cause similar types of problems as the apr_off_t
changes.
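The effect of such a width change on anything that embeds the type can be seen in a small sketch. The typedefs and structures below are hypothetical and use fixed-width integers to stand in for the two sizes of an offset type; they are not APR's or Subversion's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for an offset type as built against two versions of a
   dependency. */
typedef int32_t demo_off32_t;
typedef int64_t demo_off64_t;

/* The same public structure, as seen by code compiled each way. */
struct demo_range_old { demo_off32_t start; demo_off32_t length; };
struct demo_range_new { demo_off64_t start; demo_off64_t length; };

/* Where each build expects to find the second field. */
size_t demo_old_length_offset(void)
{
    return offsetof(struct demo_range_old, length);
}

size_t demo_new_length_offset(void)
{
    return offsetof(struct demo_range_new, length);
}
```

Code compiled against one definition and linked with a library compiled against the other disagrees about both the size of the structure and the position of its second field, and the same disagreement applies to function arguments of that type.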
The important lesson to learn from this is that as soon as you let into your program a data type defined by your dependency, its compatibility issues instantly become your compatibility issues.
An interesting counterexample in Subversion's case is the Neon library, which Subversion uses as its HTTP/WebDAV client library. Neon differs from APR in two ways. First, Neon doesn't make it into Subversion's public interface, so changes to Neon's data types have a harder time making themselves seen to clients of Subversion itself. Second, Neon's interface is far less stable than APR's is. Even APR 0.9.x, despite its pre-1.0 version number, provides a high level of stability in its API. Neon has never professed to do so, with nontrivial changes in its API being reasonably common.
This means that in order to support multiple versions of Neon, Subversion
needs to jump through a few hoops. That has happened at least once, when
nontrivial amounts of shim code were introduced to libsvn_ra_dav
in order to account for changes in the Neon API. This allowed
Subversion to function with either the old Neon API or the new one for a
reasonable amount of time while users upgraded.
While backflips like the shim code in libsvn_ra_dav ease the
burden on its users, they don't solve all the problems. If a program uses Neon
directly in its own code as well as the Subversion APIs, it's possible for Neon
upgrades required by Subversion to break backward compatibility. It's not
yet clear what the best way to handle this kind of change is.
Again, it is important to note that this is a valuable lesson. Once you use a library, its compatibility issues are your compatibility issues.
The last type of compatibility problem that a third-party library can introduce is when the library is responsible for the on-disk format of some of your data, as in the case of Berkeley DB as used by Subversion. Upgrading to a new version of the library can result in unexpected problems if the disk formats are incompatible. This has resulted in significant issues, mainly because Berkeley DB upgrades often require manual intervention, ranging from a full dump/load cycle to a simple recovery. Vendors and distributors often package Berkeley DB so upgrades may occur without the user's conscious action.
There's not much more to say about this kind of compatibility problem other than the fact that the only real solution is education. Users need to understand the issues upgrades can bring, and ideally the error messages that result from those issues need to specify what has gone wrong. Unfortunately, users often feel terror at the sudden inability to access their data, so panic may outweigh education in some cases.
Now that you've learned about the types of compatibility, seen some tricks you can use to help maintain them, and heard about some specific examples of how such problems can occur, it's time to think about your specific application and how these issues apply to you.
First, consider your user base. If you have only a dozen highly technical users, jumping through hoops to maintain backward compatibility may be more trouble than it's worth. On the other hand, if you have hundreds or thousands of nontechnical users who cannot deal with manual upgrade steps, you need to spend a lot of time worrying about those kinds of issues. Otherwise, the first time you break compatibility you'll easily burn through all the goodwill you built up with your users by providing them with a useful program. It's remarkable how easily people forget the good things from a program as soon as they encounter the first real problem.
Next, consider your project. If you don't actually provide a library your users embed in their own application, worrying about API and ABI stability is pointless. Similarly, if you don't store data on disk or send it over the network, the issues associated with those activities are moot. It's rare that a program has no compatibility issues at all, but it's also rare for one to encounter all the issues described in this article.
Consider again the example of the Subversion project. Subversion's compatibility policy appears in the "Release numbering, compatibility, and deprecation" section of the HACKING file in the top level of its source tree. You can upgrade and downgrade within a single minor release cycle without issue. You can upgrade to new versions in the same major release cycle without issue. When the major version number changes, all bets are off. These rules apply to API/ABI issues, data format issues, and network protocol issues.
Has the project followed this policy? The answer--as is often the case with software engineering--is a qualified yes. In one instance, Subversion added a function in a patch release as part of a fix for a security problem, and that addition broke the ability to go back and forth within that specific minor version. The nature of the security problem meant sacrificing compatibility in this particular case.
Other than that, though, the policy has been a success. Users have upgraded to new versions of Subversion without fear. Various versions of the official client and server, and even third-party clients that implement the same protocols, have also enjoyed continued compatibility. The users seem happy with the compatibility promises, and the developers are not overly hampered by them. It isn't always easy, but in my opinion it's been worth it.
All projects need to consider compatibility. The issues are rarely as simple as you might like, and they require serious thought for each project, as no two are the same. Finally, be aware that worrying too much about compatibility can cripple you, so it's important not to place too high a price on it. Only you can determine how high is too high. I hope this article has given you a starting place for making that determination.
Garrett Rooney is a software developer at FactSet Research Systems, where he works on real-time market data.
Copyright © 2009 O'Reilly Media, Inc.