ONLamp.com

Making Packager-Friendly Software

Configuration Techniques

Before you can build a program from its sources, you have to tune several details to adapt it to your system. Other times, you have to change some default settings so that it fits your expectations. This process is known as source configuration. Believe me when I say that all software packages have some configurable aspect at this stage and that somebody, somewhere, will need to change it; there are very, very few exceptions. To understand why this is so important, consider the following scenarios:



  • The installation prefix must be changeable. You cannot force a user to install a program in a specific directory. He must be able to choose where the program will end up, because your preferred directory may not meet his administration policies. When discussing packaging systems, consider that the package must follow some layout policies. That is, do not assume that /usr and /usr/local are the only possible locations.

  • There must be no hardcoded paths in the sources. This includes paths to data files, configuration files, extra libraries, and devices. All of these are good candidates for configuration.

    Suppose your program is a simple Perl script; you have to offer the user an easy way to tell it where the interpreter is. Using #!/usr/bin/perl won't work on many systems, as people can install Perl in many other places (for example, /usr/pkg/bin/perl on a default NetBSD setup).

    Perhaps you think, "But... #!/usr/bin/env perl will do the trick, won't it?" Yes, it will--sometimes. Consider multiple versions of Perl installed on the system: Perl 5.6's binary is /usr/bin/perl, and Perl 5.8's binary is /usr/local/bin/perl. Now assume that you have a program that requires Perl 5.8, but you used the line mentioned before. What happens? The script will pick up the first Perl binary it finds in the PATH, which may not match the version your program expects.

    Remember, relying on the PATH is, generally, bad. This is why in pkgsrc we always replace such lines with a full path to the real Perl binary. Obviously, you can extrapolate this to any other situation affected by absolute filenames.

    Besides, some programs try to cover all "known" possibilities to locate the file they are looking for by using paths like /usr/local/somewhere, /usr/somewhere, and /opt/package/somewhere. Simply put, you cannot know where the user has his stuff installed, so you need to let him specify where it is. For example, pkgsrc places all its files under /usr/pkg, but this location is configurable, so a program that hardcodes /usr/pkg may work on a system using the default settings but fail on one that has been modified.

  • There has to be an easy way to choose optional features. If your program includes optional functionalities--such as a GTK front end--there has to be an easy way for the user to enable or disable them. This can occur automatically, but see Automatic decisions below.
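The interpreter-path replacement mentioned in the list above can be sketched in a few lines of shell. This is only an illustration, not the way pkgsrc implements it: the function name fix_perl_shebang and the sample paths are assumptions, and the sketch assumes the chosen path contains neither "|" nor "&".

```shell
#!/bin/sh
# Hypothetical helper: point the #! line of a Perl script at the
# interpreter selected at configuration time.
# Usage: fix_perl_shebang /path/to/perl input-script > output-script
fix_perl_shebang() {
    perl_path=$1
    script=$2
    # Only line 1 is touched. The first expression handles
    # "#!/usr/bin/env perl"; the second handles "#!/some/dir/perl".
    # Anything after the interpreter (such as -w) is preserved.
    sed -e "1s|^#! *[^ ]*/env  *perl|#!${perl_path}|" \
        -e "1s|^#! *[^ ]*/perl|#!${perl_path}|" \
        "$script"
}
```

A build system could call it as fix_perl_shebang "${PERL}" script.in > script, where PERL is whatever variable holds the path the user selected.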

Given these reasons, I hope you see the need for a configuration framework in almost all scenarios. Without it, your program is neither portable nor usable, because it will be very problematic to make it work on any system different from yours.

Assuming that this has convinced you, you now have to choose which configuration framework to use. The most common alternatives are:

  • A Makefile with easy-to-change variables. This is an old way to configure software. It consists of a Makefile placed in your source tree with a section where the user can modify some variables to specify paths, system features, and more. This Makefile can either be the same as the top-level one, or one specially designed for configuration.

    This approach works quite well if the number of customizable features is small and you expect people to install the package manually. Note that many novice users will find this frightening and will probably make mistakes.

    Packaging systems work in an unattended manner, so this framework is difficult to manage. The packager has to patch the configuration Makefile to mark lines to customize; then, the package must run sed(1) over it to replace the previous marks with real values.

    Consider a simple example: if the original Makefile includes a line saying PREFIX=/usr/local, the packager has to change it to PREFIX=@PREFIX@ and then use a regular expression such as s|@PREFIX@|${PREFIX}|g to put the correct value in there. Remember, the installation prefix must be configurable, hence the need for dynamic replacement.

    As you can imagine, these patches easily fall out of sync and must be remade in every update of the package. Using this approach will only discourage packagers from creating a package for your program.

  • A configuration script. This is a very common way to configure software and works very well if the script is smart enough. A script gathers all the required information, either automatically or through flags given by the user, to create the Makefiles and other files accordingly.

    These scripts often use GNU Autoconf, which is usually a safe choice because it integrates well with several packaging systems. Of course there are several other frameworks that you can use, and if you have enough energy, you can even create your homegrown script. Be very careful if you do this, though, as it may not be as portable and useful as you may think.

    The configuration script can do much more than the previous approach (a static Makefile): it can check whether required dependencies are present, whether specific functionality exists in libraries, and more. This is definitely the best way to go, even if it requires a bit of extra work on your side. It will not only simplify packaging, but also make your package nicer to the end user.
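To make the marker replacement from the Makefile approach concrete, here is a minimal shell sketch of the step a packaging system has to run after patching PREFIX=/usr/local into PREFIX=@PREFIX@. The function name subst_prefix is an assumption made for illustration; the sed expression is the one quoted in the text.

```shell
#!/bin/sh
# Hypothetical helper: expand @PREFIX@ markers in a patched Makefile.
# Usage: subst_prefix /usr/pkg < Makefile.in > Makefile
subst_prefix() {
    prefix=$1
    # '|' is used as the sed delimiter because paths contain '/'.
    sed "s|@PREFIX@|${prefix}|g"
}
```

Every configurable value needs its own marker, patch hunk, and substitution, which is why this approach scales poorly compared with a configuration script.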

From now on I will assume that your program includes a configuration script. If it does not, well, read the reasons again. Keep reading, even if you still resist the idea, as the concepts explained below should apply to whichever method you use.

Configuration Script Tips

As explained earlier, a configuration script adapts the source code of a program to build and work properly on the build host. (I will not consider cross-compilation here, but that is often a source of problems, too.) What kinds of details must a developer care about to make his creations package-friendly?

  • The script should be noninteractive. Of course, it may require information from the user, but this should be optional and should come from command-line options or environment variables (see the next point).

    Rationale: Passing options to a script from a packaging system is trivial; append them to the call to the configuration script and everything will be automatic. However, if the script requires interaction, the packaging system must simulate it, which may be "easy" if it is command-line oriented--redirecting stdin from a previously stored file--or almost impossible if using a utility such as dialog(1). Other solutions include hand-patching the script, which is equally problematic.

  • Do not hardcode paths or other values in the sources. If you have to put a specific path or value such as a user name or group name in your program, do not hardcode it in the sources. This is a very good candidate to customize from the configuration script, either through automatic detection or through a user-specified flag.

    Rationale: The paths or values you hardcoded may not be acceptable for every system. Remember that not everything is GNU/Linux running on i386.

  • Be careful with hardcoded paths in the configuration script. If you are looking for some file in a running system, you might try some common paths. Nevertheless, let the user override these defaults if necessary.

    Suppose you need to locate the xmlcatmgr utility. An incorrect approach would be to search for it in /usr/bin, then try /usr/local/bin, and finally abort the operation if it is not found. This is incorrect because the application may be present in an unexpected path.

    A better solution is to provide the user a way to override the search path so that he can explicitly tell the configuration script where to look; for example, in ${HOME}/local/bin:/opt/xmlcatmgr/bin. In the case where the user has not specified a path, falling back to your favorite built-in directories is still a valid option.

    The best solution is to let the user explicitly specify which utility to use. In this example, that could be through an XMLCATMGR variable, which holds an absolute path to the binary.

  • Do not use the == operator when calling test(1). Use = instead: == is a GNU extension, and it breaks on more conservative systems such as NetBSD.
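The lookup order recommended for xmlcatmgr (an explicit setting first, then a user-supplied search path, then built-in defaults) can be sketched as a small shell function. This is one possible shape under stated assumptions, not a definitive implementation: the function name find_tool and its argument layout are hypothetical, while XMLCATMGR is the variable suggested in the text. The sketch is noninteractive and uses only constructs that do not depend on ==.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute path of a tool, or fail.
# Usage: find_tool NAME EXPLICIT_PATH SEARCH_PATH
#   EXPLICIT_PATH  value given by the user (e.g. ${XMLCATMGR}); may be empty
#   SEARCH_PATH    colon-separated directories; may be empty
find_tool() {
    name=$1
    explicit=$2
    search=$3

    # 1. An explicit setting always wins; never second-guess the user.
    if test -n "$explicit"; then
        if test -f "$explicit" && test -x "$explicit"; then
            echo "$explicit"
            return 0
        fi
        echo "find_tool: $explicit is not an executable file" >&2
        return 1
    fi

    # 2. Otherwise use the user's search path, or the built-in defaults.
    test -n "$search" || search=/usr/bin:/usr/local/bin
    old_ifs=$IFS
    IFS=:
    for dir in $search; do
        if test -f "$dir/$name" && test -x "$dir/$name"; then
            IFS=$old_ifs
            echo "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    echo "find_tool: $name not found in $search" >&2
    return 1
}
```

A packaging system can then pass the exact binary it depends on through XMLCATMGR, while a user building by hand can rely on the search path or the defaults.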

Automatic Decisions

An automatic decision is one that the configuration script takes based on the software available on the system at configuration time, without user intervention. Such decisions are very harmful, as they make maintenance harder and often lead to incorrect dependency tracking, which is a very serious problem in a package.

As an example, consider the following scenario: your program comes with an optional GTK front end, and your configuration script provides an --enable-gtk-fe={yes,no} flag to specify whether to build it. The default action, however, is to take an automatic decision based on the presence of GTK in the system; that is, if GTK is available, build and install the GTK front end. (To make this more credible, this is what xchat and other programs do.)

This behavior is acceptable, and often very good, if the user is installing your program by hand. Unfortunately, it makes things (very) difficult for package maintainers, especially when the number of optional features is large (gst-plugins is one such beast).

When a maintainer creates a package for a software program, he must choose a known set of default build options for it. He does this to create the same--or almost the same--binary packages no matter which machine they are built on. The goal behind this is to keep a fixed dependency tree that is easy to track properly. The common procedure to do this is as follows:

  1. Manually analyze the available configure-time options (as given by ./configure --help or as seen in the README file) and the output of the configuration script.

  2. Check which features are optional and decide whether to enable them for the actual package.

  3. Adapt the source package to use only the chosen dependencies, either by giving extra flags to the configuration script or by patching it manually. Doing the latter is often quite difficult (because configuration scripts are pregenerated and unreadable shell code).

As you can imagine, this task is prone to error: it is easy to miss a required dependency, especially if it is unclear (which unfortunately is the case 90 percent of the time). Think, for example, about the yacc and lex utilities: if the packager forgets to add a dependency on them, the end user will probably have trouble building the package. It's even worse if the package finds an extra library and uses it but does not record this fact anywhere. Any mistake here will surely cause trouble for end users, who may experience build failures, extra files being installed, and so on.

Another problem appears when it is time to update the package. The packager has to repeat the same procedure to verify that the new version has introduced no new dependencies. If all optional features were off (or on!) by default, the pain would be smaller, but because of the automatic decisions explained above, every update causes a lot of headaches.

Consider gst-plugins, which I mentioned earlier. This can build a huge number of plugins depending on the libraries and codecs available on the system. In pkgsrc, we explicitly disable them all through configuration arguments and select them one by one in individual packages (see its Makefile.common). New versions of gst-plugins often come with new modules, so the set of arguments to pass to the configuration script needs manual adjustment on every update.

Now imagine that the packager misses the --disable-arts argument. The aRts plugin (libgstartsdsink.so) will build on some systems but not others due to the automatic detection. If the packager does not have aRts in his system, he will not add a dependency on aRts because he will not notice it. When another user builds it on his aRts-enabled system, aRts will become a dependency; however, this fact will go unrecorded. aRts has become a hidden dependency of gst-plugins. Removing the former later on will mysteriously break the latter. This kind of situation is a very serious problem that comes up over and over again.

What are some possible solutions to this dilemma?

  1. Make the configuration script abort its process when it cannot activate a feature because of missing dependencies. For example, if the default behavior of xchat is to build the GTK front end, abort the configuration process if GTK is not available. (The word default is important here; if the default is to not enable the GTK front end, the script should not care at all about GTK presence.) I know; this solution is too drastic because it makes things difficult for people building by hand (though, if they are building by hand, they should take all the consequences ...).

  2. Add an --enable-packager-mode (or similar) flag. Passing this flag to the configuration script should disable all automatic decisions, as explained in the previous solution. However, if the flag is absent, the script should behave as usual, taking automatic decisions.

In my opinion, you should use the second solution, as it does not intrude and is more flexible. Is it too complex? Not really. The following code snippet adds the --enable-packager-mode flag to your own GNU Autoconf scripts:

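dnl Record whether configure may decide optional features on its own.
dnl automatic_detection is "no" only when packager mode is enabled.
AC_ARG_ENABLE([packager-mode],
              [AS_HELP_STRING([--enable-packager-mode],
                              [Change configuration behavior
                               to ease packaging])],
              [if test x"${enableval}" = xyes
               then
                   automatic_detection=no
               else
                   automatic_detection=yes
               fi],
              [automatic_detection=yes])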

Assuming that you choose this option, consider how this flag could affect the first example of this section (the one to enable or disable an optional GTK front end):

dnl Without an explicit flag, the default is "auto" in normal mode
dnl and "yes" in packager mode (no automatic decisions).
AC_ARG_ENABLE([gtk-fe],
              [AS_HELP_STRING([--enable-gtk-fe=auto/yes/no],
                              [Enable the GTK frontend
                               (default=auto, or yes in
                               packager mode)])],
              [enable_gtk_fe=${enableval}],
              [if test x"${automatic_detection}" = xyes
               then
                   enable_gtk_fe=auto
               else
                   enable_gtk_fe=yes
               fi])

dnl Look for GTK only when the front end may be built. A missing GTK
dnl is fatal for "yes" and silently disables the front end for "auto".
build_gtk_fe=no
if test x"${enable_gtk_fe}" = xyes ||
   test x"${enable_gtk_fe}" = xauto
then
    PKG_CHECK_MODULES([GTK],
                      [gtk+-2.0 >= 2.6.0],
                      [build_gtk_fe=yes],
                      [if test x"${enable_gtk_fe}" = xyes
                       then
                           AC_MSG_ERROR([GTK not found, but
                                         the GTK frontend
                                         is explicitly
                                         enabled])
                       else
                           build_gtk_fe=no
                       fi])
fi
AM_CONDITIONAL([BUILD_GTK_FE],
               [test x"${build_gtk_fe}" = xyes])
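The interplay between the two flags can be hard to follow in m4, so here is the same decision restated as a plain shell function that can be run outside Autoconf. It is a simplified model for illustration only: the function name decide_gtk_fe and its three arguments are assumptions, and the pkg-config check is replaced by a yes/no argument.

```shell
#!/bin/sh
# Hypothetical model of the decision taken by the Autoconf fragment.
# Usage: decide_gtk_fe REQUESTED PACKAGER_MODE GTK_FOUND
#   REQUESTED      yes, no, auto, or empty if --enable-gtk-fe was not given
#   PACKAGER_MODE  yes or no (--enable-packager-mode)
#   GTK_FOUND      yes or no (stands in for the PKG_CHECK_MODULES result)
# Prints "yes" or "no"; returns 1 where configure would abort.
decide_gtk_fe() {
    requested=$1
    packager_mode=$2
    gtk_found=$3

    # No explicit flag: packager mode turns the default from auto into yes.
    if test -z "$requested"; then
        if test x"$packager_mode" = xyes; then
            requested=yes
        else
            requested=auto
        fi
    fi

    case $requested in
        yes|auto)
            if test x"$gtk_found" = xyes; then
                echo yes
            elif test x"$requested" = xyes; then
                echo "error: GTK not found, but the GTK frontend is enabled" >&2
                return 1
            else
                echo no
            fi
            ;;
        *)
            # "no" and any unrecognized value leave the front end disabled.
            echo no
            ;;
    esac
}
```

The case that matters to packagers is an empty REQUESTED with packager mode on and GTK missing: the function fails instead of silently printing "no", so the missing dependency cannot go unnoticed.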

What's Next

This article has introduced the problems of software packaging and why developers should be aware of them. It discussed multiple problematic issues that can usually be found in the distribution files and in documentation. Finally, it analyzed in detail the need for configuration scripts, techniques to implement them, and multiple problems that arise during their creation.

The next article will focus on the build infrastructure used by third-party packages, as well as some code portability issues. Until then, if you are the maintainer of a specific software project, you have enough time to apply all the tips explained. Time to work!

Julio M. Merino Vidal studies computer science at the FIB faculty in Barcelona, Spain.

