 Published on ONLamp.com (http://www.onlamp.com/)


Gettext

by Joao Prado Maia
06/13/2002

Understanding the Problem

Have you ever needed to build a Web site or Web application that is dynamically available in several languages? Many existing open source applications invent their own solution for this need, but the standard way to do it is to use Gettext, a set of GNU tools that helps packages manage multilingual messages.

The majority of open source projects, such as Xchat and others, use Gettext to translate the messages and strings shown in their user interface to several languages. The same concept can easily be applied to a Web site or Web application, and that is the objective of this article.

Requirements

What you need first is a way to enable the Gettext extension in PHP, so that you have access to its functions. If you are using Windows, you probably already have the Gettext DLL and only need to change your php.ini configuration file to enable the extension.

Do so by removing the semicolon from the front of the line that mentions php_gettext.dll (it typically reads ;extension=php_gettext.dll). After that, save the file and restart your Web server software. You will then be able to use and test the code snippets found in this article.
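Once the server is back up, you can ask PHP itself whether the extension is active. The following is only a minimal sketch; the helper name gettext_is_available() is made up for this illustration, while extension_loaded() and function_exists() are standard PHP functions.

```php
<?php
// Hypothetical helper: reports whether the Gettext extension is loaded
// and its main entry point is callable.
function gettext_is_available()
{
    return extension_loaded('gettext') && function_exists('gettext');
}

if (gettext_is_available()) {
    echo "Gettext is enabled\n";
} else {
    echo "Gettext is NOT enabled; check php.ini and restart the Web server\n";
}
```

If the second message appears, the php.ini change has probably not been picked up yet.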

If you are using Linux, BSD, or any other UNIX operating system, there are more options to configure this extension. The safest bet is to get the appropriate package from your distribution vendor, like the RedHat RPM or Debian package.

Gettext 101

To simplify, the Gettext PHP extension allows you to dynamically translate strings in your PHP code by using the gettext() function to get the appropriate translated string. If the string is not translated yet, the original one is used instead.

A simple example follows:

<?php
// I18N support information here
$language = 'en';
putenv("LANG=$language"); 
setlocale(LC_ALL, $language);

// Set the text domain as 'messages'
$domain = 'messages';
bindtextdomain($domain, "/www/htdocs/site.com/locale"); 
textdomain($domain);

echo gettext("A string to be translated would go here");
?>
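The fallback behavior can be observed even before any translation files exist. The sketch below assumes nothing beyond a working PHP installation; the wrapper name translate() is hypothetical and simply guards against the extension being absent.

```php
<?php
// Hypothetical wrapper: uses gettext() when the extension is loaded and
// otherwise hands the original string back unchanged.
function translate($text)
{
    if (function_exists('gettext')) {
        return gettext($text);
    }
    return $text;
}

// No catalog contains this string, so the original text comes back.
echo translate("A string nobody has translated yet"), "\n";
```

PHP also offers _() as a built-in alias for gettext(), which keeps marked strings short in templates.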

Setting Up the Gettext Files

Gettext works by expecting a locale directory where all of the translated strings are kept, in the following structure:

/locale
    /en
        /LC_MESSAGES
            messages.po
            messages.mo

In the case of a Web site, the locale directory could live inside the webroot or somewhere else entirely; it is up to you, because the location is whatever path you pass to bindtextdomain(), as in the example above.

You can create the directories yourself; just remember to create one language subdirectory for each language in which you wish to show translated strings. For instance, if you want to translate your site into Brazilian Portuguese (pt_BR), you need to create a pt_BR subdirectory and set the matching language code in your PHP code, like this:

<?php
// The language code goes here
$language = 'pt_BR';
putenv("LANG=$language"); 
setlocale(LC_ALL, $language);

// ....
?>
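On a real site the language code usually comes from the visitor rather than being hard-coded. The sketch below is one possible approach, not something prescribed by Gettext: the lang query parameter, the $supported list, and the helper pick_language() are all assumptions made for this example. Checking the value against a fixed list also keeps arbitrary input out of putenv() and setlocale().

```php
<?php
// Hypothetical helper: accepts a requested language code only if it is in
// the list of languages that have a subdirectory; otherwise falls back.
function pick_language($requested, array $supported, $default = 'en')
{
    return in_array($requested, $supported, true) ? $requested : $default;
}

// Assumed list: one entry per language subdirectory under locale/.
$supported = array('en', 'pt_BR');

// Assumed convention: the visitor selects a language with ?lang=pt_BR.
$requested = isset($_GET['lang']) ? $_GET['lang'] : 'en';
$language = pick_language($requested, $supported);

putenv("LANG=$language");
setlocale(LC_ALL, $language);
```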

So after creating the new pt_BR subdirectory, your directory structure would look somewhat like this:

/locale
    /en
        /LC_MESSAGES
            messages.po
            messages.mo
    /pt_BR
        /LC_MESSAGES
            messages.po
            messages.mo
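The layout shown above maps directly onto the values used in the PHP code: the base directory given to bindtextdomain(), the language code, and the text domain. As a rough sketch, the hypothetical helper catalog_path() below builds the path where the compiled catalog is expected, which is handy for checking that a language is really installed. Gettext's own lookup may also try other variants of the locale name, so treat this as the simple case only.

```php
<?php
// Hypothetical helper: builds the path of the compiled catalog for a
// language and text domain, following the directory layout shown above.
function catalog_path($base_dir, $language, $domain)
{
    return rtrim($base_dir, '/') . "/$language/LC_MESSAGES/$domain.mo";
}

$path = catalog_path('/www/htdocs/site.com/locale', 'pt_BR', 'messages');
echo $path, "\n";
// prints /www/htdocs/site.com/locale/pt_BR/LC_MESSAGES/messages.mo

if (!file_exists($path)) {
    echo "No compiled catalog yet for pt_BR\n";
}
```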

Once the directories are prepared, it's time to create the actual "pot" file, as it is usually called: the messages.po file. To do this, you need PHP files that use the gettext() function to "mark" the strings to be translated, and then you run the xgettext command over them.

$ xgettext -n *.php

The line above will create a messages.po file, with some strings to be translated. It will look close to the following:

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR Free Software Foundation, Inc.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2002-04-06 21:44-0500\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"

#: gettext_example.php:12
msgid "A string to be translated would go here"
msgstr ""

This file contains all of the strings found inside gettext() calls, and it is used by the translators of the respective languages to translate the application (or Web application, in our case).
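Because the format is plain text, it is easy to check which entries still lack a translation before or after the file goes to the translators. The function below is a rough sketch under simplifying assumptions: the name untranslated_ids() is hypothetical, and it handles only simple msgid/msgstr pairs like the ones shown above, not plural forms or message contexts.

```php
<?php
// Hypothetical helper: returns the msgid of every entry whose msgstr is
// still empty. Handles only simple entries (no plurals, no contexts).
function untranslated_ids($po_text)
{
    $missing = array();
    // Entries in a .po file are separated by blank lines.
    foreach (preg_split('/\n\s*\n/', trim($po_text)) as $entry) {
        $field = null;
        $values = array('msgid' => '', 'msgstr' => '');
        foreach (explode("\n", $entry) as $line) {
            $line = trim($line);
            if (preg_match('/^(msgid|msgstr)\s+"(.*)"$/', $line, $m)) {
                $field = $m[1];
                $values[$field] = $m[2];
            } elseif ($field !== null && preg_match('/^"(.*)"$/', $line, $m)) {
                // Continuation line: append to the field being read.
                $values[$field] .= $m[1];
            }
        }
        // Skip the header entry, which has an empty msgid.
        if ($values['msgid'] !== '' && $values['msgstr'] === '') {
            $missing[] = $values['msgid'];
        }
    }
    return $missing;
}

$po = <<<'PO'
#: gettext_example.php:12
msgid "A string to be translated would go here"
msgstr ""
PO;

print_r(untranslated_ids($po));
```

For the sample entry, the returned array contains the single string that still needs a translation.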

Distributing the Pot File

OK, so now that you have your pot file with the strings that need to be translated, you need to distribute it to your translators. In a successful open source project, you would have several volunteers who take care of translating your user interface messages. In that case, you would send an email to the development mailing list, telling the volunteers that the next release will go out on a given date and listing the languages that need to be updated.

That is, of course, in a successful project. More commonly you, the project leader, would do the most important translations yourself and wait for contributions from volunteers. For a Web site or Web application, volunteers are usually not part of the picture at all: you normally wouldn't have a team working on your personal Web site, although the situation is different for a community Web site like themes.org.

In any case, either you or the volunteers will translate the pot file and then you will need to convert the file into a binary file that Gettext actually "understands." For that you would use the following command:

$ msgfmt messages.po

The line above will create a messages.mo file, which you should save in the appropriate locale/<LANG_CODE>/LC_MESSAGES/ directory.

Managing Evolving Pot Files

Now think about all of this for a moment -- you have a Web site that is constantly evolving, with new features being added or inconsistencies being removed, and your strings to be translated are also changing. New ones are added, old ones are modified, and so on.

So how do you manage multiple versions of the pot files? Typically you will have a messages.po that is completely translated for a specific language, and a freshly extracted file that contains the new strings. The problem lies here: xgettext only extracts strings from the source code, so the new file will look just like the example above, with every msgstr empty.

The question is how to merge the files in a way that keeps the already-translated strings while adding the new untranslated ones. The answer is provided by the msgmerge Gettext utility. An example of its use would be the following sequence of commands:

$ ls
example.php
$ xgettext -n *.php
$ ls
example.php   messages.po
# ...
# Translate the messages.po file now
# ...
$ msgfmt messages.po
$ ls
example.php   messages.po   messages.mo
# ...
# Change the example.php file
# ...
$ mv messages.po old.po
$ xgettext -n *.php
$ ls
example.php   messages.po   messages.mo   old.po
$ msgmerge old.po messages.po --output-file=new.po
$ ls
example.php   messages.po   messages.mo   new.po    old.po
# ...
# Translate the new strings in the new.po file
# ...
$ msgfmt new.po

What to Do Next

The best thing to do now is to experiment. You will need to enable the Gettext PHP extension and start to play around with it. The example provided here is simple but should be enough to build applications for multiple languages.

More information can be found in the Gettext manual.

Have fun.

Joao Prado Maia is a web developer living in Houston with more than four years of experience developing web-based applications. He loves learning new technologies and programming languages.


Copyright © 2009 O'Reilly Media, Inc.