PHP DevCenter
Internationalization and Localization with PHP

by Adam Trachtenberg, coauthor of PHP Cookbook
11/28/2002

While everyone who programs in PHP eventually has to learn some English to get a handle on its function names and language constructs, PHP can create applications in just about any human language. Some applications need to be used by speakers of many different languages. PHP's internationalization and localization support makes it easier to adapt an application written for French speakers so that German speakers can use it too.

Internationalization (often abbreviated I18N--there are 18 letters between the first "i" and the last "n") is the process of taking an application designed for just one locale and restructuring it so that it can be used in many different locales. Localization (often abbreviated L10N--there are 10 letters between the first "l" and the "n") is the process of adding support for a new locale to an internationalized application.

Localizing different kinds of content requires different techniques. This article covers an object-oriented method for localizing plain text messages and images. The PHP Cookbook contains additional recipes for dates, times, and currency. There are also recipes on using GNU gettext and other I18N and L10N topics.

Locales

A locale is a group of settings that describe text formatting and language customs in a particular area of the world. A locale name generally has three components. The first, an abbreviation that indicates a language, is mandatory. For example, "en" stands for English and "pt" for Portuguese. An optional country specifier comes next, after an underscore, to distinguish between different versions of the same language spoken in different countries. For example, "en_US" and "en_GB" specify U.S. and British English respectively, while "pt_BR" and "pt_PT" identify Brazilian and European Portuguese. Finally, after a period, comes an optional character-set specifier. Taiwanese Chinese using the Big5 character set is encoded as "zh_TW.Big5". Note that while most locale names follow these conventions, some don't.
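
To make the three components concrete, here is a minimal sketch that splits a locale name into its parts. The parse_locale() function and its regular expression are illustrative assumptions, not part of PHP or of the original article, and they only handle names that follow the conventions just described:

```php
<?php
// Hypothetical helper (not a PHP built-in): split a locale name such as
// "zh_TW.Big5" into language, country, and character-set components.
function parse_locale($name) {
    $parts = array('language' => null, 'country' => null, 'charset' => null);

    // Lowercase language, optional "_COUNTRY", optional ".charset".
    if (preg_match('/^([a-z]+)(?:_([A-Z]+))?(?:\.(.+))?$/', $name, $m)) {
        $parts['language'] = $m[1];
        // Optional groups may be missing or empty when they did not match.
        if (isset($m[2]) && $m[2] !== '') {
            $parts['country'] = $m[2];
        }
        if (isset($m[3]) && $m[3] !== '') {
            $parts['charset'] = $m[3];
        }
    }
    return $parts;
}

$p = parse_locale('zh_TW.Big5');
print $p['language'] . ' ' . $p['country'] . ' ' . $p['charset'] . "\n";
```

This prints "zh TW Big5". A name that does not follow the conventions, such as "C", simply comes back with all three parts set to NULL.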

Message Catalog

To incorporate I18N support into your program, maintain a message catalog of words and phrases and retrieve the appropriate string from the message catalog before printing it. Here's a simple message catalog with foods in American and British English and a function to retrieve words from the catalog:

<?php
$messages = array (
    'en_US'=> array(
       'My favorite foods are' =>
           'My favorite foods are',
       'french fries' => 'french fries',
       'biscuit' => 'biscuit',
       'candy' => 'candy',
       'potato chips' => 'potato chips',
       'cookie' => 'cookie',
       'corn' => 'corn',
       'eggplant' => 'eggplant'
    ),

    'en_GB'=> array(
        'My favorite foods are' =>
            'My favourite foods are',
        'french fries' => 'chips',
        'biscuit' => 'scone',
        'candy' => 'sweets',
        'potato chips' => 'crisps',
        'cookie' => 'biscuit',
        'corn' => 'maize',
        'eggplant' => 'aubergine'
    )
);

function msg($s) {
    global $LANG;
    global $messages;
    
    if (isset($messages[$LANG][$s])) {
        return $messages[$LANG][$s];
    } else {
        // No translation found: log it. msg() then returns NULL.
        error_log("l10n error:LANG:" . 
            "$LANG,message:'$s'");
    }
}
?>

This short program uses the message catalog to print out a list of foods:

<?php
$LANG ='en_GB';

print msg('My favorite foods are').":\n";
print msg('french fries')."\n";
print msg('potato chips')."\n";
print msg('corn')."\n";
print msg('candy')."\n";
?>

My favourite foods are:
chips
crisps
maize
sweets

To have the program output in American English instead of British English, just set $LANG to en_US.
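
As a variation, the locale can be passed as an argument instead of being read from a global, which makes it easy to print the same phrases in several locales side by side. This is only a sketch: msg_for() is a hypothetical name, the catalog is a trimmed copy of the one above, and falling back to the untranslated key is an assumption rather than the article's behavior:

```php
<?php
// Trimmed copy of the message catalog, for illustration only.
$messages = array(
    'en_US' => array(
        'french fries' => 'french fries',
        'potato chips' => 'potato chips'
    ),
    'en_GB' => array(
        'french fries' => 'chips',
        'potato chips' => 'crisps'
    )
);

// Hypothetical variant of msg(): catalog and locale are arguments, not globals.
function msg_for($messages, $lang, $s) {
    if (isset($messages[$lang][$s])) {
        return $messages[$lang][$s];
    }
    error_log("l10n error: LANG: $lang, message: '$s'");
    return $s; // Assumption: fall back to the untranslated key.
}

foreach (array('en_US', 'en_GB') as $lang) {
    print $lang . ': ' . msg_for($messages, $lang, 'french fries') . ', '
        . msg_for($messages, $lang, 'potato chips') . "\n";
}
```

en_US: french fries, potato chips
en_GB: chips, crisps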

Variable Phrases

You can combine the msg() message retrieval function with printf() to store phrases that require values to be substituted into them. Consider the English sentence "I am 12 years old." In Spanish, the corresponding phrase is "Tengo 12 años." The Spanish phrase can be built by stitching together translations of "I am," the numeral 12, and "years old." It's easier, though, to store them in the message catalogs as printf()-style format strings:

<?php
$messages = array(
    'en_US' => array(
        'I am X years old.' =>
            'I am %d years old.'),
    'es_US' => array(
        'I am X years old.' => 
            'Tengo %d años.')
);
?>

You can then pass the results of msg() to printf() as a format string:

<?php
$LANG ='es_US';

printf(msg('I am X years old.'), 12);
?>

Tengo 12 años.

For phrases that require the substituted values to be in a different order in different languages, printf() supports changing the order of the arguments:

<?php
$messages = array(
    'en_US' => array(
        'I am X years and Y months old.' =>
        'I am %d years and %d months old.'),
    'es_US' => array(
        'I am X years and Y months old.'=>
        'Tengo %2$d meses y %1$d años.')
);
?>

With either language, call printf() with the same order of arguments (i.e., first years, then months):

<?php
$LANG ='es_US';

printf(msg('I am X years and Y months old.'),12,7);
?>

Tengo 7 meses y 12 años.

In the format string, %2$ tells printf() to use the second argument, and %1$ tells it to use the first.
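
The argument swapping can be seen in isolation with sprintf(), which accepts the same format strings as printf() but returns the result instead of printing it. A minimal sketch, with the $formats array standing in for the message catalog (the explicit %1$d and %2$d in the English string are an illustrative choice, not taken from the catalog above):

```php
<?php
// Both format strings receive the arguments in the same order:
// years first, then months. Only the format decides where each one lands.
$formats = array(
    'en_US' => 'I am %1$d years and %2$d months old.',
    'es_US' => 'Tengo %2$d meses y %1$d años.'
);

foreach ($formats as $lang => $format) {
    print $lang . ': ' . sprintf($format, 12, 7) . "\n";
}
```

en_US: I am 12 years and 7 months old.
es_US: Tengo 7 meses y 12 años.

Keep these format strings in single quotes. Inside double quotes, PHP would try to interpolate $d as a variable and the specifier would be lost.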
