Internationalization and Localization of Programs on GNU/Linux (I18N & L10N)

 

[Contents]
0. Foreword
1. i18n and l10n introduction
2. gettext and intltool introduction
3. building an i18n program step by step
4. using autotools + intltool
 

[Main text]

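  0. Foreword
Software and its documentation are usually written in English, which makes it easy for programmers around the world to communicate. Not every user reads English, however, so a program that can present itself in the user's native language reaches a larger audience and is easier for ordinary users.
We used to talk about "Chinese localization" (汉化). Now that the concepts and techniques of "internationalization" and "localization" exist, that word can be retired to the museum :-)
This article introduces two important tools: gettext, from the GNU project, and intltool.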

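  1. i18n and l10n introduction

First, two terms: I18N and L10N, often written in lowercase as i18n and l10n.
What are I18N and L10N?
Developers abbreviate "internationalization" as I18N; the number 18 is the count of letters between the first letter "i" and the last letter "n". L10N is derived from "localization" by the same rule. Together, I18N/L10N methods, protocols and applications let users work in the language of their choice.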
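
The naming rule can be checked with a few lines of C. This is only an illustrative sketch; the function name numeronym is made up for this example and is not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the numeronym of word into out, e.g. "i18n" for "internationalization".
 * Returns 0 on success, -1 if the word is too short or out is too small.
 * (Hypothetical helper, for illustration only.) */
int numeronym(const char *word, char *out, size_t out_size)
{
    assert(word != NULL && out != NULL);

    size_t len = strlen(word);
    if (len < 3)
        return -1; /* no letters in between to count */

    /* first letter + number of letters in between + last letter */
    int written = snprintf(out, out_size, "%c%zu%c",
                           word[0], len - 2, word[len - 1]);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return 0;
}
```

Calling numeronym on "internationalization" (20 letters) yields "i18n", and on "localization" (12 letters) yields "l10n".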

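I18N-aware applications are written with I18N toolkits. Such a toolkit lets the developer supply a simple file from which the displayed menus and text are translated into the local language. Programmers are strongly encouraged to follow this practice.

Why use I18N/L10N?
The I18N/L10N standards provide good support for viewing, entering and processing non-English languages.

Which languages does I18N support?
I18N and L10N are not specific to Linux. They currently support most of the world's major languages, including but not limited to Chinese, German, Japanese, Korean, French, Russian and Vietnamese.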

A locale setting needs three pieces of information:
a. Language code
b. Country code
c. Encoding
A locale name is built from these parts as follows:
LanguageCode_CountryCode.Encoding (for example zh_CN.UTF-8 or en_US)
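
To make the naming scheme concrete, here is a small C sketch that splits a locale name into those three parts. The struct and function names are invented for this example, the buffer sizes are arbitrary assumptions, and modifiers such as "@euro" are deliberately not handled.

```c
#include <assert.h>
#include <string.h>

/* Parts of a locale name of the form language[_COUNTRY][.encoding].
 * (Hypothetical type; sizes are arbitrary.) */
struct locale_parts {
    char language[16];
    char country[16];   /* empty string if absent */
    char encoding[32];  /* empty string if absent */
};

/* Split name into its parts.
 * Returns 0 on success, -1 if the language is empty or a part does not fit. */
int parse_locale_name(const char *name, struct locale_parts *p)
{
    assert(name != NULL && p != NULL);
    memset(p, 0, sizeof *p); /* guarantees NUL termination below */

    const char *dot = strchr(name, '.');
    const char *end = dot ? dot : name + strlen(name); /* end of language[_COUNTRY] */
    const char *us = memchr(name, '_', (size_t)(end - name));
    const char *lang_end = us ? us : end;

    size_t lang_len = (size_t)(lang_end - name);
    size_t country_len = us ? (size_t)(end - us - 1) : 0;
    size_t enc_len = dot ? strlen(dot + 1) : 0;

    if (lang_len == 0 || lang_len >= sizeof p->language ||
        country_len >= sizeof p->country || enc_len >= sizeof p->encoding)
        return -1;

    memcpy(p->language, name, lang_len);
    if (us)
        memcpy(p->country, us + 1, country_len);
    if (dot)
        memcpy(p->encoding, dot + 1, enc_len);
    return 0;
}
```

For "zh_CN.UTF-8" this gives language "zh", country "CN" and encoding "UTF-8"; for "en_US" the encoding is left empty.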

The locale alias table is in /usr/lib/X11/locale/locale.alias (taking Debian GNU/Linux as the example).

  2. gettext and intltool introduction

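gettext is an important step for the GNU Translation Project, since the other steps build on it. It provides a framework for producing multilingual messages: a set of conventions about how programs should be written to support message catalogs, a directory and file naming organization for the message catalogs themselves, a runtime library for retrieving translated messages, and a few standalone programs (to list them, run this in a terminal: dpkg -L gettext | grep "/usr/bin/" | cut -d'/' -f4).

intltool contains a number of convenient commands; list them with dpkg -L intltool | grep "/usr/bin/" | cut -d'/' -f4.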

  3. building an i18n program step by step

Goal: write a program that supports zh_CN.
 
Steps:
1) Write the file hello.c
#include <stdio.h>
#include <locale.h>
#include <libintl.h>

#define _(STRING) gettext(STRING)

#define PACKAGE_NAME "hello"
#define PACKAGE_VERSION "0.1.0"

// NOTE: PACKAGE_NAME.mo resides in ${LOCALEDIR}/zh_CN/LC_MESSAGES/
#define LOCALEDIR "./"

int main(void)
{
    setlocale(LC_ALL, ""); // comes from locale.h

    bindtextdomain(PACKAGE_NAME, LOCALEDIR); // comes from libintl.h

    textdomain(PACKAGE_NAME); // comes from libintl.h

    printf( _("Hello, World!\n") );
    // printf("Thanks!\n");
    return 0;
}
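
The NOTE comment in hello.c says where the compiled catalog is expected. As a rough sketch of that layout, the helper below (catalog_path is a made-up name) builds the path from the directory, the locale and the domain. It is a simplification: the real gettext runtime also tries variants of the locale name, which this sketch does not model.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<localedir>/<locale>/LC_MESSAGES/<domain>.mo" into out.
 * Returns 0 on success, -1 if a component is empty or out is too small.
 * (Hypothetical helper; it only illustrates the directory layout.) */
int catalog_path(char *out, size_t out_size, const char *localedir,
                 const char *locale, const char *domain)
{
    assert(out != NULL && localedir != NULL && locale != NULL && domain != NULL);

    if (strlen(localedir) == 0 || strlen(locale) == 0 || strlen(domain) == 0)
        return -1;

    int n = snprintf(out, out_size, "%s/%s/LC_MESSAGES/%s.mo",
                     localedir, locale, domain);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With localedir ".", locale "zh_CN" and domain "hello" the result is ./zh_CN/LC_MESSAGES/hello.mo.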

1.1) Compile: $ gcc -o hello hello.c
Run: $ ./hello
Hello, World!
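
The output is still English because no catalog exists yet: when gettext finds no translation it returns the original string. A minimal sketch of that fallback, using the real dgettext call from libintl.h inside a hypothetical helper named has_translation:

```c
#include <assert.h>
#include <libintl.h>
#include <string.h>

/* Returns 1 if msgid currently resolves to a different string in domain,
 * 0 otherwise. A translation that is identical to msgid also reports 0.
 * (Hypothetical helper, for illustration only.) */
int has_translation(const char *domain, const char *msgid)
{
    assert(domain != NULL && msgid != NULL);

    /* dgettext hands back msgid itself when no catalog entry is found */
    return strcmp(dgettext(domain, msgid), msgid) != 0;
}
```

Before hello.mo has been built and installed, has_translation("hello", "Hello, World!\n") is expected to report 0.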

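2) Extract the translatable strings
$ xgettext -d hello -k_ -s -o hello.pot hello.c
-d sets the domain
-s sorts the output
-k_ tells xgettext to look for translatable strings marked with the keyword _ (a leading underscore), in addition to the default keywords such as gettext and gettext_noop
The command above produces the file hello.pot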
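
The difference between a keyword that translates and one that only marks a string (like gettext_noop) shows up with static data, where a function call is not allowed in the initializer. The sketch below uses an N_ macro, a common convention that is assumed here and not used in hello.c; xgettext would need an extra -kN_ option to pick it up. The table and the function greeting are invented for the example.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>

#define N_(STRING) (STRING)  /* mark for extraction only, no runtime effect */

/* Strings are marked here so they can be extracted into the .pot file ... */
static const char *const greetings[] = {
    N_("Good morning"),
    N_("Good evening"),
};

/* ... and translated later, at the point of use. */
const char *greeting(size_t index)
{
    assert(index < sizeof greetings / sizeof greetings[0]);
    return gettext(greetings[index]);
}
```

Without a catalog, greeting simply returns the original English text.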

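3) Create zh_CN.po
Method 1:
$ cp hello.pot zh_CN.po, then edit the contents of zh_CN.po. For how to do this, see reference a.
Method 2:
$ msginit -l zh_CN -o zh_CN.po -i hello.pot, then translate the strings in it into Chinese. Save the file in UTF-8 encoding.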

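4) Compile to the binary format
The .po file is compiled into a binary .mo catalog with msgfmt and placed where hello.c looks for it (see the NOTE next to LOCALEDIR), typically along these lines:
$ mkdir -p zh_CN/LC_MESSAGES
$ msgfmt -o zh_CN/LC_MESSAGES/hello.mo zh_CN.po
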
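  4. using autotools + intltool
We usually use higher-level scripts to do what was done above with the gettext tools by hand; intltool is such a tool.
Here is an introductory document from the intltool package; it is quite simple. I will cover automake and autoconf in detail in another blog post.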
---------------
 
intltool README
http://intltool.freedesktop.org
 
The intltool collection can be used to do these things:
 
 o Extract translatable strings from various source files (.xml.in, .glade, .desktop.in, .server.in, .oaf.in).
 
 o Collect the extracted strings together with messages from traditional source files (.c, .h) in po/$(PACKAGE).pot.
 
 o Merge back the translations from .po files into .xml, .desktop and  .oaf files. This merge step will happen at build resp. installation time.
 
The intltool package has a script, intltoolize, which copies the various scripts and does the other magic to your module. So users building from tarballs don't need intltool, only folks building from cvs.
(This is modelled on gettextize.)
 
How to Use with autoconf/automake
---------------------------------
(There is a section for non-auto* configurations below)
To use intltool in your module, do the following:
 o Install intltool, and make sure that the macro it installs is in aclocal's path, or do:
 
    export ACLOCAL_FLAGS='-I /usr/local/share/aclocal'
 
 o Add these lines to autogen.sh, after the call to gettextize:
 
    echo "Running intltoolize"
    intltoolize --copy --force --automake
 
 o Add this line to configure.in near the top
 
    AC_PROG_INTLTOOL([minimum required version], [no-xml])
 
 o Add intltool-extract.in, intltool-merge.in, and intltool-update.in to EXTRA_DIST in your top-level Makefile.am and also to the top-level .cvsignore. Also add the non-.in versions to .cvsignore. 
 
 o Remove po/desk.pl and po/update.* scripts. intltool-update will take over their functionality.
 
At this point, translatable strings will be automatically extracted to the .po files, if you make use of the following recommendations.
 
The intltool-prepare script will help you to prepare the package. It will try to extract translations from existing .desktop files which will become obsolete after intltoolization has taken place.
 
Examples of packages that use intltool are listed in the USED file.
 
Details of the AC_PROG_INTLTOOL macro
-------------------------------------------
 
The first parameter indicates the minimum required version. The configure script will halt if the version is older than the first parameter.
 
The second parameter is to tell intltool that we don't need the extended xml parsing abilities provided by the XML::Parser perl module. If it is not provided, or is any value other than "no-xml", then XML::Parser will be checked for by the configure script. This feature is only available in intltool 0.31 or newer.
 
Extra Steps for DESKTOP Files
..............................
 
This step also applies for similar files (.directory, .soundlist).
 
 o Try to run intltool-prepare.
 
 o Make sure intltool-prepare did find existing translations in the old .desktop files and did correctly merge them into the various po/*.po files. Don't forget to commit the changed .po files; otherwise existing translations will get lost!
 
 o Remove old .desktop files and add new .desktop.in files.
 
 o Adjust .cvsignore
 
 o Adjust Makefile.am, e.g.:
 
    --- start ----
 
        utilsdir = $(datadir)/gnome/apps/Utilities
        utils_in_files = bug-buddy.desktop.in
        utils_DATA = $(utils_in_files:.desktop.in=.desktop)
        @INTLTOOL_DESKTOP_RULE@
 
    --- end ----
 
 o Add .desktop.in files to po/POTFILES.in
 
Here's a .desktop.in example:
 
    --- start ----
 
        [Desktop Entry]
        _Name=Bug Report Tool
        _Comment=Report a bug in GNOME
        Exec=bug-buddy
        Icon=bug-buddy.png
        Terminal=0
        Type=Application
 
    --- end ----
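
The underscore convention in the example above can be illustrated with a few lines of C. This is not how intltool itself is implemented (its helper scripts are written in Perl); the function below is a hypothetical sketch that only classifies a single "key=value" line.

```c
#include <assert.h>
#include <string.h>

/* Classify one line of a .desktop.in file.
 * Returns  1 if the key is marked for translation (leading underscore),
 *          0 if it is an ordinary key=value line,
 *         -1 if it is not a key=value line (e.g. a [group] header).
 * On 1 or 0, *value points at the text after '='.
 * (Hypothetical helper, for illustration only.) */
int desktop_line_is_translatable(const char *line, const char **value)
{
    assert(line != NULL && value != NULL);

    const char *eq = strchr(line, '=');
    if (eq == NULL || eq == line || line[0] == '[')
        return -1;

    *value = eq + 1;
    return line[0] == '_';
}
```

Applied to the entries above, _Name and _Comment are reported as translatable, while Exec, Icon, Terminal and Type are not.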
 
Extra Steps for GLADE Files
...........................
 
 o Add the .glade files you want translated to POTFILES.in
 
 o Remove the intermediate *-glade.h or strings-glade.c files and drop them from POTFILES.in
 
 
Extra Steps for SERVER Files (formerly .oafinfo or .oaf)
.............................
 
To get server translation extraction and merging requires a few more steps:
 
 o Rename your .oafinfo (or .oaf) files to .oaf.in or .server.in and put an underscore before every value property for string attributes that should be localized.
 
 o Add the new .oaf.in or .server.in files to POTFILES.in.
 
 o Put lines like these in every Makefile.am that installs oaf files:
 
    --- start ----
 
    oafdir = $(datadir)/oaf
 
    oaf_in_files = My_OAF_info_file.oaf.in
    oaf_DATA = $(oaf_in_files:.oaf.in=.oaf)
 
    @INTLTOOL_OAF_RULE@
 
    EXTRA_DIST=$(oaf_in_files) $(oaf_DATA)
 
    --- end ----
 
At this point, your oaf translations will be extracted and merged. Also, so long as you are renaming the .oafinfo files to .oaf.in, you should take the opportunity to rename them to the new base naming convention, with namespacing, for example:
 
    foo.oafinfo --> GNOME_Foo.oaf.in
    foo-baa.oafinfo --> GNOME_Foo_baa.oaf.in
 
 
Extra Steps for XML Files (Files with .xml Extension)
.....................................................
 
To get xml (files with .xml extension) translation extraction and merging requires these steps:
 
 o Rename your .xml files to .xml.in and put an underscore before every element that should be localized.
 
 o Add the .xml.in files to POTFILES.in.
 
 o Put lines like these in every Makefile.am that installs xml files:
 
        --- start ----
 
        xmldir = $(datadir)/xml
 
        xml_in_files = My_xml_file.xml.in
        xml_DATA = $(xml_in_files:.xml.in=.xml)
 
        @INTLTOOL_XML_RULE@
 
        EXTRA_DIST=$(xml_in_files) $(xml_DATA)
 
        --- end ----
 
At this point, your xml translations will be extracted and merged. All .po files will be converted on the fly to UTF-8, and the resulting XML file will have a UTF-8 effective encoding (you should make sure that the encoding="..." declaration in the .xml.in file is either absent or actually specifies UTF-8).
 
Previous versions of intltool generated XML files whose contents were made of the contents of the .po files, without paying attention to the encodings used. A single "XML" file could thus have strings in different encodings. This broken behavior can be requested only by using the old xml-i18n-tools API instead of the intltool one. See old versions of xml-i18n-tools for documentation on how the old API worked.
 
---
 
XXX: add section for KEYS files. Works almost like XML files .
 
How to use without autoconf/automake
------------------------------------
 
intltool can also be used without the auto* tools. For instance in order to translate a somename.desktop.in file, you can do the following.
 
 o Create a po/ dir.
 o Add a po/POTFILES.in file, including the path to the somename.desktop.in file
 
Then to create the somename.desktop file all you do is:
 
$ intltool-merge po/ -d -u -c po/.intltool-merge-cache somename.desktop.in somename.desktop
 
You can also type intltool-merge --help for a bit more info.
 
To specify parameters for intltool-update (such as keywords or gettext domain), you can use Makevars syntax as used in recent GNU gettext, by putting something like the following in po/Makevars file:
 
 DOMAIN = mydomain
 XGETTEXT_OPTIONS = --keyword --keyword=blah
 
This will make "intltool-update -p" produce mydomain.pot, passing parameters "--keyword --keyword=blah" to xgettext when extracting strings.
 
Passing special parameters to xgettext via environment
......................................................
 
If you need to add parameters passed to xgettext on a case-by-case basis, you can do so using environment variable XGETTEXT_ARGS.
 
If you would run it as follows:
 
    XGETTEXT_ARGS=--no-location intltool-update -p
 
You would create a PO Template file without lines which indicate location of messages in the source code.
 
 
Changing keywords used in xgettext invocation
.............................................
 
If you need to change default keywords used to extract messages from source code, you need to add variable XGETTEXT_KEYWORDS to Makefile.in.in file inside directory where intltool-update is run from, eg.
 
        --- start ----
 
        XGETTEXT_KEYWORDS = --keyword --keyword=P_
 
        --- end ----
 
Default keywords xgettext looks for if no XGETTEXT_KEYWORDS is defined are _, N_ and U_.
 
 
Translators' comments in XML and .schemas files
...............................................
 
To provide comments to translators in free-form XML or .schema files, you need to precede the string to be translated with the plain XML comment. If comments contain character ">", they will be ignored (this is implementation issue, should be fixed in the future).
 
In .schemas files, comments need to be inside <default>, <short> or <long> elements (i.e. they cannot be before the opening tag).
 
 
The next document explains everything clearly, step by step; you can follow it and try things out.
Autoconf/I18n-ify HelloWorld HOW-TO
 
In this article we are going to explain how to turn a simple Hello World application with a standard Makefile into an autotools-and I18N-enabled tree up to the point where it can be distributed.
 
Our existing helloworld.c file looks like the following:
 
#include <stdio.h>
 
int main (void) {
 printf ("Hello, world!\n");
}
 
1. First we create a source tree :
   /                        - This is the top level directory
   /src/                    - Here the source will end up.
 
   and place the helloworld.c file in the src/ dir
 
2. If your program has not been autoconf-enabled yet, you can create configure.scan (which is a good starting point for configure.ac) and rename it to configure.ac
 
    autoscan   # creates configure.scan
    mv configure.scan configure.ac
 
   Now edit configure.ac and make some changes. You can remove everything after AC_INIT, we'll be using AM_INIT_AUTOMAKE to pass on variables.
 
   Add the lines
     PACKAGE=helloworld
     VERSION=0.0.1
     AM_INIT_AUTOMAKE($PACKAGE, $VERSION)
   to configure.in, just after AC_INIT
 
   Change AC_CONFIG_HEADER to AM_CONFIG_HEADER as well.
 
   If you have an empty AC_CONFIG_FILES macro, then comment that, or automake will fail in the next step.
 
   Finally, add Makefile to the AC_OUTPUT macro by changing that line to read
     AC_OUTPUT(Makefile)
 
   NOTE: configure.ac used to be called configure.in
 
3. We add some files that automake does not make but are necessary to adhere to GNU standards.
 
     touch NEWS README AUTHORS ChangeLog
 
   These two files need to be created to satisfy automake
 
    touch config.h.in Makefile.am
 
   We will create Makefile.am later on.
 
4. To add some basic files (like COPYING, INSTALL, etc..)
   we run automake in the toplevel directory.
 
    automake --add-missing --gnu
 
5. After that we do the big i18n trick :-), also in the toplevel directory.
 
   gettextize --force --copy    # created po/ dir with some files
   intltoolize                 # bring in the perl helper scripts
 
6. Run autoheader which will create config.h.in
 
    autoheader # create config.h.in
 
7. Now, open up configure.in and make some modifications.
 
    The gettext macros need to be added after the initial checks. Putting them after the checks for library functions is a good idea.
 
    AC_PROG_INTLTOOL(0.26)
    AM_GNU_GETTEXT([external])
    ALL_LINGUAS="da nl"                 # Internationalization, means there is
                    # a .po file for danish and dutch.
    AM_GLIB_GNU_GETTEXT
 
    AC_OUTPUT(
    Makefile
    src/Makefile
    intl/Makefile
    po/Makefile.in
    )
 
    AC_PROG_INTLTOOL checks if a good enough intltool is available. Please require the latest intltool that exists. Intltool releases are pretty stable and often only contain bugfixes.
 
    AM_GNU_GETTEXT adds native language support to automake, together with a compile option.
 
    AM_GNU_GETTEXT will check for additional required functions and programs and will finally create po/POTFILES during configure.
 
    Instead of using AM_GLIB_GNU_GETTEXT you can do the following:
 
    [sed -e "/POTFILES =/r po/POTFILES" po/Makefile.in > po/Makefile]
 
    The text domain is identified by PACKAGE. We will need to add a few functions later on to helloworld.c that will use this #define'd variable.
 
    Also, this will be the base filename for all your translation files, so make sure you choose a unique one.
 
8. Now add the supported languages to po/LINGUAS:
 
    da nl
 
    NOTE: These used to be in configure.{in,ac} in the ALL_LINGUAS variable. This is deprecated since gettext 0.11
 
9. Run
       aclocal
     to make sure that the necessary autoconf and automake macros are inserted in aclocal.m4
 
     Run
       autoconf
     to create the configure script.
 
10. install the gettext.h file (since gettext 0.11) and include it:
 
    #include "gettext.h"
    #define _(String) gettext (String)
 
11. Now add the following to helloworld.c
 
    #include <config.h>   /* PACKAGE */
    #include <locale.h>   /* setlocale */
    #include "gettext.h"
    #define _(String) gettext (String)
    /* includes used by original program here */

    int main (void)
    {
        setlocale (LC_ALL, "");
        bindtextdomain (PACKAGE, LOCALEDIR);
        textdomain (PACKAGE);

        /* Original Helloworld code here */
    }
 
    If you use GNOME or GTK+ the setlocale call shouldn't be needed
 
    We also substitute all strings we want to be translated with _("original string") to make sure that gettext is run on the strings. So the printf now looks like
 
      printf (_("Hello, world!\n"));
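
The three setup calls from this step can also be wrapped in one function that checks their return values. This is a sketch under assumptions: init_i18n is a made-up name, it calls the libintl.h functions directly instead of going through gettext.h, and the domain and directory are passed in as arguments where the HOWTO uses the PACKAGE and LOCALEDIR macros.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <string.h>

/* Select the locale from the environment and bind the message domain.
 * Returns 0 on success, -1 if binding or selecting the domain failed.
 * (Hypothetical helper, for illustration only.) */
int init_i18n(const char *domain, const char *localedir)
{
    assert(domain != NULL && localedir != NULL);

    /* A NULL result means the environment names an unsupported locale;
     * the program then stays in the "C" locale and shows untranslated
     * text, so this is not treated as a fatal error here. */
    (void)setlocale(LC_ALL, "");

    if (bindtextdomain(domain, localedir) == NULL)
        return -1;
    if (textdomain(domain) == NULL)
        return -1;

    /* textdomain(NULL) reports the currently selected domain */
    return strcmp(textdomain(NULL), domain) == 0 ? 0 : -1;
}
```

In the helloworld program this would be called once at the top of main, for example as init_i18n(PACKAGE, LOCALEDIR).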
 
12. We create src/Makefile.am (from which Makefile.in and Makefile will be generated)
 
    INCLUDES = -I$(top_srcdir) -I$(includedir) \
               -DLOCALEDIR=\""$(datadir)/locale"\"
 
    bin_PROGRAMS = helloworld
 
    helloworld_SOURCES = helloworld.c
    noinst_HEADERS = i18n-support.h
 
13. Now we create the following toplevel Makefile.am
 
     SUBDIRS = src po
 
     EXTRA_DIST = intltool-extract.in intltool-merge.in intltool-update.in
 
14. Go into the directory po/ and create POTFILES.in
    This file should contain a list of all the files in your distribution (starting from the top, one level above the po dir) that contain strings to be internationalized.
 
    For the helloworld sample, it would contain
    src/helloworld.c
 
    Run
      intltool-update --pot
 
    Run
      intltool-update --maintain
    to see if you are missing files that contain marked strings. You should consider adding these to POTFILES.in
 
15. Now we start making a Danish and Dutch translation
 
    msginit --locale=da
    msginit --locale=nl
 
    intltool-update da
    intltool-update nl
 
    edit and update da.po and nl.po (The respective translations are "Hej verden" and "Hallo wereld")
16. Now we can compile. We will test it later, so we will install it in
    a temporary location.
    Close your eyes and type
      ./configure --prefix=/tmp/helloworld && make
    in the toplevel directory. :-)
17. To test if it works, you have to install the package.
    Run
      make install
    in the toplevel directory.
18. Now set the environment variable LC_ALL to your preferred language :
      export LC_ALL=nl_NL
      /tmp/helloworld/bin/helloworld
      export LC_ALL=da_DK
      /tmp/helloworld/bin/helloworld
 
    And if all goes well, the string should be translated in the two languages.
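
What the exported LC_ALL does inside the program can be sketched in C: setlocale (LC_ALL, "") reads the environment, and the resulting message locale can then be queried. The helper name messages_locale_for is invented for this example, and note that it modifies the environment of the running process. A NULL result means the requested locale is not available on the system, which is a common reason for translations not showing up.

```c
#define _POSIX_C_SOURCE 200112L /* for setenv */
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Set LC_ALL for this process, re-read the locale from the environment,
 * and return the locale name now in effect for messages.
 * Returns NULL if the environment could not be changed or the locale is
 * not supported. (Hypothetical helper, for illustration only.) */
const char *messages_locale_for(const char *lc_all)
{
    assert(lc_all != NULL);

    if (setenv("LC_ALL", lc_all, 1) != 0)
        return NULL;
    if (setlocale(LC_ALL, "") == NULL)
        return NULL; /* locale not installed or not supported */

    return setlocale(LC_MESSAGES, NULL); /* query only, no change */
}
```

Calling it with "nl_NL" or "da_DK" mirrors the two shell commands above, provided those locales are generated on the system.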
19. To finish it all up, run
      make dist
    to create a distributable tarball containing your internationalized program.
20. Exercises :
    - add another language
---------------


[References]
a. Workflow for translating .po files - http://i18n.linux.net.cn/method.php
b. Yu Mingjian (于明俭) - Linux internationalization, localization and Chinese language support (Linux 国际化本地化和中文化)
The Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Edition (Chap.7 Locale)
c. Internationalising GNOME applications (http://www.gnome.org/~malcolm/i18n/)
