The Linux loader, and how it finds libraries: ld-linux and so on

From: http://grahamwideman.wordpress.com/2009/02/09/the-linux-loader-and-how-it-finds-libraries/

As part of an effort to understand the implications of different installation procedures on Linux, I investigated how executables find shared object libraries (libxyz.so) for dynamic loading, and thus what an installer (program or human) needs to configure.

Overview

Most compiled programs on Linux need to call shared objects: modules provided by other packages that are loaded and linked dynamically (ie: at run time). On Linux the needed modules are loaded when an executable is launched, by the GNU dynamic loader, ld.so, found at /lib/ld-linux.so.

The way in which the loader finds the requested libraries goes a long way to explaining what a package installer must do to make the package work properly, and to allow other programs to find the libraries of the new package.

ELF: Executable and Linking Format

Most compiled programs on Linux are compiled into a format known as ELF. Amongst other things, ELF defines a header for executable files, which contains attributes of the executable, some of which are important for the loading process.

The program readelf can be used to view the header of such programs or .so libraries:

  • readelf -a filename       Shows all header info
  • readelf -d filename      Shows only data from the “dynamic” section

The “dynamic” section of the header is of interest because it contains data used during the initial loading process, such as:

  • NEEDED: libraries needed by this module
  • RPATH: See “Loader search procedure” below
  • SONAME: If this module is a library, this item shows the “soname” of the library.

(The ldd program provides information similar to this list of NEEDED libraries, but also adds the libraries needed by the NEEDED libraries, and so on.)
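As a rough illustration of the dynamic-section items listed above, the following Python sketch pulls the NEEDED, RPATH, RUNPATH and SONAME values out of text in the shape that readelf -d prints. The sample text and the name parse_dynamic are assumptions made for this example; the values are not taken from a real binary.

```python
import re

# Illustrative text in the shape printed by `readelf -d`; the values are
# made up for this example, not copied from a real binary.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libncurses.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [/usr/lib/mysql]
"""

# Matches "(TAG)   some label: [value]" for the four tags of interest.
ENTRY = re.compile(r"\((NEEDED|RPATH|RUNPATH|SONAME)\)\s+[^\[]*\[([^\]]*)\]")


def parse_dynamic(text):
    """Collect NEEDED, RPATH, RUNPATH and SONAME values from readelf -d text."""
    found = {"NEEDED": [], "RPATH": [], "RUNPATH": [], "SONAME": []}
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:
            tag, value = match.groups()
            found[tag].append(value)
    return found


info = parse_dynamic(SAMPLE)
print(info["NEEDED"])
print(info["RPATH"])
```

To try it on a real file, the output of readelf -d filename could be passed to parse_dynamic in place of SAMPLE.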

Loader search procedure

When a compiled program is launched on Linux, its header is inspected to see what shared objects (libraries, xxx.so) it requires, and these are loaded. Each .so itself has a similar header, which can also specify other needed libraries, and so on.

From man ld-linux.so:

The shared libraries needed by the program are searched for in various places:

1.    DT_RPATH: Using the DT_RPATH dynamic section attribute of the binary if present and DT_RUNPATH attribute does not exist. Use of DT_RPATH is deprecated. (Ie: this is a value that can be included in an executable’s ELF header. There’s apparently some controversy over whether DT_RPATH really overrides LD_LIBRARY_PATH. GW)

2.    LD_LIBRARY_PATH: Using the environment variable LD_LIBRARY_PATH. Except if the executable is a set-user-ID/set-group-ID binary, in which case it is ignored.

3.    DT_RUNPATH: Using the DT_RUNPATH dynamic section attribute of the binary if present. (Ie: the executable can provide a list of paths to search for objects to load. However, DT_RUNPATH is not applied at the point those objects load other objects. GW)

4.    /etc/ld.so.cache : From the cache file /etc/ld.so.cache which contains a compiled list of candidate libraries previously found in the augmented library path. If, however, the binary was linked with -z nodeflib linker option, libraries in the default library paths are skipped.

5.    Default paths: In the default path /lib, and then /usr/lib. If the binary was linked with -z nodeflib linker option, this step is skipped.
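The order in the list above can be sketched as a small Python function. This is only a model of the man page text as quoted here, not the loader’s actual code; the name search_steps and its parameters are invented for this example, and the cache lookup is represented as a step with no directories of its own.

```python
def search_steps(rpath=None, runpath=None, ld_library_path=None,
                 setuid=False, nodeflib=False):
    """Return the ordered (source, directories) steps described in the list."""
    steps = []
    # 1. DT_RPATH is consulted only when no DT_RUNPATH is present.
    if rpath and not runpath:
        steps.append(("DT_RPATH", rpath.split(":")))
    # 2. LD_LIBRARY_PATH is ignored for set-user-ID/set-group-ID binaries.
    if ld_library_path and not setuid:
        steps.append(("LD_LIBRARY_PATH", ld_library_path.split(":")))
    # 3. DT_RUNPATH, if the binary has one.
    if runpath:
        steps.append(("DT_RUNPATH", runpath.split(":")))
    # 4. The cache is a name-to-path lookup, so it carries no directory list.
    steps.append(("/etc/ld.so.cache", []))
    # 5. Default paths, skipped when linked with -z nodeflib.
    if not nodeflib:
        steps.append(("default", ["/lib", "/usr/lib"]))
    return steps


for source, dirs in search_steps(rpath="/usr/lib/mysql",
                                 ld_library_path="/opt/a:/opt/b"):
    print(source, dirs)
```

The sketch leaves out the detail that -z nodeflib also affects which cache entries are used.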

Let’s elaborate on these search locations:

1.    DT_RPATH: deprecated, but it still seems to be used sometimes. It may be useful for a program to specify the location of .so’s that are supplied within the same package and are not necessarily useful to others. Eg: currently readelf shows that the /usr/bin/mysql program has an RPATH of /usr/lib/mysql, ie: it points to a mysql-specific subdir of /usr/lib.

2.    LD_LIBRARY_PATH: This method for finding libraries is usefully configured from a shell script that launches the program proper, and hence it’s good for libraries that don’t need to be shared in general by the methods below. This seems to be used by:

      1.    Developers temporarily switching libraries for testing, or

      2.    Program suites that supply shared objects for their own use, but which don’t need to be shared with the rest of the system’s applications.

      (In some quarters LD_LIBRARY_PATH is “considered harmful”.)

3.    DT_RUNPATH: (Seems not used much?)

4.    /etc/ld.so.cache : This is an important case, see below.

5.    Default paths: /lib (used for libraries of system packages), and then /usr/lib (possible location for the libraries of non-system packages).

Of these, /etc/ld.so.cache is used prominently by non-system packages, and needs more explanation.

/etc/ld.so.cache: cross-references

ld.so.cache provides a cross-reference from a shared-object’s name to its full path. It is used by ld-linux as one of the methods to find the actual .so’s that are required by an executable being loaded. ld.so.cache can be manipulated using the /sbin/ldconfig program:

  • ldconfig -p      Lists the cross-references currently known to the cache
  • ldconfig           (no args) Re-survey directories where libraries reside, making needed file symbolic links and updating the cache. (More on this below).

For example, readelf shows that the mysql program’s header dynamic section lists several NEEDED shared libraries, including libncurses.so.5. Then, ldconfig -p shows:

libncurses.so.5 (libc6) => /usr/lib/libncurses.so.5

… and finally ls shows that /usr/lib/libncurses.so.5 is a symbolic link to libncurses.so.5.5, which is the actual file containing the library.
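The lookup just described can be imitated in a few lines of Python that turn ldconfig -p style lines into a name-to-path table. The sample text uses the libncurses line quoted above; the header line stands in for the summary line that ldconfig prints first, and the function name parse_ldconfig_p is made up for this sketch. Keeping only the first entry seen for each name is a simplifying assumption.

```python
def parse_ldconfig_p(text):
    """Build a {soname: path} table from text in the shape of `ldconfig -p`."""
    cache = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue  # header or blank line
        left, path = line.split("=>", 1)
        # The name is everything before the "(libc6...)" annotation.
        soname = left.split("(", 1)[0].strip()
        # Assumption: keep only the first entry seen for a given name.
        cache.setdefault(soname, path.strip())
    return cache


SAMPLE = """\
(summary header line, ignored by the parser)
\tlibncurses.so.5 (libc6) => /usr/lib/libncurses.so.5
"""

cache = parse_ldconfig_p(SAMPLE)
print(cache["libncurses.so.5"])
```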

Note on version numbering

The example shown here follows the pattern: libname.so.major.minor, where major and minor are major and minor version numbers. Libraries generally have an internal “SONAME” (as can be viewed with readelf) that includes the major version number but excludes the minor version number. The SONAME is also the name listed in the NEEDED listing to indicate a needed library.

The logic here appears to be that the major version number identifies a particular API (exact suite of functions which might change from major version to major version, and which needs to match what the calling program expects), while the minor version number indicates no change in API, perhaps bug fixes or other internal improvements only.

The above scheme is not strict; for example, we have libdl-2.5.so, which has an soname of libdl.so.2 (and corresponding symbolic link).
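The naming pattern can be expressed as a short Python sketch that guesses the conventional soname from a file name. It is only a guess based on the libname.so.major.minor pattern: the authoritative SONAME is the one stored inside the library, and a file such as libdl-2.5.so does not fit the pattern at all. The function name conventional_soname is invented for this example.

```python
import re

# libname.so.major, optionally followed by .minor (and further components).
PATTERN = re.compile(r"^(lib.+?)\.so\.(\d+)((?:\.\d+)*)$")


def conventional_soname(filename):
    """Guess the soname implied by a libname.so.major.minor file name.

    Returns None when the file name does not follow the pattern.
    """
    match = PATTERN.match(filename)
    if match is None:
        return None
    name, major = match.group(1), match.group(2)
    return f"{name}.so.{major}"


print(conventional_soname("libncurses.so.5.5"))
print(conventional_soname("libdl-2.5.so"))
```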

/etc/ld.so.cache, symbolic links: Maintained by ldconfig

We just saw that ld.so.cache contains cross-references from a library’s SONAME to an actual file path, though that path usually leads to a symbolic link pointing to the actual .so file needed. But what creates/updates this set of data and links?

That’s the job of ldconfig, a program that can be run at any time (for example as part of an install process), and in most systems is set to run at every boot-up, to be on the safe side.

ldconfig input

ldconfig needs to know what directories to survey. By default it surveys “trusted directories” /lib and /usr/lib. In addition, ldconfig consults the configuration file /etc/ld.so.conf, a text file which provides a list of directories to survey.

According to one convention, ld.so.conf contains an instruction: “include /etc/ld.so.conf.d/*.conf”, establishing a directory in which new packages can place their own xyz.conf file to make their own lib directories available.
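To make the include convention concrete, here is a minimal Python sketch that reads an ld.so.conf style file and follows its include lines. It is not ldconfig’s own parser: the function name read_conf_dirs is invented, include patterns are assumed to be absolute paths, and “#” is assumed to start a comment. The demonstration builds a throwaway directory tree rather than touching /etc, and the directory names in it are hypothetical.

```python
import glob
import os
import tempfile


def read_conf_dirs(conf_path):
    """Return the library directories listed in an ld.so.conf style file."""
    dirs = []
    with open(conf_path) as handle:
        for raw in handle:
            line = raw.split("#", 1)[0].strip()  # assumed comment syntax
            if not line:
                continue
            words = line.split()
            if words[0] == "include":
                # Expand each pattern and read the matching files in turn.
                for pattern in words[1:]:
                    for included in sorted(glob.glob(pattern)):
                        dirs.extend(read_conf_dirs(included))
            else:
                dirs.append(line)
    return dirs


# Demonstration on a temporary tree standing in for /etc.
with tempfile.TemporaryDirectory() as root:
    conf_d = os.path.join(root, "ld.so.conf.d")
    os.mkdir(conf_d)
    with open(os.path.join(conf_d, "xyz.conf"), "w") as handle:
        handle.write("/opt/xyz/lib\n")  # hypothetical package directory
    main_conf = os.path.join(root, "ld.so.conf")
    with open(main_conf, "w") as handle:
        handle.write(f"include {conf_d}/*.conf\n/usr/local/lib\n")
    dirs = read_conf_dirs(main_conf)

print(dirs)
```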

ldconfig effect

A default (no args) invocation of ldconfig surveys the listed directories and reads the SONAME information from each library file. With this info, ldconfig performs two main actions:

  • Link: Create (or replace) a symbolic link whose name matches soname (eg: libxyz.so.3), pointing to the actual library file (eg: libxyz.so.3.4).
  • ld.so.cache item: Create or replace an item in ld.so.cache which cross-references soname to the full path to the just-mentioned symbolic link. (eg: libxyz.so.3 –> /usr/lib/libxyz.so.3)

Assuming that all dependent programs and libraries list required libraries (their NEEDED list) by their sonames, this ldconfig activity will result in the necessary info to allow ld-linux to find them when an executable program is loaded.
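The two ldconfig actions can be imitated with a short Python simulation. It does not read real ELF files: the SONAME of each file is passed in by hand, the “cache” is just a dictionary, and the name simulate_ldconfig is invented for this sketch. It runs in a temporary directory so that no system files are touched.

```python
import os
import tempfile


def simulate_ldconfig(directory, sonames):
    """Create soname symlinks in `directory` and return a {soname: path} table.

    `sonames` maps each library file name to the SONAME it declares; the
    real ldconfig reads that value from the file itself.
    """
    cache = {}
    for filename, soname in sorted(sonames.items()):
        link = os.path.join(directory, soname)
        if soname != filename:
            # Link: create (or replace) the soname link to the actual file.
            if os.path.lexists(link):
                os.remove(link)
            os.symlink(filename, link)
        # Cache item: soname -> full path of the link.
        cache[soname] = link
    return cache


with tempfile.TemporaryDirectory() as libdir:
    # An empty placeholder standing in for the actual library file.
    open(os.path.join(libdir, "libxyz.so.3.4"), "w").close()
    cache = simulate_ldconfig(libdir, {"libxyz.so.3.4": "libxyz.so.3"})
    target = os.readlink(cache["libxyz.so.3"])

print(target)
```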

 

