[Manual] ld Linker Scripts

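Program Sections

  • .text: code

  • .rodata: constants

  • .data: initialized global variables

  • .bss: uninitialized global variables

  • .stack: function call stack frames

  • .heap: dynamically allocated memory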

ld

  • -T scriptfile
    --script=scriptfile
    Use scriptfile as the linker script.  
    This script replaces ld's default linker script (rather than adding to it), so scriptfile must specify everything necessary to describe the output file.
    If scriptfile does not exist in the current directory, "ld" looks for it in the directories specified by any preceding -L options.
    Multiple -T options accumulate.
  • -M
    --print-map
    Print a link map to the standard output.  A link map provides information about the link, including the following:
    ·   Where object files are mapped into memory.
    ·   How common symbols are allocated.
    ·   All archive members included in the link, with a mention of the symbol which caused the archive member to be brought in.
    ·   The values assigned to symbols.
    Note - symbols whose values are computed by an expression which involves a reference to a previous value of the same symbol may not have the correct result displayed in the link map.  This is because the linker discards intermediate results and only retains the final value of an expression.  Under such circumstances the linker will display the final value enclosed by square brackets.  Thus, for example, a linker script containing:
    
            foo = 1
            foo = foo * 4
            foo = foo + 8
    
        will produce the following output in the link map if the -M option is used:
    
            0x00000001                  foo = 0x1
            [0x0000000c]                foo = (foo * 0x4)
            [0x0000000c]                foo = (foo + 0x8)
    
    See Expressions for more information about expressions in linker scripts.

scriptfile

Command Language controls:

  • input files
  • file formats
  • output file layout
  • addresses of sections
  • placement of common blocks

Linker Scripts

  • SECTIONS: specifies a “picture” of the output file’s layout
  • MEMORY: describes the available memory in the target architecture

Expressions

The syntax for expressions in the command language is identical to that of C expressions, with the following features:

  • All expressions are evaluated as integers and are of “long” or “unsigned long” type.
  • All constants are integers.
  • All of the C arithmetic operators are provided.
  • You may reference, define, and create global variables.
  • You may call special purpose built-in functions.

Integers

  • octal integer: _as_octal = 0157255;
  • decimal integer: _as_decimal = 57005;
  • hexadecimal integer: _as_hex = 0xdead;
  • negative integer: _as_neg = -57005;
  • scaled constant (the suffixes K and M multiply by 1024 and 1024*1024): _fourk_1 = 4K; _fourk_2 = 4096; _fourk_3 = 0x1000;
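
As a cross-check of the forms listed above, the following Python sketch reads each literal the way the list describes: a leading 0x means hexadecimal, a leading 0 means octal, and a trailing K or M multiplies by 1024 or 1024*1024. It is a model written for these notes, not ld's own lexer, and `parse_ld_int` is a hypothetical helper name.

```python
def parse_ld_int(text: str) -> int:
    """Model how the integer forms listed above are read (not ld's real parser)."""
    s = text.strip()
    negative = s.startswith("-")
    if negative:
        s = s[1:].strip()

    # A trailing K or M scales the constant.
    scale = 1
    if s.endswith("K"):
        scale, s = 1024, s[:-1]
    elif s.endswith("M"):
        scale, s = 1024 * 1024, s[:-1]

    if s.lower().startswith("0x"):
        value = int(s, 16)          # hexadecimal
    elif len(s) > 1 and s.startswith("0"):
        value = int(s, 8)           # octal
    else:
        value = int(s, 10)          # decimal

    value *= scale
    return -value if negative else value


# The octal, decimal and hexadecimal examples all denote the same number.
print(parse_ld_int("0157255"), parse_ld_int("57005"), parse_ld_int("0xdead"))
# The three _fourk_ spellings agree as well.
print(parse_ld_int("4K"), parse_ld_int("4096"), parse_ld_int("0x1000"))
```

Only the forms shown in the list are handled; anything else ld may accept is outside this sketch.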

Symbol Names

  • unquoted symbol names: start with a letter, underscore, or point and may include any letters, underscores, digits, points, and hyphens. e.g. ‘A-B’
  • quoted symbol names: e.g. “with a space”

The Location Counter

  • . : the current output location counter

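    SECTIONS
    {
    output :
    {
    file1(.text)        /* file1 is placed at the start of the output section */
    . = . + 1000;       /* leave a 1000-byte gap */
    file2(.text)        /* file2 follows the gap after file1 */
    . += 1000;          /* leave another 1000-byte gap */
    file3(.text)        /* file3 follows the gap after file2 */
    } = 0x1234;         /* the gaps are filled with 0x1234 */
    }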
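
Because `. = . + 1000;` moves the counter relative to its current value, where file2 and file3 end up depends on how large the preceding input sections are. The Python sketch below models that bookkeeping. The section sizes are invented for illustration, `text_offsets` is a hypothetical helper, and any alignment padding ld might insert between input sections is ignored.

```python
def text_offsets(sizes, gap=1000):
    """Offsets of each file's .text inside the output section.

    sizes: list of (file name, size in bytes); the sizes are made-up inputs.
    """
    dot = 0                      # location counter, relative to the section start
    offsets = {}
    for index, (name, size) in enumerate(sizes):
        if index > 0:
            dot += gap           # models ". = . + 1000;" and ". += 1000;"
        offsets[name] = dot      # the input section is placed at the current "."
        dot += size              # placing it advances "." by its size
    return offsets, dot


offsets, end = text_offsets([("file1", 512), ("file2", 256), ("file3", 128)])
print(offsets)  # {'file1': 0, 'file2': 1512, 'file3': 2768}
print(end)      # 2896
```

With these sizes file2 starts at 512 + 1000 = 1512, not at 1000: the gap is measured from the end of file1.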

Operators

  • standard C set of arithmetic operators, with the standard bindings and precedence levels

Evaluation

  • lazy evaluation: it only calculates an expression when absolutely necessary

Assignment: Defining Symbols

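  • as commands in their own right in an ld script (the symbol gets an absolute address);

  • as independent statements within a SECTIONS command (equivalent to the first case: an absolute address);

  • as part of the contents of a section definition in a SECTIONS command (the symbol's address is relative to that section). To force an absolute value there, wrap the expression in ABSOLUTE(), as _edata does here:

    SECTIONS{ ...
    .data : 
      {
        *(.data)
        _edata = ABSOLUTE(.) ;
      } 
    ... }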

Arithmetic Functions

  • ABSOLUTE(exp)

    Return the absolute (non-relocatable, as opposed to non-negative) value of the expression exp.

  • ADDR(section)

    Return the absolute address of the named section. In the following example, symbol_1 and symbol_2 are assigned identical values:

    SECTIONS{ ...
    .output1 :
      { 
      start_of_output_1 = ABSOLUTE(.);
      ...
      }
    .output :
      {
      symbol_1 = ADDR(.output1);
      symbol_2 = start_of_output_1;
      }
    ... }
  • ALIGN(exp)

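    Return the result of the current location counter (.) aligned to the next exp boundary. This is equivalent to (. + exp - 1) & ~(exp - 1), which assumes exp is a power of two. ALIGN only does arithmetic on the location counter; it does not change it.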

    SECTIONS{ ...
    .data ALIGN(0x2000): {
      *(.data)
      variable = ALIGN(0x8000);
    }
    ... }
  • DEFINED(symbol)

    Return 1 if symbol is in the linker global symbol table and is defined, otherwise return 0.

  • NEXT(exp)

    Return the next unallocated address that is a multiple of exp.

  • SIZEOF(section)

    Return the size in bytes of the named section, if that section has been allocated. In the following example, symbol_1 and symbol_2 are assigned identical values:

    SECTIONS{ ...
    .output : {
      .start = . ;
      ...
      .end = . ;
      }
    symbol_1 = .end - .start ;
    symbol_2 = SIZEOF(.output);
    ... }
  • SIZEOF_HEADERS

    sizeof_headers

    Return the size in bytes of the output file’s headers.
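
The formula given for ALIGN(exp) in the list above can be checked with a few lines of Python. This is a sketch of the arithmetic only: `align_up` is a hypothetical name, and the location-counter values passed to it are invented.

```python
def align_up(dot: int, exp: int) -> int:
    """Round dot up to the next multiple of exp: (. + exp - 1) & ~(exp - 1)."""
    if exp <= 0 or exp & (exp - 1):
        raise ValueError("exp must be a power of two for the mask to work")
    return (dot + exp - 1) & ~(exp - 1)


# Suppose the section before .data ended at 0x1234 (an assumed value):
print(hex(align_up(0x1234, 0x2000)))  # 0x2000, where ".data ALIGN(0x2000)" would start
# An address that is already aligned is returned unchanged.
print(hex(align_up(0x2000, 0x2000)))  # 0x2000
# Suppose *(.data) ended at 0x2abc; "variable = ALIGN(0x8000);" would then get:
print(hex(align_up(0x2abc, 0x8000)))  # 0x8000
```

The power-of-two check is there because the mask `~(exp - 1)` only clears the right bits when exp has a single bit set.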

Memory Layout

  • The MEMORY command describes the location and size of blocks of memory in the target.

    MEMORY 
    {
      name (attr) : ORIGIN = origin, LENGTH = len
      ...
    }
    
    • name

    is a name used internally by the linker to refer to the region. Any symbol name may be used. The region names are stored in a separate name space, and will not conflict with symbols, file names or section names. Use distinct names to specify multiple regions.

    • (attr)

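    is an optional list of attributes, permitted for compatibility with the AT&T linker but not used by ld beyond checking that the attribute list is valid (this describes older ld releases; current GNU ld also uses the attributes to decide where to place sections that are not explicitly assigned to a region). Valid attribute lists must be made up of the characters “LIRWX”. If you omit the attribute list, you may omit the parentheses around it as well.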

    • origin

    is the start address of the region in physical memory. It is an expression that must evaluate to a constant before memory allocation is performed. The keyword ORIGIN may be abbreviated to org or o (but not, for example, `ORG’).

    • len

    is the size in bytes of the region (an expression). The keyword LENGTH may be abbreviated to len or l.

  • e.g.

    MEMORY 
      {
      rom : ORIGIN = 0, LENGTH = 256K
      ram : org = 0x40000000, l = 4M
      }
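
To make the two regions concrete, the Python sketch below records the origins and lengths from the example above and answers two questions: which region contains a given address, and whether a block of a given size fits. It is a model for these notes, not part of ld; `REGIONS`, `region_of` and `fits` are hypothetical names.

```python
K, M = 1024, 1024 * 1024

# The two regions from the MEMORY example: name -> (origin, length in bytes).
REGIONS = {
    "rom": (0x00000000, 256 * K),
    "ram": (0x40000000, 4 * M),
}


def region_of(address):
    """Name of the region containing address, or None if it is unmapped."""
    for name, (origin, length) in REGIONS.items():
        if origin <= address < origin + length:
            return name
    return None


def fits(region, start, size):
    """True if the range [start, start + size) lies entirely inside the region."""
    origin, length = REGIONS[region]
    return origin <= start and start + size <= origin + length


print(region_of(0x3FFFF))            # rom (last byte of the 256K region)
print(region_of(0x40000))            # None (one byte past the end of rom)
print(fits("rom", 0x3F000, 0x1000))  # True (ends exactly at the region limit)
print(fits("rom", 0x3F000, 0x1001))  # False (one byte too large)
```

A section that fails this kind of check is what the linker reports as a region overflow.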

Specifying Output Sections

  • Statements within the SECTIONS command can do one of three things:
    • define the entry point;
    • assign a value to a symbol;
    • describe the placement of a named output section, and which input sections go into it.

Section Definitions


  • SECTIONS { ...
    secname : {
    contents
    }
    ... }

Section Placement

  • filename

    You may simply name a particular input file to be placed in the current output section; all sections from that file are placed in the current section definition. If the file name has already been mentioned in another section definition, with an explicit section name list, then only those sections which have not yet been allocated are used. To specify a list of particular files by name:

    .data : { afile.o bfile.o cfile.o }

  • filename( section )

    filename( section, section, … )

    filename( section section … )

    You can name one or more sections from your input files, for insertion in the current output section. If you wish to specify a list of input-file sections inside the parentheses, you may separate the section names by either commas or whitespace.

  • * (section)

    * (section, section, …)

    * (section section …)

    Instead of explicitly naming particular input files in a link control script, you can refer to all files from the ld command line: use * instead of a particular file name before the parenthesized input-file section list. If you have already explicitly included some files by name, * refers to all remaining files–those whose places in the output file have not yet been defined. For example, to copy sections 1 through 4 from an Oasys file into the .text section of an a.out file, and sections 13 and 14 into the .data section:

    SECTIONS {
    .text :{
      *("1" "2" "3" "4")
    }
    
    .data :{
      *("13" "14")
    }
    }

    [ section ... ] used to be accepted as an alternate way to specify named sections from all unallocated input files. Because some operating systems (VMS) allow brackets in file names, that notation is no longer supported.

  • filename( COMMON )

    * ( COMMON )
    Specify where in your output file to place uninitialized data with this notation. *(COMMON) by itself refers to all uninitialized data from all input files (so far as it is not yet allocated); filename(COMMON) refers to uninitialized data from a particular file. Both are special cases of the general mechanisms for specifying where to place input-file sections: ld permits you to refer to uninitialized data as if it were in an input-file section named COMMON, regardless of the input file’s format.

Section Data Expressions

  • CREATE_OBJECT_SYMBOLS

    Create a symbol for each input file in the current section, set to the address of the first byte of data written from that input file.

  • symbol = expression ;

    symbol f= expression ;

    symbol is any symbol name. “f=” refers to any of the operators &= += -= *= /= which combine arithmetic and assignment.

  • BYTE(expression)

    SHORT(expression)

    LONG(expression)

    QUAD(expression)

    By including one of these four statements in a section definition, you can explicitly place one, two, four, or eight bytes (respectively) at the current address of that section. QUAD is only supported when using a 64 bit host or target.

  • FILL(expression)

    Specify the “fill pattern” for the current section.
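
The widths of the four data statements listed above (one, two, four and eight bytes) can be illustrated with Python's struct module. This is only a model: the byte order is an assumed parameter here (in a real link it follows the output format), truncating oversized values to the statement's width is a choice made for this sketch, and `emit` is a hypothetical helper.

```python
import struct

# Statement name -> struct format code with the matching width.
WIDTHS = {"BYTE": "B", "SHORT": "H", "LONG": "I", "QUAD": "Q"}


def emit(statement: str, value: int, little_endian: bool = True) -> bytes:
    """Bytes that BYTE/SHORT/LONG/QUAD(value) would occupy in this model."""
    code = WIDTHS[statement]
    size = struct.calcsize("<" + code)        # 1, 2, 4 or 8
    value &= (1 << (8 * size)) - 1            # keep only the bits that fit (sketch's choice)
    prefix = "<" if little_endian else ">"
    return struct.pack(prefix + code, value)


for name in ("BYTE", "SHORT", "LONG", "QUAD"):
    print(name, len(emit(name, 0)))           # widths 1, 2, 4, 8

print(emit("LONG", 0x12345678).hex())                       # 78563412
print(emit("LONG", 0x12345678, little_endian=False).hex())  # 12345678
```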

Optional Section Attributes

  • SECTIONS {
    ...
    secname start BLOCK(align) (NOLOAD) : AT ( ldadr )
    { contents } >region =fill
    ...
    }
  • start
    You can force the output section to be loaded at a specified address by specifying start immediately following the section name. start can be represented as any expression.

    The following example generates section output at location 0x40000000:
    SECTIONS {

    output 0x40000000: {

    }

    }

  • BLOCK(align)

    You can include BLOCK() specification to advance the location counter . prior to the beginning of the section, so that the section will begin at the specified alignment. align is an expression.

  • (NOLOAD)
    Use (NOLOAD) to mark a section as not loadable: it is still assigned an address and its symbols can be referenced, but its contents are not loaded into memory when the program is run. For example, in the script sample below, the ROM section is addressed at memory location `0’ and does not need to be loaded when the program is run:
    SECTIONS {
    ROM 0 (NOLOAD) : { … }

    }

  • AT ( ldadr )
    The expression ldadr that follows the AT keyword specifies the load address of the section. The default (if you do not use the AT keyword) is to make the load address the same as the relocation address. This feature is designed to make it easy to build a ROM image.

    For example, this SECTIONS definition creates two output sections: one called .text, which starts at 0x1000, and one called .mdata, which is loaded at the end of the `.text’ section even though its relocation address is 0x2000. The symbol _data is defined with the value 0x2000:

    SECTIONS
      {
      .text 0x1000 : { *(.text) _etext = . ; }
      .mdata 0x2000 : 
          AT ( ADDR(.text) + SIZEOF ( .text ) )
          { _data = . ; *(.data); _edata = . ;  }
      .bss 0x3000 :
          { _bstart = . ;  *(.bss) *(COMMON) ; _bend = . ;}
    }

  • >region

    Assign this section to a previously defined region of memory.

  • =fill

    Including =fill in a section definition specifies the initial fill value for that section.
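
In the AT example above, .mdata is stored right after .text in the load image but linked to run at 0x2000, so startup code has to copy it into place before the program uses it. The Python sketch below models that copy, and the zeroing of .bss, on a flat bytearray. The section sizes and the helper name `startup_copy` are made up for illustration; a real system would do this in C or assembly using the symbols _etext, _data, _edata, _bstart and _bend from the script.

```python
def startup_copy(mem, etext, data, edata, bstart, bend):
    """Copy initialized data from its load address to its run address, then zero .bss."""
    src, dst = etext, data            # load address = ADDR(.text) + SIZEOF(.text) = _etext
    while dst < edata:
        mem[dst] = mem[src]
        src += 1
        dst += 1
    for address in range(bstart, bend):
        mem[address] = 0              # .bss must read as zero


# Assumed sizes: .text is 0x10 bytes, .mdata is 4 bytes, .bss is 8 bytes.
mem = bytearray(b"\xff" * 0x4000)
etext = 0x1000 + 0x10
mem[etext:etext + 4] = b"\x01\x02\x03\x04"   # initial values as stored after .text
startup_copy(mem, etext, data=0x2000, edata=0x2004, bstart=0x3000, bend=0x3008)

print(mem[0x2000:0x2004].hex())  # 01020304
print(mem[0x3000:0x3008].hex())  # 0000000000000000
```

The point of the model is the two addresses: the bytes live at _etext in the image, and only after the copy do they appear at _data, where the linked code expects them.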

The Entry Point

  • entry point: ld uses the first of the following that is available, in descending order of priority:
    • the `-e’ entry command-line option;

    • the ENTRY(symbol) command in a linker control script;

    • the value of the symbol start, if present;

    • the address of the first byte of the .text section, if present;

    • the address 0.
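
The list above reads as a fallback chain, which the following Python sketch spells out. It is a model for these notes only: `pick_entry` is a hypothetical helper, and each argument stands for an already-resolved address (or None when that candidate is absent).

```python
def pick_entry(e_option=None, entry_command=None, start_symbol=None, text_start=None):
    """Return (source, address) of the first available candidate, highest priority first."""
    candidates = [
        ("-e option", e_option),
        ("ENTRY(symbol)", entry_command),
        ("symbol start", start_symbol),
        ("first byte of .text", text_start),
    ]
    for source, address in candidates:
        if address is not None:       # compare with None so that address 0 still counts
            return source, address
    return "address 0", 0             # last resort


print(pick_entry(entry_command=0x1040, text_start=0x1000))  # ('ENTRY(symbol)', 4160)
print(pick_entry(text_start=0x1000))                        # ('first byte of .text', 4096)
print(pick_entry())                                         # ('address 0', 0)
```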

Option Commands

  • CONSTRUCTORS

This command ties up C++ style constructor and destructor records. The details of the constructor representation vary from one object format to another, but usually lists of constructors and destructors appear as special sections. The CONSTRUCTORS command specifies where the linker is to place the data from these sections, relative to the rest of the linked output. Constructor data is marked by the symbol __CTOR_LIST__ at the start, and __CTOR_LIST_END at the end; destructor data is bracketed similarly, between __DTOR_LIST__ and __DTOR_LIST_END. (The compiler must arrange to actually run this code; GNU C++ calls constructors from a subroutine __main, which it inserts automatically into the startup code for main, and destructors from _exit.)

  • FLOAT

    NOFLOAT

These keywords were used in some older linkers to request a particular math subroutine library. ld doesn’t use the keywords, assuming instead that any necessary subroutines are in libraries specified using the general mechanisms for linking to archives; but to permit the use of scripts that were written for the older linkers, the keywords FLOAT and NOFLOAT are accepted and ignored.

  • FORCE_COMMON_ALLOCATION

This command has the same effect as the -d command-line option: to make ld assign space to common symbols even if a relocatable output file is specified (`-r’).

  • INPUT ( file, file, … )

    INPUT ( file file … )

Use this command to include binary input files in the link, without including them in a particular section definition. Specify the full name for each file, including .a if required. ld searches for each file through the archive-library search path, just as for files you specify on the command line. See the description of -L in Command Line Options. If you use -lfile, ld will transform the name to libfile.a as with the command line argument `-l’.

  • GROUP ( file, file, … )

    GROUP ( file file … )

This command is like INPUT, except that the named files should all be archives, and they are searched repeatedly until no new undefined references are created. See the description of `-(’ in Command Line Options.

  • OUTPUT ( filename )

Use this command to name the link output file filename. The effect of OUTPUT(filename) is identical to the effect of `-o filename’, which overrides it. You can use this command to supply a default output-file name other than a.out.

  • OUTPUT_ARCH ( bfdname )

Specify a particular output machine architecture, with one of the names used by the BFD back-end routines (see section BFD). This command is often unnecessary; the architecture is most often set implicitly by either the system BFD configuration or as a side effect of the OUTPUT_FORMAT command.

  • OUTPUT_FORMAT ( bfdname )

When ld is configured to support multiple object code formats, you can use this command to specify a particular output format. bfdname is one of the names used by the BFD back-end routines (see section BFD). The effect is identical to the effect of the `--oformat’ command-line option (written `-oformat’ in older releases). This selection affects only the output file; the related command TARGET affects primarily input files.

  • SEARCH_DIR ( path )

Add path to the list of paths where ld looks for archive libraries. SEARCH_DIR(path) has the same effect as `-Lpath’ on the command line.

  • STARTUP ( filename )

Ensure that filename is the first input file used in the link process.

  • TARGET ( format )

When ld is configured to support multiple object code formats, you can use this command to change the input-file object code format (like the command-line option -b or its synonym --format, written -format in older releases). The argument format is one of the strings used by BFD to name binary formats. If TARGET is specified but OUTPUT_FORMAT is not, the last TARGET argument is also used as the default format for the ld output file. If you don’t use the TARGET command, ld uses the value of the environment variable GNUTARGET, if available, to select the output file format. If that variable is also absent, ld uses the default format configured for your machine in the BFD libraries.

mapfile
