A More Detailed Analysis of u-boot.lds
Linker Script Format
Linker scripts are text files.
You write a linker script as a series of commands. Each command is either a keyword, possibly followed by arguments, or an assignment to a symbol. You may separate commands using semicolons. Whitespace is generally ignored.
Strings such as file or format names can normally be entered directly. If the file name contains a character such as a comma which would otherwise serve to separate file names, you may put the file name in double quotes. There is no way to use a double quote character in a file name.
You may include comments in linker scripts just as in C, delimited by ‘/*’ and ‘*/’. As in C, comments are syntactically equivalent to whitespace.
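An executable image (img) file must have exactly one global entry point. The address of this entry point is usually placed at offset 0x0 in ROM (Flash), so the linker has to be told where the entry point is, and that is done by editing the linker script.
Here we analyze the linker script u-boot.lds from u-boot-1.1.6, using the copy in the u-boot-1.1.6/board/smdk2410/ directory.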
OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
First, the official GNU explanation of OUTPUT_FORMAT:
Commands Dealing with Object File Formats
A couple of linker script commands deal with object file formats.
- OUTPUT_FORMAT(bfdname)
- OUTPUT_FORMAT(default, big, little)
The OUTPUT_FORMAT command names the BFD format to use for the output file. Using OUTPUT_FORMAT(bfdname) is exactly like using ‘--oformat bfdname’ on the command line. If both are used, the command line option takes precedence.
You can use OUTPUT_FORMAT with three arguments to use different formats based on the ‘-EB’ and ‘-EL’ command line options. This permits the linker script to set the output format based on the desired endianness. If neither ‘-EB’ nor ‘-EL’ are used, then the output format will be the first argument, default. If ‘-EB’ is used, the output format will be the second argument, big. If ‘-EL’ is used, the output format will be the third argument, little.
For example, the default linker script for the MIPS ELF target uses this command:
OUTPUT_FORMAT(elf32-bigmips, elf32-bigmips, elf32-littlemips)
This says that the default format for the output file is ‘elf32-bigmips’, but if the user uses the ‘-EL’ command line option, the output file will be created in the ‘elf32-littlemips’ format.
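Note: BFD is a special-purpose library. The linker accesses object and archive files using the BFD libraries. These libraries allow the linker to use the same routines to operate on object files whatever the object file format.
- OUTPUT_FORMAT(DEFAULT, BIG, LITTLE): this line specifies the file format of the output. Three formats are given, and the first one, DEFAULT, is used unless an option says otherwise.
- With the command-line option -EB the second BFD format is used; with -EL the third is used; otherwise the first is used.
- The three arguments give the output format for the default, big-endian, and little-endian cases. u-boot-1.1.6 names elf32-littlearm in all three positions, so the output is always 32-bit ELF, little-endian, for the ARM architecture.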
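The selection rule can be sketched in a few lines of Python. This is only a model of the behavior described above, not code taken from ld; the function name pick_output_format and its endian_flag parameter are made up for illustration.

```python
def pick_output_format(default, big, little, endian_flag=None):
    """Model how OUTPUT_FORMAT(default, big, little) picks a BFD name.

    endian_flag is "-EB", "-EL", or None when neither option is given.
    (Hypothetical helper; ld itself does not expose such a function.)
    """
    if endian_flag == "-EB":
        return big
    if endian_flag == "-EL":
        return little
    return default


# u-boot-1.1.6 lists the same name three times, so the flag cannot change the result.
uboot = ("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
# The MIPS example from the manual differs only in the little-endian slot.
mips = ("elf32-bigmips", "elf32-bigmips", "elf32-littlemips")

for flag in (None, "-EB", "-EL"):
    print(flag,
          pick_output_format(*uboot, endian_flag=flag),
          pick_output_format(*mips, endian_flag=flag))
```

Because all three arguments in u-boot.lds are identical, the three-argument form there behaves like the one-argument form.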
OUTPUT_ARCH(arm)
First, the official GNU explanation of OUTPUT_ARCH:
Other Linker Script Commands
OUTPUT_ARCH(bfdarch)
Specify a particular output machine architecture. The argument is one of the names used by the BFD library. You can see the architecture of an object file by using the objdump program with the ‘-f’ option.
Note: run man -S 1 ld to read the ld manual page, which also describes these commands.
- Sets the platform of the output executable to ARM.
- OUTPUT_ARCH(BFDARCH): sets the machine architecture of the output file. BFDARCH is one of the names used by the BFD library; objdump -f shows the architecture of an existing object file.
ENTRY(_start)
First, the official GNU explanation of ENTRY:
Setting the Entry Point
The first instruction to execute in a program is called the entry point. You can use the ENTRY linker script command to set the entry point. The argument is a symbol name:
ENTRY(symbol)
There are several ways to set the entry point. The linker will set the entry point by trying each of the following methods in order, and stopping when one of them succeeds:
- the ‘-e’ entry command-line option;
- the ENTRY(symbol) command in a linker script;
- the value of the symbol start, if defined;
- the address of the first byte of the ‘.text’ section, if present;
- the address 0.
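In other words:
ENTRY(SYMBOL): sets the entry address to the value of the symbol SYMBOL.
Entry point: the address, within the process address space, of the first user-space instruction the process executes.
ld can set the entry address in several ways, tried in the following order (the earlier the entry, the higher its priority):
1. The -e option on the ld command line
2. The ENTRY(SYMBOL) command in the linker script
3. The value of the symbol start, if it is defined
4. The address of the first byte of the .text section, if that section exists
5. The value 0
Note: here ENTRY(_start) names the function entry address used at startup. _start is defined in start.S in each CPU directory. The address the code actually runs from is set at build time by TEXT_BASE in /u-boot-1.1.6/board/smdk2410/config.mk, that is, TEXT_BASE = 0x33F80000.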
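The five-step order above can be written as a simple fallback chain. The Python below is a sketch of that order only; resolve_entry and its parameters are hypothetical names, and it assumes that any symbol named by -e or ENTRY is actually defined (it does not model what ld does when the symbol is missing).

```python
def resolve_entry(cmdline_entry=None, script_entry=None, symbols=None, text_addr=None):
    """Return (address, reason) following the five-step order listed above.

    symbols maps symbol names to addresses; text_addr is the start of the
    .text section, or None if there is no such section.
    Assumption: a symbol named by -e or ENTRY() exists in symbols.
    """
    symbols = symbols or {}
    if cmdline_entry is not None:          # 1. -e on the command line
        return symbols[cmdline_entry], "-e option"
    if script_entry is not None:           # 2. ENTRY(symbol) in the script
        return symbols[script_entry], "ENTRY command"
    if "start" in symbols:                 # 3. a symbol literally named start
        return symbols["start"], "start symbol"
    if text_addr is not None:              # 4. first byte of .text
        return text_addr, ".text start"
    return 0, "address 0"                  # 5. last resort


# u-boot.lds uses ENTRY(_start); the address is the TEXT_BASE quoted above.
syms = {"_start": 0x33F80000}
addr, why = resolve_entry(script_entry="_start", symbols=syms, text_addr=0x33F80000)
print(hex(addr), why)
```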
Before looking at SECTIONS, here is the official explanation of SECTIONS together with an example:
Simple Linker Script Example
Many linker scripts are fairly simple.
The simplest possible linker script has just one command: ‘SECTIONS’. You use the ‘SECTIONS’ command to describe the memory layout of the output file.
The ‘SECTIONS’ command is a powerful command. Here we will describe a simple use of it.
Let’s assume your program consists only of code, initialized data, and uninitialized data. These will be in the ‘.text’ (code), ‘.data’ (initialized data), and ‘.bss’ (uninitialized data) sections, respectively. Let’s assume further that these are the only sections which appear in your input files.
For this example, let’s say that the code should be loaded at address 0x10000, and that the data should start at address 0x8000000. Here is a linker script which will do that:
SECTIONS
{
. = 0x10000;
.text : { *(.text) }
. = 0x8000000;
.data : { *(.data) }
.bss : { *(.bss) }
}
You write the ‘SECTIONS’ command as the keyword ‘SECTIONS’, followed by a series of symbol assignments and output section descriptions enclosed in curly braces.
The first line inside the ‘SECTIONS’ command of the above example sets the value of the special symbol ‘.’, which is the location counter. If you do not specify the address of an output section in some other way (other ways are described later), the address is set from the current value of the location counter. The location counter is then incremented by the size of the output section. At the start of the ‘SECTIONS’ command, the location counter has the value ‘0’.
The second line defines an output section, ‘.text’. The colon is required syntax which may be ignored for now. Within the curly braces after the output section name, you list the names of the input sections which should be placed into this output section. The ‘*’ is a wildcard which matches any file name. The expression ‘*(.text)’ means all ‘.text’ input sections in all input files.
Since the location counter is ‘0x10000’ when the output section ‘.text’ is defined, the linker will set the address of the ‘.text’ section in the output file to be ‘0x10000’.
The remaining lines define the ‘.data’ and ‘.bss’ sections in the output file. The linker will place the ‘.data’ output section at address ‘0x8000000’. After the linker places the ‘.data’ output section, the value of the location counter will be ‘0x8000000’ plus the size of the ‘.data’ output section. The effect is that the linker will place the ‘.bss’ output section immediately after the ‘.data’ output section in memory.
The linker will ensure that each output section has the required alignment, by increasing the location counter if necessary. In this example, the specified addresses for the ‘.text’ and ‘.data’ sections will probably satisfy any alignment constraints, but the linker may have to create a small gap between the ‘.data’ and ‘.bss’ sections.
Note: the following explains the example above.
This script places the output file's text section at 0x10000 and its data section at 0x8000000:
SECTIONS
{
. = 0x10000;
.text : { *(.text) }
. = 0x8000000;
.data : { *(.data) }
.bss : { *(.bss) }
}
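Explanation of the example:
- . = 0x10000: sets the location counter to 0x10000 (if it is not set, its initial value is 0).
- .text : { *(.text) }: merges the .text sections of all input files (the * matches any input file) into a single .text section, whose address is the value of the location counter, 0x10000.
- . = 0x8000000: sets the location counter to 0x8000000.
- .data : { *(.data) }: merges the .data sections of all input files into a single .data section, whose address is 0x8000000.
- .bss : { *(.bss) }: merges the .bss sections of all input files into a single .bss section, whose address is 0x8000000 plus the size of the .data section.
After reading each section description, the linker increases the location counter by the size of that section (alignment constraints are ignored here for the moment).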
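To make the bookkeeping concrete, here is a small Python model of the location counter for this example. It is a sketch, not ld's algorithm: the layout function, the tuple encoding of the script, and the section sizes are all invented for illustration, and alignment is ignored just as in the explanation above.

```python
def layout(script, sizes):
    """Walk a simplified SECTIONS body and return {section: start address}.

    script is a list of ("set", address) or ("section", name) steps;
    sizes maps each section name to its size in bytes.
    """
    dot = 0                      # the location counter starts at 0
    placed = {}
    for kind, arg in script:
        if kind == "set":        # ". = address;"
            dot = arg
        else:                    # "name : { *(name) }"
            placed[arg] = dot    # the section starts at the current counter
            dot += sizes[arg]    # then the counter grows by the section size
    return placed


script = [("set", 0x10000), ("section", ".text"),
          ("set", 0x8000000), ("section", ".data"), ("section", ".bss")]
sizes = {".text": 0x400, ".data": 0x100, ".bss": 0x80}   # made-up sizes

addrs = layout(script, sizes)
for name, addr in addrs.items():
    print(f"{name:6s} 0x{addr:08x}")
```

With these sizes, .bss lands directly after .data because no assignment to the location counter separates them.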
Now for the SECTIONS command in u-boot.lds:
SECTIONS
{
. = 0x00000000;
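- The dot "." here is the location counter (a typical GNU idiom).
- This sets the location counter to 0x00000000 (if it is not set, its initial value is 0).
- It makes the image start at offset zero. Note that this is only an offset for the code; the actual start address is supplied at build time, when TEXT_BASE is passed to the linker with the -Ttext option.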
. = ALIGN(4);
- Aligns the location counter to a 4-byte boundary; likewise, ALIGN(0x10) would align it to a 16-byte boundary.
The official explanation:
ALIGN(exp)
Return the location counter (.) aligned to the next exp boundary. ALIGN doesn’t change the value of the location counter; it just does arithmetic on it.
Here is an example which aligns the output .data section to the next 0x2000 byte boundary after the preceding section and sets a variable within the section to the next 0x8000 boundary after the input sections:
SECTIONS { ...
.data ALIGN(0x2000) : {
*(.data)
variable = ALIGN(0x8000);
}
... }
The first use of ALIGN in this example specifies the location of a section because it is used as the optional address attribute of a section definition (see Section 3.6.3 [Output Section Address], page 37). The second use of ALIGN is used to define the value of a symbol.
The builtin function NEXT is closely related to ALIGN.
NEXT(exp)
Return the next unallocated address that is a multiple of exp. This function is closely related to ALIGN(exp); unless you use the MEMORY command to define discontinuous memory for the output file, the two functions are equivalent.
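The arithmetic behind ALIGN is the usual round-up-to-a-multiple computation. The Python helper below is a sketch of it; the name align and the sample addresses are illustrative, and the boundary is assumed to be a power of two.

```python
def align(dot, boundary):
    """Round dot up to the next multiple of boundary.

    boundary must be a power of two; a value that is already aligned
    is returned unchanged.
    """
    return (dot + boundary - 1) & ~(boundary - 1)


for dot in (0x1000, 0x1001, 0x1003, 0x1004):
    print(hex(dot), "->", hex(align(dot, 4)), hex(align(dot, 0x10)))
```

Like ALIGN itself, the helper only computes a value; the counter moves only when the result is assigned back, as in `. = ALIGN(4);`.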
For more on byte alignment, see this blog post:
http://www.yuanma.org/data/2006/0723/article_1213.htm
.text :
{
cpu/arm920t/start.o (.text) /* the .text of start.o goes first */
*(.text) /* followed by the .text of the remaining input files */
}
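This block merges the .text section of cpu/arm920t/start.o and the .text sections of all other input files into a single .text output section, with the code from start.o placed first. The address of the section is the value of the location counter after alignment.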
. = ALIGN(4);
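.rodata : { *(.rodata) } /* .rodata: read-only data */
This block first aligns to a 4-byte boundary and then merges the .rodata sections of all input files into a single .rodata section, whose address is the value of the location counter after alignment.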
. = ALIGN(4);
.data : { *(.data) } /* .data: readable and writable data */
Following the explanations above, this block should be self-explanatory.
. = ALIGN(4);
.got : { *(.got) } /* .got is the global offset table, a standard ELF section used for position-independent code, not a U-Boot-specific one */
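. = .; /* It is not clear to me why this is done. */
__u_boot_cmd_start = .;
/* Assign the current location to __u_boot_cmd_start, marking the start of the .u_boot_cmd section */
.u_boot_cmd : { *(.u_boot_cmd) }
__u_boot_cmd_end = .;
/* Assign the current location to __u_boot_cmd_end, marking the end of the .u_boot_cmd section */
/*
armboot_end_data = .; armboot_end_data would point to the end of all sections allocated so far
*/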
. = ALIGN(4);
__bss_start = .;
/* start of the .bss section */
.bss : { *(.bss) }
_end = .;
/* end of the .bss section */
}
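Putting the pieces together, the Python sketch below walks the same sequence of statements as the script above and records where each section and marker symbol would land. It is a rough model under stated assumptions: every section size is invented, .text is assumed to start at the TEXT_BASE value 0x33F80000 mentioned earlier, and the only alignment applied is the explicit ALIGN(4) (a real link also honors each section's own alignment). The names uboot_layout and align are hypothetical.

```python
def align(dot, boundary):
    """Round dot up to a multiple of boundary (a power of two)."""
    return (dot + boundary - 1) & ~(boundary - 1)


def uboot_layout(text_base, sizes):
    """Return (sections, symbols) for the statement order in u-boot.lds."""
    dot = text_base
    sections, symbols = {}, {}
    # Each of these sections is preceded by ". = ALIGN(4);" in the script.
    for name in (".text", ".rodata", ".data", ".got"):
        dot = align(dot, 4)
        sections[name] = dot
        dot += sizes[name]
    symbols["__u_boot_cmd_start"] = dot      # marks the start of the command table
    sections[".u_boot_cmd"] = dot
    dot += sizes[".u_boot_cmd"]
    symbols["__u_boot_cmd_end"] = dot        # marks the end of the command table
    dot = align(dot, 4)
    symbols["__bss_start"] = dot
    sections[".bss"] = dot
    dot += sizes[".bss"]
    symbols["_end"] = dot
    return sections, symbols


# Made-up sizes, chosen so that some of the ALIGN(4) steps actually move the counter.
sizes = {".text": 0x12345, ".rodata": 0x2001, ".data": 0xABC,
         ".got": 0x40, ".u_boot_cmd": 0x600, ".bss": 0x4321}
sections, symbols = uboot_layout(0x33F80000, sizes)
for name, addr in {**sections, **symbols}.items():
    print(f"{name:20s} 0x{addr:08x}")
```

The two __u_boot_cmd markers bracket .u_boot_cmd, so their difference is the size of that section, and __bss_start and _end bracket .bss in the same way.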
Finally, here is the official explanation of the location counter:
The Location Counter
The special linker variable dot ‘.’ always contains the current output location counter. Since the ‘.’ always refers to a location in an output section, it may only appear in an expression within a SECTIONS command. The ‘.’ symbol may appear anywhere that an ordinary symbol is allowed in an expression.
Assigning a value to ‘.’ will cause the location counter to be moved. This may be used to create holes in the output section. The location counter may never be moved backwards.
SECTIONS
{
output :
{
file1(.text)
. = . + 1000;
file2(.text)
. += 1000;
file3(.text)
} = 0x12345678;
}
In the previous example, the ‘.text’ section from ‘file1’ is located at the beginning of the output section ‘output’. It is followed by a 1000 byte gap. Then the ‘.text’ section from ‘file2’ appears, also with a 1000 byte gap following before the ‘.text’ section from ‘file3’. The notation ‘= 0x12345678’ specifies what data to write in the gaps (see Section 3.6.8.5 [Output Section Fill], page 45).
Note: ‘.’ actually refers to the byte offset from the start of the current containing object. Normally this is the SECTIONS statement, whose start address is 0, hence ‘.’ can be used as an absolute address. If ‘.’ is used inside a section description however, it refers to the byte offset from the start of that section, not an absolute address. Thus in a script like this:
SECTIONS
{
. = 0x100;
.text : {
*(.text)
. = 0x200;
}
. = 0x500;
.data : {
*(.data)
. += 0x600;
}
}
The ‘.text’ section will be assigned a starting address of 0x100 and a size of exactly 0x200 bytes, even if there is not enough data in the ‘.text’ input sections to fill this area. (If there is too much data, an error will be produced because this would be an attempt to move ‘.’ backwards). The ‘.data’ section will start at 0x500 and it will have an extra 0x600 bytes worth of space after the end of the values from the ‘.data’ input sections and before the end of the ‘.data’ output section itself.
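The section-relative meaning of the location counter can be checked with a short Python model. This is a sketch with invented input sizes; section_size and its tuple encoding of the assignments are hypothetical, and the error raised stands in for the linker's complaint about moving the counter backwards.

```python
def section_size(input_size, assignments):
    """Return the final size of an output section.

    Inside a section description the location counter is an offset from the
    start of that section, so it begins at input_size once the input sections
    are placed. assignments is a list of ("=", value) or ("+=", value) steps.
    """
    dot = input_size
    for op, value in assignments:
        new_dot = value if op == "=" else dot + value
        if new_dot < dot:
            raise ValueError("location counter cannot move backwards")
        dot = new_dot
    return dot


# .text: 0x150 bytes of input, then ". = 0x200;" pads the section to exactly 0x200.
print(hex(section_size(0x150, [("=", 0x200)])))
# .data: 0x80 bytes of input, then ". += 0x600;" appends 0x600 bytes of space.
print(hex(section_size(0x80, [("+=", 0x600)])))
# Too much input for ". = 0x200;" would mean moving backwards, which is an error.
try:
    section_size(0x250, [("=", 0x200)])
except ValueError as err:
    print("error:", err)
```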