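Booting
Originally, the very first part of the Linux kernel was written in 8086 assembly language. When it starts running, it moves itself to absolute address 0x90000, loads the next 2 KB from the boot device to address 0x90200, and finally loads the rest of the kernel to address 0x10000.
While the system is being loaded, the message "Loading..." is displayed. Once loading is complete, control passes to another piece of real-mode assembly code, boot/Setup.S. The setup code first configures some of the hardware devices and then moves the kernel from 0x10000 down to 0x1000. The system then switches to protected mode and starts executing the code at 0x1000.
Next comes kernel decompression. The code at 0x1000 comes from Boot/head.S; it initializes the registers and calls decompress_kernel( ), which is built from Boot/inflate.c, Boot/unzip.c and Boot/misc.c. The decompressed data is placed at 0x100000 (1 MB), which is the main reason Linux cannot run on a machine with less than 2 MB of memory.
The decompressed code starts executing at 0x100000. All of the 32-bit setup is then completed: the IDT, GDT and LDT are loaded, the processor is initialized, paging is set up, and finally start_kernel is called. This is probably the most intricate part of the whole kernel.
[System starts running]
Execution of the kernel proper begins at the assembly label startup_32, which leads to the first C code: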
|startup_32:
|start_kernel
|lock_kernel
|trap_init
|init_IRQ
|sched_init
|softirq_init
|time_init
|console_init
|#ifdef CONFIG_MODULES
|init_modules
|#endif
|kmem_cache_init
|sti
|calibrate_delay
|mem_init
|kmem_cache_sizes_init
|pgtable_cache_init
|fork_init
|proc_caches_init
|vfs_caches_init
|buffer_init
|page_cache_init
|signals_init
|#ifdef CONFIG_PROC_FS
|proc_root_init
|#endif
|#if defined(CONFIG_SYSVIPC)
|ipc_init
|#endif
|check_bugs
|smp_init
|rest_init
|kernel_thread
|unlock_kernel
|cpu_idle
·startup_32 [arch/i386/kernel/head.S]
·start_kernel [init/main.c]
·lock_kernel [include/asm/smplock.h]
·trap_init [arch/i386/kernel/traps.c]
·init_IRQ [arch/i386/kernel/i8259.c]
·sched_init [kernel/sched.c]
·softirq_init [kernel/softirq.c]
·time_init [arch/i386/kernel/time.c]
·console_init [drivers/char/tty_io.c]
·init_modules [kernel/module.c]
·kmem_cache_init [mm/slab.c]
·sti [include/asm/system.h]
·calibrate_delay [init/main.c]
·mem_init [arch/i386/mm/init.c]
·kmem_cache_sizes_init [mm/slab.c]
·pgtable_cache_init [arch/i386/mm/init.c]
·fork_init [kernel/fork.c]
·proc_caches_init
·vfs_caches_init [fs/dcache.c]
·buffer_init [fs/buffer.c]
·page_cache_init [mm/filemap.c]
·signals_init [kernel/signal.c]
·proc_root_init [fs/proc/root.c]
·ipc_init [ipc/util.c]
·check_bugs [include/asm/bugs.h]
·smp_init [init/main.c]
·rest_init
·kernel_thread [arch/i386/kernel/process.c]
·unlock_kernel [include/asm/smplock.h]
·cpu_idle [arch/i386/kernel/process.c]
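start_kernel( ) initializes the various parts of the kernel, including:
*Setting the memory bounds and calling paging_init( ) to initialize the memory pages.
*Initializing the traps, IRQ channels and scheduling.
*Parsing the command line.
*Initializing the device drivers and the disk buffer cache.
*Calibrating the delay loop.
The last function called, rest_init, does the following:
·Spawns the kernel thread 'init'
·Calls unlock_kernel
·Enters the kernel's cpu_idle loop, which spins forever whenever there is nothing to schedule
In fact start_kernel never returns; it ends up executing cpu_idle endlessly.
Finally (in early kernel versions), the kernel calls move_to_user_mode( ) in order to create the init process. After that, process 0 enters its infinite loop.
Once the init process has started executing one of /etc/init, /bin/init or /sbin/init, the kernel no longer directly controls what runs. From then on its job is mainly to provide system calls to processes and to handle asynchronous interrupt events. Multitasking is now in place, and the system starts handling user logins and processes created with fork( ).
[init]
init is the first process, or rather the first kernel thread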
|init
|lock_kernel
|do_basic_setup
|mtrr_init
|sysctl_init
|pci_init
|sock_init
|start_context_thread
|do_init_calls
|(*call())-> kswapd_init
|prepare_namespace
|free_initmem
|unlock_kernel
|execve
Files involved
./arch/$ARCH/boot/bootsect.s
./arch/$ARCH/boot/setup.s
bootsect.S
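This is the first program of the Linux kernel and contains Linux's own bootstrap
code. Before describing it, we first have to describe what an ordinary IBM PC
does when it is switched on ("switched on" here means turning on the PC's power):
When power is applied, a PC starts executing at memory address FFFF:0000 (this
address is always inside the ROM BIOS, which normally occupies FE000h to FFFFFh).
The instruction found there is a jump to another location in the ROM BIOS, where
a series of actions begins: checking the RAM, keyboard, display, floppy and hard
disks, and so on. These actions are carried out by the system test code. They
differ slightly between BIOS vendors but are much the same everywhere; you can
watch the check messages your own machine prints on the screen when it starts.
After the system test code, control is handed to the ROM bootstrap routine. This
routine reads track 0, sector 0 of the disk into memory (this is the so-called
boot sector; if you have ever dealt with computer viruses you have probably
heard of it). Where in memory is it read to? To absolute location 07C0:0000
(that is, 07C00h), which is a characteristic of the IBM PC family. The boot
sector of a Linux boot disk holds Linux's bootsect program, which means bootsect
is the first program read into memory and executed. Now we can look at what
bootsect actually does.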
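The addresses used throughout this walkthrough (FFFF:0000, 07C0:0000, segment 0x9000, segment 0x9020) are real-mode segment:offset pairs. As a minimal sketch, assuming plain 8086 addressing with 20 address lines, the conversion to a linear address could be written in C as follows. The name real_mode_linear is made up for illustration and does not appear in the kernel.

```c
#include <stdint.h>

/*
 * Real-mode address translation: linear = segment * 16 + offset.
 * The result is wrapped to 20 bits, which is what an 8086 (or a later
 * CPU with the A20 line disabled) does.
 * Hypothetical helper, for illustration only.
 */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For example, real_mode_linear(0x07C0, 0) gives 0x7C00, the place where the BIOS puts the boot sector, and real_mode_linear(0x9020, 0) gives 0x90200, where setup ends up.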
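Step 1
First, bootsect moves "itself" from absolute address 0x7C00, where the ROM BIOS
loaded it, to 0x90000. It then uses a jmpi (inter-segment jump) instruction to
continue at the line following the jmpi, but in the new copy.
Step 2
Next, the other segment registers, DS, ES and SS, are all pointed at 0x9000 to
match CS. SP and DI are set to an arbitrarily chosen offset (0x4000-12 in the
listing below); this address is used in a moment to hold the disk parameter table.
Step 3
Then function 0 of BIOS interrupt service int 13h is used to reset the disk
controller so that the settings just made take effect.
Step 4
After resetting the disk controller, bootsect reads the setup program (setup.S),
which sits right behind bootsect on the disk, using function 2 of BIOS interrupt
service int 13h. The setup image is read to absolute address 0x90200, that is,
directly after bootsect in memory. Once the setup image is in memory, function 8
of int 13h is used to read the parameters of the current disk (in the listing
below this call is compiled out and the sectors-per-track value is probed instead).
Step 5
Now the real Linux kernel has to be read in, the "vmlinuz" you can see in the
root directory of a Linux system. Before reading it, function 3 of BIOS interrupt
service int 10h is called to read the cursor position, and then function 13h of
int 10h is called to print the string "Loading" on the screen. This string is the
first thing you see whenever Linux boots, so it should look familiar.
Step 6
Next the root device is checked, and then, in the same way as at the beginning,
an inter-segment jump transfers control to the setup code that was just read in.
Step 7
setup.S performs the version check in real mode, writes the hard disk, mouse and
memory parameters into INITSEG, and is responsible for entering protected mode.
Step 8
Initialization of the operating system.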
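The root-device check in step six can be sketched in C. This is a minimal sketch that mirrors the compare chain in the bootsect.S listing: 15 sectors per track selects device 0x0208, 18 selects 0x021c, 36 selects 0x0220, and anything else selects 0x0200. The function name choose_root_dev and its parameters are assumptions made for this illustration.

```c
#include <stdint.h>

/*
 * Pick the default root device the way the listing's compare chain does.
 * The value is (major << 8) | minor, with major 2 being the floppy driver.
 * Hypothetical helper; the real code does this inline with cmp/je.
 */
uint16_t choose_root_dev(uint16_t configured, unsigned sectors_per_track)
{
    if (configured != 0)
        return configured;          /* a root device was set at build time */

    switch (sectors_per_track) {
    case 15: return 0x0208;         /* 1.2 MB drive, minor 8   */
    case 18: return 0x021c;         /* 1.44 MB drive, minor 28 */
    case 36: return 0x0220;         /* 2.88 MB drive, minor 32 */
    default: return 0x0200;         /* /dev/fd0, autodetect    */
    }
}
```

A non-zero configured value always wins, which matches the "jne root_defined" test at the top of that part of the listing.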
From: seis (矛), Board: Linux
Subject: A detailed analysis of the Linux kernel boot code
Posted at: BBS 水木清华站 (Fri Feb 2 14:12:43 2001)
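! bootsect.s (c) 1991, 1992 Linus Torvalds
! modified by Drew Eckhardt
! modified by Bruce Evans (bde)
!
! bootsect.s is loaded at 0x7c00 (31k) by the BIOS startup routines, and
! moves itself out of the way to address 0x90000 (576k), and jumps there.
!
! bde - should not jump blindly, there may be systems with only 512k of low
! memory. Use int 0x12 to get the top of memory, etc.
!
! It then loads setup directly after itself (0x90200) (576.5k),
! and the system at 0x10000, using BIOS interrupts.
!
! NOTE! Currently the system is at most (8*65536-4096) (508k) bytes long. This
! should be no problem, even in the future. I want to keep it simple. This 508k
! kernel size should be enough, especially as this doesn't contain the buffer
! cache as in minix (and especially now that the kernel is compressed :-)
!
! The loader has been made as simple as possible, and continuous read errors
! will result in an unbreakable loop. Reboot by hand.
! It loads pretty fast by getting whole tracks at a time whenever possible.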
#include /* for CONFIG_ROOT_RDONLY */
!! config.h (i.e. autoconf.h) contains no definition of CONFIG_ROOT_RDONLY!?
#include
.text
SETUPSECS = 4 ! default nr of setup-sectors;
BOOTSEG = 0x7C0 ! original address of the boot sector;
INITSEG = DEF_INITSEG ! we move bootsect here (0x9000) - out of the way;
SETUPSEG = DEF_SETUPSEG ! setup starts here (0x9020);
SYSSEG = DEF_SYSSEG ! system loaded at segment 0x1000 (linear 0x10000 = 65536, 64k);
SYSSIZE = DEF_SYSSIZE ! system size (0x7F00): number of 16-byte units to be loaded;
!! The four DEF_ parameters above are defined in boot.h:
!! DEF_INITSEG 0x9000
!! DEF_SYSSEG 0x1000
!! DEF_SETUPSEG 0x9020
!! DEF_SYSSIZE 0x7F00 (=32512=31.75k)*16=508k
! ROOT_DEV & SWAP_DEV are now written by "build";
ROOT_DEV = 0
SWAP_DEV = 0
#ifndef SVGA_MODE
#define SVGA_MODE ASK_VGA
#endif
#ifndef RAMDISK
#define RAMDISK 0
#endif
#ifndef CONFIG_ROOT_RDONLY
#define CONFIG_ROOT_RDONLY 1
#endif
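! ld86 requires an entry symbol. This may as well be the usual one;
.globl _main
_main:
#if 0 /* hook for debugger, harmless unless BIOS is fussy (e.g. old HP machines) */
int 3
#endif
mov ax,#BOOTSEG !! set ds to 0x7C0;
mov ds,ax
mov ax,#INITSEG !! set es to 0x9000;
mov es,ax
mov cx,#256 !! cx = 256 (256 words, i.e. 512 bytes, to move);
sub si,si !! source address ds:si = 0x07C0:0x0000;
sub di,di !! destination address es:di = 0x9000:0x0000;
cld !! clear the direction flag;
rep !! move this code from 0x7C0:0 (31k) to 0x9000:0 (576k);
movsw !! 256 words in total (512 bytes, 0x200 long);
jmpi go,INITSEG !! inter-segment jump to label go in the relocated copy;
! ax and es already contain INITSEG (0x9000);
go: mov di,#0x4000-12 ! 0x4000 (16k) is an arbitrary value >= length of
! bootsect + length of setup + room for the stack;
! 12 is the disk parameter block size; es:di = 0x94000-12 = 592k-12;
! bde - changed 0xff00 to 0x4000 to use the debugger at 0x6400 up (bde). We
! wouldn't have to worry about this if we checked the top of memory. Also,
! my BIOS can be configured to put the wini drive tables in high memory
! instead of in the vector table. The old stack might have clobbered the
! drive table;
mov ds,ax ! ds = 0x9000;
mov ss,ax ! ss = 0x9000;
mov sp,di ! put the stack at INITSEG:0x4000-12;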
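/*
* Many BIOS's default disk parameter tables will not
* recognize multi-sector reads beyond the maximum sector number
* specified in the default diskette parameter tables - this may
* mean 7 sectors in some cases.
*
* Since single sector reads are slow and out of the question,
* we must take care of this by creating new parameter tables
* (for the first disk) in RAM. We will set the maximum sector
* count to 36 - the most we will encounter on an ED 2.88 drive.
*
* High doesn't hurt. Low does.
*
* Segments are as follows: ds=es=ss=cs - INITSEG (=0x9000),
* fs = 0, gs is unused.
*/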
! cx is 0 after the rep above;
mov fs,cx !! fs = 0;
mov bx,#0x78 ! fs:bx is the parameter table address;
push ds
seg fs
lds si,(bx) ! ds:si is the source;
!! load the far pointer stored at fs:bx into ds:si;
mov cl,#6 ! copy 12 bytes to 0x9000:0x4000-12;
cld
push di !! pointer 0x9000:0x4000-12;
rep
movsw
pop di !! di again points to 0x9000:0x4000-12 (start of the table);
pop si !! the saved ds value => si = INITSEG (=0x9000);
movb 4(di),*36 ! patch the sector count;
seg fs
mov (bx),di !! point the table address at fs:bx (0000:0x0078) to 0x9000:0x4000-12;
seg fs
mov 2(bx),es
! load the setup-sectors directly after the bootblock. !! starting at 0x90200;
! Note that 'es' is already set up.
! Also cx is 0 from the rep movsw above
load_setup:
xor ah,ah ! reset the floppy controller (FDC);
xor dl,dl
int 0x13
xor dx,dx ! drive 0, head 0;
mov cl,#0x02 ! sector 2, track 0;
mov bx,#0x0200 ! data buffer address = es:bx = 0x9000:0x200;
! inside INITSEG, i.e. at 0x90200;
mov ah,#0x02 ! service 2 (read);
mov al,setup_sects ! number of sectors to read, SETUPSECS=4;
! (assume all of them are on head 0, track 0);
int 0x13 ! read it;
jnc ok_load_setup ! ok - continue;
push ax ! otherwise dump the error code; ah holds the BIOS status;
call print_nl !! print a newline;
mov bp,sp !! bp is the argument for print_hex;
call print_hex !! print the word bp points to;
pop ax
jmp load_setup !! retry!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!INT 13 - DISK - READ SECTOR(S) INTO MEMORY
!! AH = 02h
!! AL = number of sectors to read (must be nonzero)
!! CH = low eight bits of cylinder number
!! CL = sector number 1-63 (bits 0-5)
!! high two bits of cylinder (bits 6-7, hard disk only)
!! DH = head number
!! DL = drive number (bit 7 set for hard disk)
!! ES:BX -> data buffer
!! Return: CF set on error
!! if AH = 11h (corrected ECC error), AL = burst length
!! CF clear if successful
!! AH = status (see #00234)
!! AL = number of sectors transferred (only valid if CF set for some
!! BIOSes)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
ok_load_setup:
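! Get disk drive parameters, specifically nr of sectors/track;
#if 0
! bde - the Phoenix BIOS manual says function 0x08 only works for fixed
! disks. It doesn't work for one of my BIOS's (1987 Award).
! It was fatal not to check the error code.
xor dl,dl
mov ah,#0x08 ! AH=8 is get drive parameters;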
int 0x13
xor ch,ch
!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! INT 13 - DISK - GET DRIVE PARAMETERS (PC,XT286,CONV,PS,ESDI,SCSI)
!! AH = 08h
!! DL = drive (bit 7 set for hard disk)
!!Return: CF set on error
!! AH = status (07h) (see #00234)
!! CF clear if successful
!! AH = 00h
!! AL = 00h on at least some BIOSes
!! BL = drive type (AT/PS2 floppies only) (see #00242)
!! CH = low eight bits of maximum cylinder number
!! CL = maximum sector number (bits 5-0)
!! high two bits of maximum cylinder number (bits 7-6)
!! DH = maximum head number
!! DL = number of drives
!! ES:DI -> drive parameter table (floppies only)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!
#else
! It seems that there is no BIOS call to get the number of sectors. Guess
! 36 sectors if sector 36 can be read, 18 sectors if sector 18 can be read,
! 15 if sector 15 can be read. Otherwise guess 9. [36, 18, 15, 9]
mov si,#disksizes ! ds:si -> table of sizes to try;
probe_loop:
lodsb !! byte at ds:si => al, si = si+1;
cbw ! extend to word;
mov sectors, ax ! the first value is 36, the last is 9;
cmp si,#disksizes+4
jae got_sectors ! if all else fails, try 9;
xchg ax,cx ! cx = track and sector (36 = 0x0024 the first time);
xor dx,dx ! drive 0, head 0;
xor bl,bl !! set the buffer es:bx = 0x9000:0x0a00 (578.5k);
mov bh,setup_sects !! setup_sects = 4 (2k in total);
inc bh
shl bh,#1 ! address after setup (es=cs);
mov ax,#0x0201 ! service 2 (read), 1 sector;
int 0x13
jc probe_loop ! if that failed, try the next value;
#endif
got_sectors:
! Restore es
mov ax,#INITSEG
mov es,ax ! es = 0x9000;
! Print some inane message (a newline, then "Loading")
mov ah,#0x03 ! read cursor position;
xor bh,bh
int 0x10
mov cx,#9
mov bx,#0x0007 ! page 0, attribute 7 (normal);
mov bp,#msg1
mov ax,#0x1301 ! write string, move cursor;
int 0x10
! ok, we've written the message, now
! we want to load the system (at 0x10000) (64k)
mov ax,#SYSSEG
mov es,ax ! es = segment 0x1000 (linear 0x10000);
call read_it !! read the system; es is the input parameter;
call kill_motor !! turn off the floppy drive motor;
call print_nl !! print CR LF;
! After that we check which root-device to use. If the device is defined (!= 0),
! nothing is done and the given device is used. Otherwise, one of
! /dev/fd0H2880 (2,32), /dev/PS0 (2,28) or /dev/at0 (2,8) is used,
! depending on the number of sectors we pretend to know we have.
!! |__ ps0?? (x,y) - presumably the major and minor device numbers?
seg cs
mov ax,root_dev
or ax,ax
jne root_defined
seg cs
mov bx,sectors !! sectors = sectors per track;
mov ax,#0x0208 ! /dev/ps0 - 1.2Mb;
cmp bx,#15
je root_defined
mov al,#0x1c ! /dev/PS0 - 1.44Mb !! 0x1C = 28;
cmp bx,#18
je root_defined
mov al,#0x20 ! /dev/fd0H2880 - 2.88Mb;
cmp bx,#36
je root_defined
mov al,#0 ! /dev/fd0 - autodetect;
root_defined:
seg cs
mov root_dev,ax !! stores the major and minor device numbers;
! after that (everything loaded), we jump to
! the setup routine loaded directly after the bootblock:
jmpi 0,SETUPSEG !! jump to 0x9020:0000 (the start of setup);
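! This routine loads the system at address 0x10000 (64k),
! making sure no 64kB boundaries are crossed. We try to load it
! as fast as possible, loading whole tracks whenever we can.
!
! in: es - starting address segment (normally 0x1000)
!
sread: .word 0 ! sectors read of the current track;
head: .word 0 ! current head;
track: .word 0 ! current track;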
read_it:
mov al,setup_sects
inc al
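mov sread,al !! sread = 5 now (boot sector plus 4 setup sectors);
mov ax,es !! es = 0x1000;
test ax,#0x0fff !! (ax AND 0x0fff; the zero flag is set when ax = 0x1000);
die: jne die ! es must be at a 64kB boundary;
xor bx,bx ! bx is the starting address within the segment;
rp_read:
#ifdef __BIG_KERNEL__
#define CALL_HIGHLOAD_KLUDGE .word 0x1eff, 0x220 ! call far * bootsect_kludge
! NOTE: as86 can't assemble this;
CALL_HIGHLOAD_KLUDGE ! this is within setup.S;
#else
mov ax,es
sub ax,#SYSSEG ! current es minus the segment where the system load started (0x1000);
#endif
cmp ax,syssize ! have we loaded everything yet? (syssize = 0x7f00);
jbe ok1_read !! if ax <= syssize, keep reading;
ret !! everything is loaded, return;
ok1_read:
mov ax,sectors !! sectors = sectors per track;
sub ax,sread !! minus the sectors already read; al = unread sectors on this track (ah=0);
mov cx,ax
shl cx,#9 !! times 512: cx = unread bytes on this track;
add cx,bx !! add the offset within the segment (es:bx is the current buffer address);
jnc ok2_read !! no overflow past 64K: read them all;
je ok2_read !! exactly 64K: read them all as well;
!! otherwise ax = (64K - bx) / 512, the number of sectors that still fit in this segment;
xor ax,ax
sub ax,bx
shr ax,#9
ok2_read:
call read_track !! es:bx -> buffer, al = number of sectors to read;
mov cx,ax !! ax is preserved by read_track: the number of sectors just read;
add ax,sread !! ax = sectors read so far on this track;
cmp ax,sectors !! has the whole track been read?
jne ok3_read !! no: skip ahead;
mov ax,#1
sub ax,head !! is the current head 1?
jne ok4_read !! no (it is head 0): jump, with ax = 1;
inc track !! it was head 1: advance to the next track;
ok4_read:
mov head,ax !! store the new head number;
xor ax,ax !! reset the count of sectors read on this track;
ok3_read:
mov sread,ax !! store the sectors read on this track;
shl cx,#9 !! sectors read by the last call * 512;
add bx,cx !! advance the buffer offset;
jnc rp_read !! offset still inside the 64K segment: continue reading;
mov ax,es !! otherwise add 0x1000 to the segment (the next 64K segment);
add ah,#0x10
mov es,ax
xor bx,bx !! reset the offset within the segment;
jmp rp_read !! continue reading;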
read_track:
pusha !! push ax,cx,dx,bx,sp,bp,si,di;
pusha
mov ax,#0xe2e ! loading... message 2e = . !! print one '.'
mov bx,#7
int 0x10
popa
mov dx,track !! track = current track;
mov cx,sread
inc cx !! cl = sector number of the first sector to read;
mov ch,dl !! ch = low 8 bits of the track number;
mov dx,head !!
mov dh,dl !! dh = current head;
and dx,#0x0100 !! dl = drive number (0);
mov ah,#2 !! service 2 (read); es:bx points to the data buffer;
push dx ! save the registers on the stack for the error dump;
push cx
push bx
push ax
int 0x13
jc bad_rt !! jump on error;
add sp, #8 !! discard the 4 register values just pushed;
popa
ret
bad_rt: push ax ! save the error code;
call print_all ! ah = error, al = read;
xor ah,ah
xor dl,dl
int 0x13
add sp,#10
popa
jmp read_track
/*
* print_all is for debugging purposes.
* It will print out all of the registers. The assumption is that it is
* called from a routine, with a stack frame like
* dx
* cx
* bx
* ax
* error
* ret <- sp
*
*/
print_all:
mov cx,#5 ! error code + 4 registers
mov bp,sp
print_loop:
push cx ! save the count left
call print_nl ! newline for readability
cmp cl, #5
jae no_reg ! see if a register name is needed
mov ax,#0xe05 + 'A - 1
sub al,cl
int 0x10
mov al,#'X
int 0x10
mov al,#':
int 0x10
no_reg:
add bp,#2 ! next register
call print_hex ! print it
pop cx
loop print_loop
ret
print_nl: !! print CR LF.
mov ax,#0xe0d ! CR
int 0x10
mov al,#0xa ! LF
int 0x10
ret
/*
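* print_hex is for debugging purposes, and prints the word
* pointed to by ss:bp in hexadecimal.
* !! e.g. for 0x4321, al takes the values 4, 3, 2, 1 in turn and the interrupt call prints 4321
*/
print_hex:
mov cx, #4 ! 4 hex digits
mov dx, (bp) ! load the word at (bp) into dx
print_digit:
rol dx, #4 ! rotate so that the lowest 4 bits are used !! brings the top 4 bits of dx down to the low 4 bits.
mov ax, #0xe0f ! ah = request, al = mask for a nybble (4 bits).
and al, dl !! keep the low 4 bits of dl.
add al, #0x90 ! convert al to ASCII hex (four instructions)
daa !! decimal adjust
adc al, #0x40 !! (adc dest, src ==> dest := dest + src + c )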
daa
int 0x10
loop print_digit
ret
/*
* This procedure turns off the floppy drive motor, so
* that we enter the kernel in a known state, and
* don't have to worry about it later.
*/
kill_motor:
push dx
mov dx,#0x3f2
xor al,al
outb
pop dx
ret
!! data area
sectors:
.word 0 !! current sectors per track (36, 18, 15 or 9)
disksizes: !! table of sectors-per-track values to try
.byte 36, 18, 15, 9
msg1:
.byte 13, 10
.ascii "Loading"
.org 497 !! continue at byte 497 of the boot program binary
setup_sects:
.byte SETUPSECS
root_flags:
.word CONFIG_ROOT_RDONLY
syssize:
.word SYSSIZE
swap_dev:
.word SWAP_DEV
ram_size:
.word RAMDISK
vid_mode:
.word SVGA_MODE
root_dev:
.word ROOT_DEV
boot_flag: !! boot sector signature
.word 0xAA55
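The trickiest part of read_it above is deciding how many sectors to request per BIOS call so that a transfer never crosses a 64 KB boundary. The following C function is a minimal sketch of that calculation, assuming 512-byte sectors. The name sectors_this_read and its parameters are made up for illustration; the assembly does the same job with the carry and zero flags after "add cx,bx".

```c
#include <stdint.h>

/*
 * How many sectors can be read in one call without the transfer running
 * past the end of the current 64 KB segment?
 *   sectors_per_track : value found by the probe (36, 18, 15 or 9)
 *   sread             : sectors of the current track already read
 *   bx                : current offset inside the 64 KB segment
 * Hypothetical helper, for illustration only.
 */
unsigned sectors_this_read(unsigned sectors_per_track, unsigned sread,
                           uint16_t bx)
{
    unsigned remaining = sectors_per_track - sread;     /* rest of the track */
    uint32_t end = (uint32_t)bx + (uint32_t)remaining * 512u;

    if (end <= 0x10000u)        /* fits, or ends exactly on the boundary */
        return remaining;

    /* Otherwise read only what still fits below the boundary. */
    return (0x10000u - bx) / 512u;
}
```

With 18 sectors per track and the first 5 sectors already consumed by bootsect and setup, the first call reads 13 sectors. Near the end of a segment, for example at offset 0xF000, only 8 sectors fit and the rest of the track is read after es has been advanced by 0x1000.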
setup.S
A summary of the setup.S code. The slight differences in the operation of setup.S caused by a big kernel are documented here. The code32_start address, used when the switch to 32-bit protected mode begins, is defined here; for a big kernel it is 0x100000, the address the kernel is loaded at.
code32_start:
#ifndef __BIG_KERNEL__
.long 0x1000
#else
.long 0x100000
#endif
After setting the keyboard repeat rate to the maximum, calling video.S, storing the video parameters, and checking for hard disks, a PS/2 mouse and the APM BIOS, the preparation for the switch from real mode to protected mode begins.
Interrupts are disabled. Since the loader may have changed the code32_start address, the code32 variable is updated. It is used by the jmpi instruction when setup.S finally jumps to compressed/head.S. For a big kernel this is located at 0x100000.
seg cs
mov eax, code32_start !modified above by the loader
seg cs
mov code32,eax
!code32 contains the correct address to branch to after setup.S finishes
After the above code there is a slight difference in the way big and small kernels are handled. A small kernel is moved down to segment address 0x100, but a big kernel is not moved: before decompression it stays at 0x100000. The following code performs this check.
test byte ptr loadflags,#LOADED_HIGH
jz do_move0 ! a normal low loaded zImage is moved
jmp end_move ! skip move
The interrupt and global descriptor table registers are loaded:
lidt idt_48 ! load idt with 0,0
lgdt gdt_48 ! load gdt with whatever appropriate
After enabling A20 and reprogramming the interrupts, it is ready to set the PE bit:
mov ax,#1
lmsw ax
jmp flush_instr
flush_instr:
xor bx,bx !flag to indicate a boot
! Manual, mixing of 16-bit and 32 bit code
db 0x66,0xea !prefix jmpi-opcode
code32: dd 0x1000 !this has been reset in case of a big kernel, to 0x100000
dw __KERNEL_CS
Finally it prepares the opcode for jumping to compressed/head.S, which for a big kernel is at 0x100000. For a small kernel the compressed kernel starts at 0x1000.
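The hand-assembled far jump at the end of setup.S can be illustrated in C. This is a minimal sketch, assuming that LOADED_HIGH is bit 0 of loadflags and that the kernel code selector is 0x10; neither value is stated in the text above, and the function names are made up for illustration. It shows how the entry point is chosen and which 8 bytes the "db 0x66,0xea / dd / dw" sequence produces.

```c
#include <stddef.h>
#include <stdint.h>

#define LOADED_HIGH 0x01u   /* assumption: bit 0 of loadflags */

/* Entry point of the 32-bit code: 1 MB for a big kernel, 0x1000 otherwise. */
uint32_t code32_target(uint8_t loadflags)
{
    return (loadflags & LOADED_HIGH) ? 0x100000u : 0x1000u;
}

/*
 * Encode "operand-size prefix + far jump" as setup.S spells it out by hand:
 * 0x66, 0xEA, a 32-bit little-endian offset, a 16-bit little-endian selector.
 * Returns the number of bytes written (always 8).
 */
size_t encode_far_jump32(uint8_t out[8], uint32_t offset, uint16_t selector)
{
    out[0] = 0x66;                              /* 32-bit operand size prefix */
    out[1] = 0xEA;                              /* jmp far ptr16:32           */
    out[2] = (uint8_t)(offset & 0xFF);
    out[3] = (uint8_t)((offset >> 8) & 0xFF);
    out[4] = (uint8_t)((offset >> 16) & 0xFF);
    out[5] = (uint8_t)((offset >> 24) & 0xFF);
    out[6] = (uint8_t)(selector & 0xFF);
    out[7] = (uint8_t)(selector >> 8);
    return 8;
}
```

The prefix is needed because the jump is executed from 16-bit code; without 0x66 the processor would expect a 16-bit offset after the 0xEA opcode.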
compressed/head.S
setup.S relinquishes control to compressed/head.S at the beginning of the compressed kernel, at 0x100000. head.S checks whether A20 is really enabled; if it is not, it loops forever.
It initializes eflags and clears the BSS (Block Started by Symbol), creating the reserved space for uninitialized static or global variables. Finally it reserves room for the moveparams structure (defined in misc.c), pushes the current stack pointer on the stack, and calls the C function decompress_kernel, which takes a struct moveparams * as an argument
subl $16,%esp
pushl %esp
call SYMBOL_NAME(decompress_kernel)
orl %eax,%eax
jnz 3f
The C function decompress_kernel returns the variable high_loaded, which is set to 1 in the function setup_output_buffer_if_we_run_high; that function is called from decompress_kernel if a big kernel was loaded.
When decompress_kernel returns non-zero, control jumps to 3f, which relocates the move routine.
movl $move_routine_start,%esi ! puts the offset of the start of the source in the source index register
mov $0x1000,%edi ! the destination index now contains 0x1000, thus after the move, the move routine starts at 0x1000
movl $move_routine_end,%ecx
sub %esi,%ecx ! ecx register now contains the number of bytes to be moved
! (number of bytes between the labels move_routine_start and move_routine_end)
cld
rep
movsb ! moves the bytes from ds:esi to es:edi; each iteration increments esi and edi and decrements ecx
! the movs instruction repeats until ecx is zero
Thus the movsb instruction moves the bytes of the move routine between the labels move_routine_start and move_routine_end. At the end, the entire move routine labeled move_routine_start is at 0x1000. The movsb instruction moves bytes from ds:esi to es:edi.
At the start of the head.S code es, ds, fs and gs were all initialized to __KERNEL_DS, which is defined in /usr/src/linux/include/asm/segment.h as 0x18. This is the offset into the global descriptor table (gdt) that was set up in setup.S. Byte offset 24 (0x18) is the start of the data segment descriptor, which has base address 0. Thus the move routine is moved and starts at offset 0x1000 from __KERNEL_DS, the kernel data segment base (which is 0).
The salient features of decompress_kernel are discussed in the next section, but it is worth noting that when decompress_kernel is invoked, space has been created at the top of the stack to hold information about the decompressed kernel. A big decompressed kernel may be split between the high buffer and the low buffer. After decompress_kernel returns, the decompressed kernel has to be moved so that we have a contiguous decompressed kernel starting at address 0x100000.
To move the decompressed kernel, the important parameters are the start addresses of the high and low buffers and the number of bytes in each. These are at the top of the stack when decompress_kernel returns (the top of the stack was passed as an argument, struct moveparams *, and inside the function the fields of the moveparams structure were adjusted to reflect the state of the decompression).
/* in compressed/misc.c */
struct moveparams {
uch *low_buffer_start; /* start address of the low buffer */
int lcount; /* number of bytes in the low buffer after decompression is done */
uch *high_buffer_start; /* start address of the high buffer */
int hcount; /* number of bytes in the high buffer after decompression is done */
};
Thus when decompress_kernel returns, the relevant values are popped into their respective registers as shown below. After preparing these registers the decompressed kernel is ready to be moved, and control jumps to the relocated move routine at __KERNEL_CS:0x1000. The code that sets the registers is given below:
popl %esi ! discard the address of the structure that was pushed as the argument
popl %esi ! low_buffer_start
popl %ecx ! lcount
popl ?? ! high_buffer_start
popl ?? ! hcount
movl $0x100000,%edi
cli ! disable interrupts while the decompressed kernel is being moved
ljmp $(__KERNEL_CS), $0x1000 ! jump to the move routine which was moved to low memory, 0x1000
The move routine basically has two parts: first it moves the part of the decompressed kernel that is in the low buffer, then it moves (if required) the contents of the high buffer. Note that ecx has been initialized to the number of bytes in the low buffer, and the destination index register edi has been initialized to 0x100000.
move_routine_start:
rep ! repeat, it stops repeating when ecx == 0
movsb ! the movsb instruction repeats till ecx is 0. In each iteration a byte is transferred from ds:esi to es:edi
! In each iteration edi and esi are incremented and ecx is decremented
! when the low buffer has been moved, the value of edi is left as it is and the next part of the code
! uses it to transfer the bytes from the high buffer
movl ??,%esi ! esi now has the offset corresponding to the start of the high buffer
movl ??,%ecx ! ecx is now initialized to the number of bytes in the high buffer
rep
movsb ! moves all the bytes in the high buffer, and doesn't move anything if hcount was zero (if it was determined
! in close_output_buffer_if_we_run_high that the high buffer need not be moved down)
xorl ??,??
mov $0x90000, %esp ! stack pointer is adjusted, most probably to be used by the kernel during initialization
ljmp $(__KERNEL_CS), $0x100000 ! jump to __KERNEL_CS:0x100000, where the kernel code starts
move_routine_end:
At the end of this, control goes to the kernel code segment.
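The two-stage copy performed by the move routine can be sketched in C. This is a minimal sketch that operates on ordinary buffers instead of physical addresses; the structure moveparams_sketch and the function move_kernel are hypothetical names for this illustration, and memmove stands in for the forward "rep movsb" copy.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for struct moveparams from compressed/misc.c. */
struct moveparams_sketch {
    const unsigned char *low_buffer_start;
    int lcount;                         /* bytes in the low buffer  */
    const unsigned char *high_buffer_start;
    int hcount;                         /* bytes in the high buffer */
};

/*
 * Copy the low part to dest, then append the high part if hcount > 0.
 * Returns a pointer just past the last byte written. In the real routine
 * dest is the fixed address 0x100000.
 */
unsigned char *move_kernel(unsigned char *dest,
                           const struct moveparams_sketch *mv)
{
    memmove(dest, mv->low_buffer_start, (size_t)mv->lcount);
    dest += mv->lcount;

    if (mv->hcount > 0) {               /* 0 means: already in place */
        memmove(dest, mv->high_buffer_start, (size_t)mv->hcount);
        dest += mv->hcount;
    }
    return dest;
}
```

When hcount is 0 the second copy does nothing, just as "rep movsb" with ecx equal to 0 copies no bytes.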
Linux Assembly code taken from head.S and setup.S
Comment code added by us
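The Makefile in arch/i386 defines
HEAD := arch/i386/kernel/head.o
and the top-level Linux Makefile contains the statement
include arch/$(ARCH)/Makefile
which means that the definition of HEAD is in effect in the top-level Makefile.
It then contains the following rule:
vmlinux: $(CONFIGURATION) init/main.o init/version.o linuxsubdirs
$(LD) $(LINKFLAGS) $(HEAD) init/main.o init/version.o \
$(ARCHIVES) \
$(FILESYSTEMS) \
$(DRIVERS) \
$(LIBS) -o vmlinux
$(NM) vmlinux | grep -v '\(compiled\)\|\(\.o$$\)\|\( a \)' | sort > System.map
We can learn a great deal from this dependency:
1>$(HEAD), i.e. head.o, really is the first object linked into the kernel
2>All file systems supported by the kernel are compiled into $(FILESYSTEMS), i.e. fs/filesystems.a
All network protocols supported by the kernel are compiled into net.a
All SCSI drivers supported by the kernel are compiled into scsi.a
...................
So the kernel is nothing more than a collection of libraries and object files. Anyone
interested in slimming the kernel down can compare them to see which part takes up the space.
3>System.map contains all the symbols exported by the kernel; these are roughly the
kernel functions we can call when writing kernel modules.
Good, with those doubts cleared up, we can analyze head.s in detail.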
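Analysis of Head.S
1 First, ds, es, fs and gs are pointed at the kernel data segment KERNEL_DS.
KERNEL_DS is defined in asm/segment.h and refers to entry 3 of the global
descriptor table (counting from 0, selector 0x18).
Note: the global descriptor table in effect at this point is not the one defined
in head.s; it is still the one defined in setup.S.
2 The uninitialized data area (BSS) is cleared.
3 setup_idt is a subroutine that points every entry of the interrupt table at the
ignore_int function, which prints: unknown interrupt
Of course such an interrupt handler cannot do anything useful.
4 Check whether address line A20 is enabled; loop and wait otherwise.
The A20 line is a historical leftover of the x86 that decides whether memory above 1M can be accessed.
5 Copy the boot parameters collected by setup.s into the first half of the page
at 0x5000, and the kernel command line into the second half.
6 Check the CPU type.
(The author did not work out what exactly this part does.)
7 Initialize the page tables; only the first few pages are initialized.
1>Clear swapper_pg_dir (0x2000) and pg0 (0x3000).
swapper_pg_dir serves as the page directory of the whole system.
2>Use pg0 as the first page table and store its address in the first 32-bit
word of swapper_pg_dir.
3>Store the same entry at byte offset 3072 of swapper_pg_dir (entry 768), so that
virtual address 0xc0000000 also maps through pg0.
4>Fill pg0 completely so that it maps the first 4M of memory.
5>Enable paging.
Note: until now the CPU was in protected mode but paging was not enabled.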
--------------------
| swapper_pg_dir | -----------
| |-------| pg0 |----------first 4M of memory
| | -----------
| |
--------------------
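8 Load the new GDT and IDT, and clear the LDT register.
9 Reload the segment registers ds, es, fs, gs.
10 Switch to the system stack, i.e. the page reserved at 0x6000.
11 Call start_kernel. This is the first function written in C,
and the kernel gets a fresh start from here.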
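Step 7 above can be sketched in C. This is a minimal sketch that fills two ordinary arrays instead of the real pages at 0x2000 and 0x3000. The function name init_boot_page_tables and the flag value 0x007 (present, read/write, user) are assumptions made for this illustration; the text above does not state which flag bits head.S uses.

```c
#include <stdint.h>

#define PT_ENTRIES   1024u
#define PAGE_BYTES   4096u
#define PTE_FLAGS    0x007u     /* assumption: present | read/write | user */

/*
 * pgdir    : stand-in for swapper_pg_dir (1024 entries)
 * pg0      : stand-in for the first page table (1024 entries)
 * pg0_phys : physical address of pg0 (0x3000 in the text)
 */
void init_boot_page_tables(uint32_t *pgdir, uint32_t *pg0, uint32_t pg0_phys)
{
    uint32_t i;

    for (i = 0; i < PT_ENTRIES; i++) {  /* 7.1: clear both pages */
        pgdir[i] = 0;
        pg0[i] = 0;
    }

    pgdir[0] = pg0_phys | PTE_FLAGS;                    /* 7.2: identity map */
    pgdir[0xC0000000u >> 22] = pg0_phys | PTE_FLAGS;    /* 7.3: entry 768    */

    for (i = 0; i < PT_ENTRIES; i++)    /* 7.4: map the first 4 MB */
        pg0[i] = (i * PAGE_BYTES) | PTE_FLAGS;
}
```

Entry 768 is at byte offset 768 * 4 = 3072 in the page directory, which is where the number 3072 in step 7.3 comes from. Mapping the same page table at entries 0 and 768 lets the kernel keep running at its physical addresses while it switches to addresses starting at 0xc0000000.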
int decompress_kernel(struct moveparams *mv)
{
if (SCREEN_INFO.orig_video_mode == 7) {
vidmem = (char *) 0xb0000;
vidport = 0x3b4;
} else {
vidmem = (char *) 0xb8000;
vidport = 0x3d4;
}
lines = SCREEN_INFO.orig_video_lines;
cols = SCREEN_INFO.orig_video_cols;
if (free_mem_ptr < 0x100000) setup_normal_output_buffer(); // called for a small kernel
else setup_output_buffer_if_we_run_high(mv); // called for a big kernel
makecrc();
puts("Uncompressing Linux... ");
gunzip();
puts("Ok, booting the kernel.\n");
if (high_loaded) close_output_buffer_if_we_run_high(mv);
return high_loaded;
}
The first place where a distinction is made is when the buffers are set up for the decompression routine gunzip(). free_mem_ptr is loaded with the address of the extern variable end, which marks the end of the compressed kernel image. If free_mem_ptr is below 0x100000, the image was loaded low (a small kernel) and setup_normal_output_buffer() is called; otherwise the image was loaded high and a high buffer has to be set up. In that case setup_output_buffer_if_we_run_high is called with a pointer to the moveparams structure on the stack, so that the buffer start addresses are recorded in the structure as the buffers are set up. The function also determines whether the high buffer needs to be moved down after decompression; this is reflected in hcount, which is 0 if the high buffer need not be moved.
void setup_output_buffer_if_we_run_high(struct moveparams *mv)
{
high_buffer_start = (uch *)(((ulg)&end) + HEAP_SIZE);
//the high buffer starts at end + HEAP_SIZE
#ifdef STANDARD_MEMORY_BIOS_CALL
if (EXT_MEM_K < (3*1024)) error("Less than 4MB of memory.\n");
#else
if ((ALT_MEM_K > EXT_MEM_K ? ALT_MEM_K : EXT_MEM_K) < (3*1024)) error("Less than 4MB of memory.\n");
#endif
mv->low_buffer_start = output_data = (char *)LOW_BUFFER_START;
//the low buffer starts at 0x2000 and extends up to 0x90000.
high_loaded = 1; //high_loaded is set to 1; this is returned by decompress_kernel
free_mem_end_ptr = (long)high_buffer_start;
// free_mem_end_ptr points to the same address as high_buffer_start
// the code below finds out if the high buffer needs to be moved after decompression.
// If end + HEAP_SIZE lies below 0x100000 + LOW_BUFFER_SIZE, high_buffer_start is raised
// to exactly 0x100000 + LOW_BUFFER_SIZE. The high part is then decompressed directly into
// its final position (the low part, at most LOW_BUFFER_SIZE bytes, is later moved to
// 0x100000 right below it), so hcount is set to 0: the high buffer need not be moved.
// Otherwise high_buffer_start is already above that address; hcount is set non-zero (-1)
// and, when the buffers are closed, the number of bytes in the high buffer is stored
// in the hcount field. Since the start address of the high buffer is known, those bytes
// can then be moved down
if ( (0x100000 + LOW_BUFFER_SIZE) > ((ulg)high_buffer_start)) {
high_buffer_start = (uch *)(0x100000 + LOW_BUFFER_SIZE);
mv->hcount = 0; /* say: we need not to move high_buffer */
}
else mv->hcount = -1;
mv->high_buffer_start = high_buffer_start;
// finally the high_buffer_start field is set to the variable high_buffer_start
}
After the buffers are set up, gunzip() is invoked, which decompresses the kernel. Upon return, bytes_out holds the number of bytes in the decompressed kernel. Finally close_output_buffer_if_we_run_high is invoked if high_loaded is non-zero:
void close_output_buffer_if_we_run_high(struct moveparams *mv)
{
mv->lcount = bytes_out;
// if all of the decompressed kernel is in the low buffer, lcount = bytes_out
if (bytes_out > LOW_BUFFER_SIZE) {
// if part of the decompressed kernel is in the high buffer, the lcount field is set to
// the size of the low buffer and the hcount field holds the remaining bytes
mv->lcount = LOW_BUFFER_SIZE;
if (mv->hcount) mv->hcount = bytes_out - LOW_BUFFER_SIZE;
// if the hcount field is non-zero (set in setup_output_buffer_if_we_run_high)
// then the high buffer has to be moved down and the number of bytes in the high buffer is
// stored in hcount
}
else mv->hcount = 0; // all the data is in the low buffer
}
Thus at the end of decompress_kernel the top of the stack holds the addresses of the buffers and their sizes, which are popped into the appropriate registers so that the move routine can move the entire kernel. After the move, the kernel resides at 0x100000. If a small kernel is being decompressed, setup_normal_output_buffer() is invoked from decompress_kernel instead; it just initializes output_data to 0x100000, where the decompressed kernel will lie. The variable high_loaded is still 0, as setup_output_buffer_if_we_run_high() is not invoked, and decompression writes directly to address 0x100000. Because high_loaded is 0, eax contains zero when decompress_kernel returns to head.S, so control jumps directly to 0x100000: the decompressed kernel is already there and the move routine need not be called.
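The bookkeeping done by close_output_buffer_if_we_run_high can be restated as a small pure function. This is a minimal sketch, assuming LOW_BUFFER_SIZE is 0x90000 - 0x2000 as implied by the buffer range quoted in the comments above; the type buffer_counts and the function split_output are hypothetical names for this illustration.

```c
/* Assumption: the low buffer spans 0x2000..0x90000, as described above. */
#define LOW_BUFFER_SIZE_SKETCH (0x90000L - 0x2000L)

struct buffer_counts {
    long lcount;    /* bytes to move from the low buffer  */
    long hcount;    /* bytes to move from the high buffer */
};

/*
 * bytes_out       : total size of the decompressed kernel
 * high_needs_move : non-zero if the setup step left hcount non-zero,
 *                   i.e. the high buffer is not yet in its final place
 */
struct buffer_counts split_output(long bytes_out, int high_needs_move)
{
    struct buffer_counts c;

    c.lcount = bytes_out;       /* everything fits in the low buffer */
    c.hcount = 0;

    if (bytes_out > LOW_BUFFER_SIZE_SKETCH) {
        c.lcount = LOW_BUFFER_SIZE_SKETCH;
        if (high_needs_move)
            c.hcount = bytes_out - LOW_BUFFER_SIZE_SKETCH;
    }
    return c;
}
```

For a 1 MB decompressed kernel whose high part has to be moved, this gives lcount = 0x8E000 and hcount = 0x72000. If the high part was decompressed in place, hcount stays 0 and only the low part is copied.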
Linux code taken from misc.c
Comment code added by us
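1) Linux's initial kernel image is stored inside zImage or bzImage in gzip-compressed form; the kernel's bootstrap code decompresses it to the start of memory at 1M. During kernel initialization, if a compressed initrd image has been loaded, the kernel decompresses it into the RAM disk. Both decompression paths use the file lib/inflate.c.
2) inflate.c was split out of the gzip sources. It contains direct references to some global data, so it has to be included textually in the code that uses it. When compressing, gzip always searches for repeated strings within the preceding 32K bytes, so decompression needs a buffer of at least 32K bytes, defined as window[WSIZE]. inflate.c reads the input with get_byte(), which is defined as a macro for efficiency. The input buffer index must be named inptr, because inflate.c decrements it. inflate.c calls flush_window() to output the decompressed bytes held in the window buffer; the length of each output is given by the variable outcnt. flush_window() must also compute the CRC of the output bytes and update the crc variable. Before gunzip() is called to start decompression, makecrc() must be called to initialize the CRC table. Finally, gunzip() returns 0 if decompression succeeded.
3) zImage and bzImage consist of two parts: the 16-bit boot code and the 32-bit self-decompressing kernel image. For zImage, the self-decompressing image is loaded at physical address 0x1000 and the kernel is decompressed to 1M. For bzImage, the self-decompressing image is loaded at 1M and the kernel is decompressed into two pieces: one in the physical range 0x2000-0x90000, the other above the high-loaded compressed image, at a distance from 1M of at least the maximum length of the low piece. After decompression, the two pieces are merged at 1M.
Code that decompresses the root RAM disk image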
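The CRC handling described in point 2 can be tried out in isolation. This is a minimal sketch of the standard gzip CRC-32 with names chosen for this illustration (make_crc_table, crc_update). It builds the table bit by bit from the reflected polynomial 0xEDB88320, which is a common textbook formulation and not necessarily how the kernel's makecrc() is written; the update step is the same table lookup that flush_window() performs.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];

/* Build the 256-entry lookup table for the reflected CRC-32 polynomial. */
void make_crc_table(void)
{
    uint32_t n;
    int k;

    for (n = 0; n < 256; n++) {
        uint32_t c = n;
        for (k = 0; k < 8; k++)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[n] = c;
    }
}

/*
 * Feed len bytes into a running CRC register. The register starts at
 * 0xffffffff and is XORed with 0xffffffff at the very end to obtain the
 * value stored in a gzip trailer.
 */
uint32_t crc_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        crc = crc_table[(crc ^ buf[i]) & 0xffu] ^ (crc >> 8);
    return crc;
}
```

Because the register is carried from one call to the next, the output can be processed window by window, which is what flush_window() relies on: feeding the data in several chunks gives the same result as feeding it in one call.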
--------------------------
; drivers/block/rd.c
#ifdef BUILD_CRAMDISK
/*
* gzip declarations
*/
#define OF(args) args /* macro used in function prototype declarations */
#ifndef memzero
#define memzero(s, n) memset ((s), 0, (n))
#endif
typedef unsigned char uch; /* the three data types used by inflate.c */
typedef unsigned short ush;
typedef unsigned long ulg;
#define INBUFSIZ 4096 /* size of the caller's input buffer */
#define WSIZE 0x8000 /* window size--must be a power of two, and */
/* at least 32K for zip's deflate method */
static uch *inbuf; /* caller's input buffer, not referenced by inflate.c */
static uch *window; /* decompression window */
static unsigned insize; /* valid bytes in inbuf */
static unsigned inptr; /* index of next byte to be processed in inbuf */
static unsigned outcnt; /* bytes in output buffer */
static int exit_code;
static long bytes_out; /* total decompressed output length, not referenced by inflate.c */
static struct file *crd_infp, *crd_outfp;
#define get_byte() (inptr < insize ? inbuf[inptr++] : fill_inbuf())
/* Diagnostic functions (stubbed out) */
#define Assert(cond,msg)
#define Trace(x)
#define Tracev(x)
#define Tracevv(x)
#define Tracec(c,x)
#define Tracecv(c,x)
#define STATIC static
static int fill_inbuf(void);
static void flush_window(void);
static void *malloc(int size);
static void free(void *where);
static void error(char *m);
static void gzip_mark(void **);
static void gzip_release(void **);
#include "../../lib/inflate.c"
static void __init *malloc(int size)
{
return kmalloc(size, GFP_KERNEL);
}
static void __init free(void *where)
{
kfree(where);
}
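static void __init gzip_mark(void **ptr)
{
/* take a mark of the caller's allocation state (nothing to do here) */
}
static void __init gzip_release(void **ptr)
{
/* release the caller's mark (nothing to do here) */
}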
/* ===========================================================================
* Fill the input buffer. This is called only when the buffer is empty
* and at least one byte is really needed.
*/
static int __init fill_inbuf(void) /* refill the input buffer */
{
if (exit_code) return -1;
insize = crd_infp->f_op->read(crd_infp, inbuf, INBUFSIZ,
if (insize == 0) return -1;
inptr = 1;
return inbuf[0];
}
/* ===========================================================================
* Write the output window window[0..outcnt-1] and update crc and bytes_out.
* (Used for the decompressed data only.)
*/
static void __init flush_window(void) /* output the outcnt bytes held in the window buffer */
{
ulg c = crc; /* temporary variable */
unsigned n;
uch *in, ch;
crd_outfp->f_op->write(crd_outfp, window, outcnt,
in = window;
for (n = 0; n < outcnt; n++) {
ch = *in++;
c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8); /* CRC of the output bytes */
}
crc = c;
bytes_out += (ulg)outcnt; /* update the running byte total */
outcnt = 0;
}
static void __init error(char *x) /* called when decompression fails */
{
printk(KERN_ERR "%s", x);
exit_code = 1;
}
static int __init
crd_load(struct file * fp, struct file *outfp)
{
int result;
insize = 0; /* valid bytes in inbuf */
inptr = 0; /* index of next byte to be processed in inbuf */
outcnt = 0; /* bytes in output buffer */
exit_code = 0;
bytes_out = 0;
crc = (ulg)0xffffffffL; /* shift register contents */
crd_infp = fp;
crd_outfp = outfp;
inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
if (inbuf == 0) {
printk(KERN_ERR "RAMDISK: Couldn't allocate gzip buffer\n");
return -1;
}
window = kmalloc(WSIZE, GFP_KERNEL);
if (window == 0) {
printk(KERN_ERR "RAMDISK: Couldn't allocate gzip window\n");
kfree(inbuf);
return -1;
}
makecrc();
result = gunzip();
kfree(inbuf);
kfree(window);
return result;
}
#endif /* BUILD_CRAMDISK */
32-bit kernel self-decompression code
-------------------------------------
; arch/i386/boot/compressed/head.S
.text
#include <linux/linkage.h>
#include <asm/segment.h>
.globl startup_32 # entry address: 0x1000 for zImage, 0x100000 for bzImage
startup_32:
cld
cli
movl $(__KERNEL_DS),%eax
movl %eax,%ds
movl %eax,%es
movl %eax,%fs
movl %eax,%gs
lss SYMBOL_NAME(stack_start),%esp # the decompressor's stack is the 16K-byte array defined in misc.c
xorl %eax,%eax
1: incl %eax # check that A20 really IS enabled
movl %eax,0x000000 # loop forever if it isn't
cmpl %eax,0x100000
je 1b
/*
* Initialize eflags. Some BIOS's leave bits like NT set. This would
* confuse the debugger if this code is traced.
* XXX - best to initialize before switching to protected mode.
*/
pushl $0
popfl
/*
 * Clear BSS (the decompressor's own BSS segment)
*/
xorl %eax,%eax
movl $ SYMBOL_NAME(_edata),%edi
movl $ SYMBOL_NAME(_end),%ecx
subl %edi,%ecx
cld
rep
stosb
/*
* Do the decompression, and jump to the new kernel..
*/
subl $16,%esp # place for structure on the stack
movl %esp,%eax
pushl %esi # real mode pointer as second arg
pushl %eax # address of structure as first arg
call SYMBOL_NAME(decompress_kernel)
orl %eax,%eax # nonzero return: the kernel was decompressed into two fragments, low and high
jnz 3f
popl %esi # discard address
popl %esi # real mode pointer
xorl %ebx,%ebx
ljmp $(__KERNEL_CS), $0x100000 # jump to the decompressed kernel, which leads to start_kernel
/*
* We come here, if we were loaded high.
* We need to move the move-in-place routine down to 0x1000
* and then start it with the buffer addresses in registers,
* which we got from the stack.
*/
3:
movl $move_routine_start,%esi
movl $0x1000,%edi
movl $move_routine_end,%ecx
subl %esi,%ecx
addl $3,%ecx
shrl $2,%ecx # round up to whole dwords
cld
rep
movsl # copy the fragment-merging routine to 0x1000; the low kernel fragment starts at 0x2000
popl %esi # discard the address
popl %ebx # real mode pointer
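popl %esi # low_buffer_start: start address of the low kernel fragment
popl %ecx # lcount: byte count of the low kernel fragment
popl %edx # high_buffer_start: start address of the high kernel fragment
popl %eax # hcount: byte count of the high kernel fragment
movl $0x100000,%edi # destination address where the fragments are merged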
cli # make sure we don't get interrupted
ljmp $(__KERNEL_CS), $0x1000 # and jump to the move routine
/*
* Routine (template) for moving the decompressed kernel in place,
* if we were high loaded. This _must_ PIC-code !
*/
move_routine_start:
movl %ecx,%ebp
shrl $2,%ecx
rep
movsl # copy the first fragment dword by dword
movl %ebp,%ecx
andl $3,%ecx
rep
movsb # copy the remaining 0 to 3 bytes
movl %edx,%esi
movl %eax,%ecx # NOTE: rep movsb won't move if %ecx == 0
addl $3,%ecx
shrl $2,%ecx # round up to whole dwords
rep
movsl # copy the second fragment dword by dword
movl %ebx,%esi # Restore setup pointer
xorl %ebx,%ebx
ljmp $(__KERNEL_CS), $0x100000 # jump to the merged kernel, which leads to start_kernel
move_routine_end:
; arch/i386/boot/compressed/misc.c
/*
* gzip declarations
*/
#define OF(args) args
#define STATIC static
#undef memset
#undef memcpy
#define memzero(s, n) memset ((s), 0, (n))
typedef unsigned char uch;
typedef unsigned short ush;
typedef unsigned long ulg;
#define WSIZE 0x8000 /* Window size must be at least 32k, */
/* and a power of two */
static uch *inbuf; /* input buffer */
static uch window[WSIZE]; /* Sliding window buffer */
static unsigned insize = 0; /* valid bytes in inbuf */
static unsigned inptr = 0; /* index of next byte to be processed in inbuf */
static unsigned outcnt = 0; /* bytes in output buffer */
/* gzip flag byte */
#define ASCII_FLAG 0x01 /* bit 0 set: file probably ASCII text */
#define CONTINUATION 0x02 /* bit 1 set: continuation of multi-part gzip file */
#define EXTRA_FIELD 0x04 /* bit 2 set: extra field present */
#define ORIG_NAME 0x08 /* bit 3 set: original file name present */
#define COMMENT 0x10 /* bit 4 set: file comment present */
#define ENCRYPTED 0x20 /* bit 5 set: file is encrypted */
#define RESERVED 0xC0 /* bit 6,7: reserved */
#define get_byte() (inptr < insize ? inbuf[inptr++] : fill_inbuf())
/* Diagnostic functions */
#ifdef DEBUG
# define Assert(cond,msg) {if(!(cond)) error(msg);}
# define Trace(x) fprintf x
# define Tracev(x) {if (verbose) fprintf x ;}
# define Tracevv(x) {if (verbose>1) fprintf x ;}
# define Tracec(c,x) {if (verbose && (c)) fprintf x ;}
# define Tracecv(c,x) {if (verbose>1 && (c)) fprintf x ;}
#else
# define Assert(cond,msg)
# define Trace(x)
# define Tracev(x)
# define Tracevv(x)
# define Tracec(c,x)
# define Tracecv(c,x)
#endif
static int fill_inbuf(void);
static void flush_window(void);
static void error(char *m);
static void gzip_mark(void **);
static void gzip_release(void **);
/*
* This is set up by the setup-routine at boot-time
*/
static unsigned char *real_mode; /* Pointer to real-mode data */
#define EXT_MEM_K (*(unsigned short *)(real_mode + 0x2))
#ifndef STANDARD_MEMORY_BIOS_CALL
#define ALT_MEM_K (*(unsigned long *)(real_mode + 0x1e0))
#endif
#define SCREEN_INFO (*(struct screen_info *)(real_mode+0))
extern char input_data[];
extern int input_len;
static long bytes_out = 0;
static uch *output_data;
static unsigned long output_ptr = 0;
static void *malloc(int size);
static void free(void *where);
static void error(char *m);
static void gzip_mark(void **);
static void gzip_release(void **);
static void puts(const char *);
extern int end;
static long free_mem_ptr = (long)&end;
static long free_mem_end_ptr;
#define INPLACE_MOVE_ROUTINE 0x1000 /* run address of the fragment-merging routine */
#define LOW_BUFFER_START 0x2000 /* start address of the low decompression fragment */
#define LOW_BUFFER_MAX 0x90000 /* upper bound of the low decompression fragment */
#define HEAP_SIZE 0x3000 /* heap reserved for the decompressor; it starts at the end of BSS */
static unsigned int low_buffer_end, low_buffer_size;
static int high_loaded =0;
static uch *high_buffer_start /* = (uch *)(((ulg)&end) + HEAP_SIZE)*/;
static char *vidmem = (char *)0xb8000;
static int vidport;
static int lines, cols;
#include "../../../../lib/inflate.c"
static void *malloc(int size)
{
void *p;
if (size < 0) error("Malloc error\n");
if (free_mem_ptr <= 0) error("Memory error\n");
free_mem_ptr = (free_mem_ptr + 3) & ~3; /* Align */
p = (void *)free_mem_ptr;
free_mem_ptr += size;
if (free_mem_ptr >= free_mem_end_ptr)
error("\nOut of memory\n");
return p;
}
static void free(void *where)
{ /* Don't care */
}
static void gzip_mark(void **ptr)
{
*ptr = (void *) free_mem_ptr;
}
static void gzip_release(void **ptr)
{
free_mem_ptr = (long) *ptr;
}
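malloc(), free(), gzip_mark() and gzip_release() above form a bump allocator: allocation only moves a pointer forward, free() does nothing, and a whole batch of allocations is undone by restoring a saved pointer. A minimal user-space sketch of the same idea follows. The 4096-byte array stands in for the memory after &end, and heap_init, bump_alloc, heap_mark and heap_release are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static unsigned char heap[4096];     /* stand-in for the area after &end */
static uintptr_t heap_ptr, heap_end;

static void heap_init(void)
{
    heap_ptr = (uintptr_t)heap;
    heap_end = heap_ptr + sizeof heap;
}

/* Hand out size bytes at the next 4-byte boundary, or NULL when full. */
static void *bump_alloc(size_t size)
{
    assert(heap_end != 0);                         /* heap_init() must have run */
    uintptr_t p = (heap_ptr + 3) & ~(uintptr_t)3;  /* round up to 4 bytes */
    if (p > heap_end || size > heap_end - p)
        return NULL;
    heap_ptr = p + size;
    return (void *)p;
}

/* Remember the current top of the heap. */
static void heap_mark(void **mark)    { *mark = (void *)heap_ptr; }

/* Drop everything allocated since the matching heap_mark(). */
static void heap_release(void **mark) { heap_ptr = (uintptr_t)*mark; }
```

This suits inflate, which allocates its Huffman tables per block and can throw them all away at once. Unlike the kernel version, the sketch returns NULL on exhaustion rather than halting.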
static void scroll(void)
{
int i;
memcpy ( vidmem, vidmem + cols * 2, ( lines - 1 ) * cols * 2 );
for ( i = ( lines - 1 ) * cols * 2; i < lines * cols * 2; i += 2 )
vidmem[ i ] = ' ';
}
static void puts(const char *s)
{
int x,y,pos;
char c;
x = SCREEN_INFO.orig_x;
y = SCREEN_INFO.orig_y;
while ( ( c = *s++ ) != '\0' ) {
if ( c == '\n' ) {
x = 0;
if ( ++y >= lines ) {
scroll();
y--;
}
} else {
vidmem [ ( x + cols * y ) * 2 ] = c;
if ( ++x >= cols ) {
x = 0;
if ( ++y >= lines ) {
scroll();
y--;
}
}
}
}
SCREEN_INFO.orig_x = x;
SCREEN_INFO.orig_y = y;
pos = (x + cols * y) * 2; /* Update cursor position */
outb_p(14, vidport);
outb_p(0xff & (pos >> 9), vidport+1);
outb_p(15, vidport);
outb_p(0xff & (pos >> 1), vidport+1);
}
void* memset(void* s, int c, size_t n)
{
int i;
char *ss = (char*)s;
for (i=0;i<n;i++) ss[i] = c;
return s;
}
void* memcpy(void* __dest, __const void* __src,
size_t __n)
{
int i;
char *d = (char *)__dest, *s = (char *)__src;
for (i=0;i<__n;i++) d[i] = s[i];
return __dest;
}
/* ===========================================================================
* Fill the input buffer. This is called only when the buffer is empty
* and at least one byte is really needed.
*/
static int fill_inbuf(void)
{
if (insize != 0) {
error("ran out of input data\n");
}
inbuf = input_data;
insize = input_len;
inptr = 1;
return inbuf[0];
}
/* ===========================================================================
* Write the output window window[0..outcnt-1] and update crc and bytes_out.
* (Used for the decompressed data only.)
*/
static void flush_window_low(void)
{
ulg c = crc; /* temporary variable */
unsigned n;
uch *in, *out, ch;
in = window;
out = &output_data[output_ptr];
for (n = 0; n < outcnt; n++) {
ch = *out++ = *in++;
c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
}
crc = c;
bytes_out += (ulg)outcnt;
output_ptr += (ulg)outcnt;
outcnt = 0;
}
static void flush_window_high(void)
{
ulg c = crc; /* temporary variable */
unsigned n;
uch *in, ch;
in = window;
for (n = 0; n < outcnt; n++) {
ch = *output_data++ = *in++;
if ((ulg)output_data == low_buffer_end) output_data=high_buffer_start;
c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
}
crc = c;
bytes_out += (ulg)outcnt;
outcnt = 0;
}
static void flush_window(void)
{
if (high_loaded) flush_window_high();
else flush_window_low();
}
static void error(char *x)
{
puts("\n\n");
puts(x);
puts("\n\n -- System halted");
while(1); /* Halt */
}
#define STACK_SIZE (4096)
long user_stack [STACK_SIZE];
struct {
long * a;
short b;
} stack_start = { & user_stack [STACK_SIZE] , __KERNEL_DS };
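void setup_normal_output_buffer(void) /* zImage: decompress straight to 1M */
{
#ifdef STANDARD_MEMORY_BIOS_CALL
if (EXT_MEM_K < 1024) error("Less than 2MB of memory.\n");
#else
if ((ALT_MEM_K > EXT_MEM_K ? ALT_MEM_K : EXT_MEM_K) < 1024) error("Less than 2MB of memory.\n");
#endif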
output_data = (char *)0x100000; /* Points to 1M */
free_mem_end_ptr = (long)real_mode;
}
struct moveparams {
uch *low_buffer_start; int lcount;
uch *high_buffer_start; int hcount;
};
void setup_output_buffer_if_we_run_high(struct moveparams *mv)
{
high_buffer_start = (uch *)(((ulg)&end) + HEAP_SIZE); /* lowest possible start of the high kernel fragment */
#ifdef STANDARD_MEMORY_BIOS_CALL
if (EXT_MEM_K < (3*1024)) error("Less than 4MB of memory.\n");
#else
if ((ALT_MEM_K > EXT_MEM_K ? ALT_MEM_K : EXT_MEM_K) < (3*1024)) error("Less than 4MB of memory.\n");
#endif
mv->low_buffer_start = output_data = (char *)LOW_BUFFER_START;
low_buffer_end = ((unsigned int)real_mode > LOW_BUFFER_MAX
? LOW_BUFFER_MAX : (unsigned int)real_mode) & ~0xfff;
low_buffer_size = low_buffer_end - LOW_BUFFER_START;
high_loaded = 1;
free_mem_end_ptr = (long)high_buffer_start;
if ( (0x100000 + low_buffer_size) > ((ulg)high_buffer_start)) {
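/* If the lowest possible start of the high fragment is below the address where
 * that fragment finally has to live, place it at the final address right away
 * so it never has to be moved; otherwise it must be moved down afterwards. */
high_buffer_start = (uch *)(0x100000 + low_buffer_size);
mv->hcount = 0; /* say: we need not to move high_buffer */
}
else mv->hcount = -1; /* filled in after decompression */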
mv->high_buffer_start = high_buffer_start;
}
void close_output_buffer_if_we_run_high(struct moveparams *mv)
{
if (bytes_out > low_buffer_size) {
mv->lcount = low_buffer_size;
if (mv->hcount)
mv->hcount = bytes_out - low_buffer_size; /* byte count of the high fragment */
} else { /* the decompressed kernel fits entirely in the low fragment */
mv->lcount = bytes_out;
mv->hcount = 0;
}
}
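The split into a low and a high fragment, and the later merge done by move_routine in head.S, can be modelled in a few lines of ordinary C. This is only a sketch under simplifying assumptions: plan_move and merge_fragments are hypothetical names, buffers are plain arrays, and high_in_place stands for the case where setup already set mv->hcount to 0.

```c
#include <assert.h>
#include <string.h>

struct move_plan { long lcount, hcount; };

/* Same decision as close_output_buffer_if_we_run_high(): how many
 * bytes sit in the low buffer and how many must be moved from high. */
static struct move_plan plan_move(long bytes_out, long low_size, int high_in_place)
{
    struct move_plan p;
    assert(bytes_out >= 0 && low_size >= 0);
    if (bytes_out > low_size) {
        p.lcount = low_size;
        p.hcount = high_in_place ? 0 : bytes_out - low_size;
    } else {                    /* everything fits in the low buffer */
        p.lcount = bytes_out;
        p.hcount = 0;
    }
    return p;
}

/* What move_routine does: low fragment first, high fragment right after it. */
static void merge_fragments(unsigned char *dst, const unsigned char *low,
                            const unsigned char *high, struct move_plan p)
{
    memmove(dst, low, (size_t)p.lcount);
    memmove(dst + p.lcount, high, (size_t)p.hcount);
}
```

In the real code dst is 0x100000, and an hcount of 0 means the high fragment was decompressed directly at 0x100000 + low_buffer_size, so only the low fragment is copied.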
int decompress_kernel(struct moveparams *mv, void *rmode)
{
real_mode = rmode;
if (SCREEN_INFO.orig_video_mode == 7) {
vidmem = (char *) 0xb0000;
vidport = 0x3b4;
} else {
vidmem = (char *) 0xb8000;
vidport = 0x3d4;
}
lines = SCREEN_INFO.orig_video_lines;
cols = SCREEN_INFO.orig_video_cols;
if (free_mem_ptr < 0x100000) setup_normal_output_buffer();
else setup_output_buffer_if_we_run_high(mv);
makecrc();
puts("Uncompressing Linux... ");
gunzip();
puts("Ok, booting the kernel.\n");
if (high_loaded) close_output_buffer_if_we_run_high(mv);
return high_loaded;
}
Edited by lucian_yao on 04/28/01 01:36 PM.
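It is the National Day ("October 1st") holiday and I am not going anywhere, so here is a little program for everyone's amusement.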
=============================================
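1. What is this about
Understanding how Linux boots and building your own Linux boot loader deepens your knowledge of Linux, teaches you how a PC starts up and how the machine is organized, and builds confidence for studying the kernel. Such a loader can also be used in special-purpose products (a dedicated server, for example). So, starting from the existing code, I put together a program that boots Linux over the parallel port from a network card's boot ROM. It is offered for fun, and it raises quite a few questions worth studying.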
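2. What Linux requires of a boot loader
Linux (a bzImage kernel) asks fairly little of a boot loader. You only need to set up a boot header (setup.S) that supplies some information, load the kernel (/usr/src/linux/arch/i386/boot/compressed/bvmlinux.out) at absolute address 0x100000 (the 1M mark) and, if there is an initrd, load it high in memory (the farther from 0x100000 the better; an initrd smaller than 4M can go at 0xB00000, i.e. 11M, and machines with less than 16M of memory must be rare by now). Then do some initialization and jump to the kernel.
Of course this is easier said than done. The sections below go through the issues one by one.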
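3. The PC power-on sequence: where to put the boot loader
After power-on the PC starts in real mode, runs its self-test, and then initializes the devices on each expansion bus (ISA, EISA, PCI, AGP).
When all initialization is done, it reads one block (512 bytes) from the current boot device to 07C0:0000 and transfers control there.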
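The load address 07C0:0000 is a real-mode segment:offset pair. A short sketch of how such a pair maps to a linear address (segment times 16 plus offset) may help with the addresses used throughout this thread; real_mode_linear is just an illustrative helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset. */
static uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* 07C0:0000 and 0000:7C00 name the same byte of memory. */
static void check_boot_sector_alias(void)
{
    assert(real_mode_linear(0x07C0, 0x0000) == real_mode_linear(0x0000, 0x7C00));
}
```

So the boot block lands at linear address 0x7C00, and a segment value such as 0x1000 with offset 0 refers to linear 0x10000.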
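Knowing this sequence, we can decide where to put the boot loader:
1) In the MBR (master boot record) of the boot device, for example a disk's boot sector. This is the usual way.
2) In the expansion ROM of a device on an expansion bus, for example a network card's boot ROM. The boot loader built here lives on a network card, which can hold 16K bytes.
3) If some expert could patch the ROM BIOS so that, after initialization, it does not read from the boot device right away but calls a piece of externally added code (2K bytes would be enough; it has to be burned into the BIOS ROM together with the patched BIOS), one could boot straight from the BIOS!
4) Boot some operating system first and write the boot loader as a program for it (lodlin16, for example, boots Linux from DOS, and I believe 中软 supplies a loader that boots Linux from Windows).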
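4. Where to put the operating system
There is far more freedom in where the operating system's data comes from (a kernel is usually 500K to 1M, and with applications the total can be kept under 2M, compressed of course). It can be read from floppy, hard disk, CD-ROM, the network, a tape drive, the parallel port (a kernel and applications burned into a dongle?), a serial port (with your own device attached), a USB device (?), a PCI expansion card, an IC card and so on; further suggestions are welcome. One fellow said that if all else fails you can boot from the keyboard and type the kernel in at every boot. There is even int 16h to support it, it would not be hard to do, and it is surely the cheapest option.
The one rule is that the boot loader must be able to read from the device. The parallel port is the simplest choice here: plain port I/O, no driver and no BIOS support needed, simpler even than a disk (disks normally go through int 13h, where the tedious part is working out cylinders, heads, tracks and sectors; fortunately there is existing source code to learn from).
So we pick the simple scheme: the boot code (bootsect.S+setup.S) goes into the network card's boot ROM, and the kernel and application data sit on another computer that serves them over the parallel port. The sections below discuss the issues involved.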
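5. Moving data to an absolute address
The first problem: the data we receive arrives in real mode, so it normally lands within the first 1M of address space. How do we move it to where it has to go? The setup.S source uses int 15h (function 87h) for this. That code is listed below, slightly modified and with a few simplifying assumptions. The flow is:
if (%cs:move_es==0)/* move_es is initialized to 0 before use, so this is the first call. es:bx is where the
data to be moved is staged: bx=0 and the low 12 bits of es are zero, so es:bx lies on a 64K boundary.
The low 8 bits of fs give the destination address, also in 64K units; nothing finer is needed, which keeps things simple */
{
Shift es right by four bits and take the high byte of the result: this is the staging address in 64K units
(an 8-bit value, so data can only be moved to addresses below 16M). It becomes the top 8 bits of the 24-bit
base in the source descriptor; the low 16 bits are zero.
Use the low 8 bits of fs as the top 8 bits of the 24-bit base in the destination descriptor; its low 16 bits are zero as well.
Store es in move_es. es is never zero, so later calls to this routine do the normal move work.
Clear ax and return.
}
else
{
if (bx==0)/* bx is zero: a full 64K has been collected, so do the actual move */
{
Call int 15h function 87h to do the actual move (64K, i.e. 0x8000 16-bit words).
Add one to the top 8 bits of the 24-bit destination address, advancing it by 64K.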
ax = 1
return;
}
else
{
ax = 0;
return;
}
}
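The address arithmetic in the first call is easy to get wrong, so here is a small sketch of it in C. It assumes the conventions described above (staging segment on a 64K boundary, destination given in 64K units in the low byte of fs); src_base_hi, dst_base_hi and base24 are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Top byte of the 24-bit source base, derived from the staging segment:
 * this is %ah after "shrw $4, %ax" in movetohigh. */
static uint8_t src_base_hi(uint16_t es)
{
    assert((es & 0x0fff) == 0);        /* es:0 must sit on a 64K boundary */
    return (uint8_t)((es >> 4) >> 8);
}

/* Top byte of the 24-bit destination base: the low byte of fs. */
static uint8_t dst_base_hi(uint16_t fs)
{
    return (uint8_t)(fs & 0xff);
}

/* Full 24-bit base once the low 16 bits are left at zero. */
static uint32_t base24(uint8_t hi)
{
    return (uint32_t)hi << 16;
}
```

With es = 0x1000 the source base is 0x010000, fs = 0x0010 gives destination 0x100000 (the kernel), and fs = 0x00B0 gives 0xB00000 (the initrd). Each completed 64K move increments the destination byte, i.e. adds 0x10000.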
# we will move %cx bytes from es:bx to %fs(64Kbytes per unit)
# when we first call movetohigh(%cs:move_es is zero),
# the es:bx and %edx is valid
# we configure the param first
# follow calls will move data actually
# %ax return 0 if no data really moved, and return 1 if there is data
# really to be moved
#
movetohigh:
cmpw $0, %cs:move_es
jnz move_second
# at this point , es:bx(bx = 0) is the source address
# %edx is the destination address
movb $0x20, %cs:type_of_loader
movw %es, %ax
shrw $4, %ax
movb %ah, %cs:move_src_base+2
movw %fs, %ax
movb %al, %cs:move_dst_base+2
movw %es, %ax
movw %ax, %cs:move_es
xorw %ax, %ax
ret # nothing else to do for now
move_second:
xorw %ax, %ax
testw %bx, %bx
jne move_ex
pushw %ds
pushw %cx
pushw %si
pushw %bx
movw $0x8000, %cx # full 64K, INT15 moves words
pushw %cs
popw %es
leaw %cs:move_gdt, %si
movw $0x8700, %ax
int $0x15
jc move_panic # this, if INT15 fails
movw %cs:move_es, %es # we reset %es to always point
incb %cs:move_dst_base+2 # to 0x10000
popw %bx
popw %si
popw %cx
popw %ds
movw $1, %ax
move_ex:
ret
move_gdt:
.word 0, 0, 0, 0
.word 0, 0, 0, 0
move_src:
.word 0xffff
move_src_base:
.byte 0x00, 0x00, 0x01 # base = 0x010000
.byte 0x93 # typbyte
.word 0 # limit16,base24 =0
move_dst:
.word 0xffff
move_dst_base:
.byte 0x00, 0x00, 0x10 # base = 0x100000
.byte 0x93 # typbyte
.word 0 # limit16,base24 =0
.word 0, 0, 0, 0 # BIOS CS
.word 0, 0, 0, 0 # BIOS DS
move_es:
.word 0
move_panic:
pushw %cs
popw %ds
cld
leaw move_panic_mess, %si
call prtstr
move_panic_loop:
jmp move_panic_loop
move_panic_mess:
.string "INT15 refuses to access high mem, giving up."
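6. Transferring data over the parallel port
Code for transferring data over the parallel port can be borrowed from /usr/src/linux/drivers/net/plip.c. We use the nibble (half-byte) protocol; see that file for the cable wiring. Bytes are sent and received as follows: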
#define PORT_BASE 0x378
#define data_write(b) outportb(PORT_BASE, b)
#define data_read() inportb(PORT_BASE+1)
#define OK 0
#define TIMEOUT 1
#define FAIL 2
int sendbyte(unsigned char data)
{
unsigned char c0;
unsigned long cx;
data_write((data & 0x0f));
data_write((0x10 | (data & 0x0f)));
cx = 32767l * 1024l;
while (1) {
c0 = data_read();
if ((c0 & 0x80) == 0)
break;
if (--cx == 0)
return TIMEOUT;
}
data_write(0x10 | (data >> 4));
data_write((data >> 4));
cx = 32767l * 1024l;
while (1) {
c0 = data_read();
if (c0 & 0x80)
break;
if (--cx == 0)
return TIMEOUT;
}
return OK;
}
int rcvbyte(unsigned char * pByte)
{
unsigned char c0, c1;
unsigned long cx;
cx = 32767l * 1024l;
while (1) {
c0 = data_read();
if ((c0 & 0x80) == 0) {
c1 = data_read();
if (c0 == c1)
break;
}
if (--cx == 0)
return TIMEOUT;
}
*pByte = (c0 >> 3) & 0x0f;
data_write(0x10); /* send ACK */
cx = 32767l * 1024l;
while (1) {
c0 = data_read();
if (c0 & 0x80) {
c1 = data_read();
if (c0 == c1)
break;
}
if (--cx == 0)
return TIMEOUT;
}
*pByte |= (c0 << 1) & 0xf0;
data_write(0x00); /* send ACK */
return OK;
}
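The shifts in rcvbyte() make more sense with the cable in mind. The sketch below assumes the PLIP-style wiring that plip.c describes: the sender's data bits D0 to D3 show up on the receiver's status bits 3 to 6, and D4 shows up inverted on status bit 7, which is why it can serve as the handshake line. wire_status and join_nibbles are hypothetical names, and the wiring model is an assumption rather than something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cable model: value written to the sender's data port ->
 * value the receiver reads from its status port. */
static uint8_t wire_status(uint8_t data_out)
{
    uint8_t nibble = (uint8_t)((data_out & 0x0f) << 3);   /* D0-D3 -> bits 3-6 */
    uint8_t strobe = (data_out & 0x10) ? 0x00 : 0x80;     /* D4 -> bit 7, inverted */
    return (uint8_t)(nibble | strobe);
}

/* Rebuild a byte from the two status samples taken by rcvbyte(). */
static uint8_t join_nibbles(uint8_t status_low, uint8_t status_high)
{
    assert((status_low & 0x80) == 0);    /* first sample: bit 7 reads 0 */
    assert((status_high & 0x80) != 0);   /* second sample: bit 7 reads 1 */
    uint8_t low  = (status_low >> 3) & 0x0f;
    uint8_t high = (uint8_t)(status_high << 1) & 0xf0;
    return (uint8_t)(high | low);
}
```

sendbyte() puts the low nibble on the wire with D4 set and the high nibble with D4 clear, which under this model produces exactly the two samples join_nibbles() expects.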
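To receive bytes inside setup.S, the receive routine was converted to AT&T assembly syntax. (I found no better way than compiling the code above to assembly with TURBO C 2.0 under DOS and converting it to AT&T format by hand. I hear there are programs that do this conversion automatically; if you have used one, please let me know.)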
rcvbyte:
pushw %bp
movw %sp, %bp
subw $6, %sp
movw $511, -2(%bp)
movw $-1024, -4(%bp)
jmp .L13
.L15:
movw $889, %dx
inb %dx, %al
movb %al, -6(%bp)
testb $128, -6(%bp)
jne .L16
inb %dx, %al
movb %al, -5(%bp)
movb -6(%bp), %al
cmpb -5(%bp), %al
jne .L17
jmp .L14
.L17:
.L16:
subw $1, -4(%bp)
sbbw $0, -2(%bp)
movw -2(%bp), %dx
movw -4(%bp), %ax
orw %ax, %dx
jne .L18
movw $1, %ax
jmp .L12
.L18:
.L13:
jmp .L15
.L14:
movb -6(%bp), %al
shrb $1, %al
shrb $1, %al
shrb $1, %al
andb $15, %al
movw 4(%bp), %bx
movb %al, (%bx)
movb $16, %al
movw $888, %dx
outb %al, %dx
movw $511, -2(%bp)
movw $-1024, -4(%bp)
jmp .L19
.L21:
movw $889, %dx
inb %dx, %al
movb %al, -6(%bp)
testb $128, %al
je .L22
inb %dx, %al
movb %al, -5(%bp)
movb -6(%bp), %al
cmpb -5(%bp), %al
jne .L23
jmp .L20
.L23:
.L22:
subw $1, -4(%bp)
sbbw $0, -2(%bp)
movw -2(%bp), %dx
movw -4(%bp), %ax
orw %ax, %dx
jne .L24
movw $1, %ax
jmp .L12
.L24:
.L19:
jmp .L21
.L20:
movb -6(%bp), %al
shlb $1, %al
andb $240, %al
movw 4(%bp), %bx
orb %al, (%bx)
xorw %ax, %ax
movw $888, %dx
outb %al, %dx
jmp .L12
.L12:
movw %bp, %sp
popw %bp
ret
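Sending and receiving bytes is not enough. A protocol has to mark where the data starts and ends, and it should do some simple error detection. Packets are encoded with character stuffing: '\' is the escape character, \H marks the start of a packet, \T marks its end, and a '\' inside the data is sent as \\ (an idea borrowed from printf format strings). One checksum byte follows the packet. With that we can send and receive packets; the code is: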
int rcvpack(unsigned char * pData, int * pLength)
{
int ret;
int length;
unsigned char checksum;
int maxlength;
int status;
maxlength = *pLength + 1;
if (maxlength<=0)
return FAIL;
if (pData == NULL)
return FAIL;
checksum = 0;
length = 0;
status = 0;
while (1)
{
unsigned char ch;
int count;
count = 10;
while (1)
{
if ((ret = rcvbyte(&ch)) != OK)
{
count--;
if (count==0)
{
printf("\nReceive byte timeout\n");
return ret;
}
}
else
break;
}
switch (status)
{
case 0:
{
if (ch == '\\')
{
status = 1;
}
}
break;
case 1:
{
if (ch == 'H')
status = 2;
else
status = 0;
}
break;
case 2:
{
if (ch == '\\')
{
status = 3;
}
else
{
length ++;
if (length>maxlength)
{
printf("Buffer overflow(%d>%d)\n", length, maxlength);
return FAIL;
}
*pData++ = ch;
checksum += ch;
}
}
break;
case 3:
{
if (ch == '\\')
{
length++;
if (length>maxlength)
{
printf("Buffer overflow (%d>%d)\n", length, maxlength);
return FAIL;
}
checksum += ch;
*pData++ = ch;
status = 2;
}
else
if (ch =='T')
{
unsigned char chk;
*pLength = length;
if (rcvbyte(&chk)!=OK)
return FAIL;
if (checksum==chk)
{
return OK;
}
else
{
printf("ERROR: Checksum mismatch(%d-%d)\n", checksum,chk);
return FAIL;
}
}
else
{
printf("ERROR: a '\\' or 'T' expected('%c')!\n ", ch);
return FAIL;
}
}
}
}
}
int sendpack(unsigned char * pData, int length)
{
int ret;
unsigned char checksum;
checksum = 0;
if (length<=0)
return OK;
if ((ret = sendbyte('\\')) != OK)
return 1;
if ((ret = sendbyte('H')) != OK)
return 2;
while (length>0)
{
unsigned char ch;
ch = *pData++;
checksum += ch;
if ((ret = sendbyte(ch)) != OK)
return 3;
if (ch == '\\')
{
if ((ret = sendbyte(ch)) != OK)
return 4;
}
length--;
}
if ((ret = sendbyte('\\')) != OK)
return 5;
if ((ret = sendbyte('T')) != OK)
return 6;
if ((ret = sendbyte(checksum)) != OK)
return 7;
return OK;
}
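To see what actually travels over the wire, the framing done by sendpack() can be reproduced in memory. The sketch below is an illustration only: frame_encode is a hypothetical helper that writes the same byte sequence into a buffer instead of calling sendbyte(), and it refuses to encode when the buffer could overflow in the worst case.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one packet: "\H", the data with every '\' doubled, "\T",
 * then a one-byte sum of the unescaped data.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t frame_encode(const unsigned char *data, size_t len,
                           unsigned char *out, size_t cap)
{
    size_t n = 0;
    unsigned char sum = 0;

    assert(data != NULL && out != NULL);
    if (cap < 2 * len + 5)          /* worst case: every byte is escaped */
        return 0;
    out[n++] = '\\';
    out[n++] = 'H';
    for (size_t i = 0; i < len; i++) {
        sum = (unsigned char)(sum + data[i]);
        out[n++] = data[i];
        if (data[i] == '\\')        /* stuff a second backslash */
            out[n++] = '\\';
    }
    out[n++] = '\\';
    out[n++] = 'T';
    out[n++] = sum;                 /* checksum is sent as-is, not escaped */
    return n;
}
```

For the three data bytes a, \, b the encoded packet is \ H a \ \ b \ T followed by the checksum byte, nine bytes in total.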
Likewise, rcvpack was converted to AT&T assembly (with a few printf calls dropped):
chbuffer:
.byte 0
overflow:
.string "Buffer overflow..."
rcvpack:
pushw %bp
movw %sp, %bp
subw $12, %sp
pushw %si
movw 4(%bp), %si
movw 6(%bp), %bx
movw (%bx), %ax
incw %ax
movw %ax, -6(%bp)
cmpw $0, -6(%bp)
jg .L26
leaw overflow, %si
call prtstr
movw $2, %ax
jmp .L25
.L26:
orw %si, %si
jne .L27
movw $2, %ax
jmp .L25
.L27:
movb $0,-8(%bp)
movw $0, -10(%bp)
movw $0, -4(%bp)
jmp .L28
.L30:
movw $10, -2(%bp)
jmp .L31
.L33:
# movw -4(%bp), %ax
# addb $'0', %al
# call prtchr
leaw chbuffer, %ax
pushw %ax
call rcvbyte
popw %cx
movw %ax, -12(%bp)
orw %ax, %ax
je .L34
decw -2(%bp)
cmpw $0, -2(%bp)
jne .L35
movw -12(%bp), %ax
jmp .L25
.L35:
jmp .L36
.L34:
jmp .L32
.L36:
.L31:
jmp .L33
.L32:
pushw %si
leaw chbuffer, %si
movb (%si), %al
movb %al, -7(%bp)
popw %si
# call prtchr
movw -4(%bp), %ax
cmpw $3, %ax
jbe .L58
jmp .L56
.L58:
cmpw $0, %ax
je .L38
cmpw $1, %ax
je .L40
cmpw $2, %ax
je .L43
cmpw $3, %ax
je .L47
jmp .L56
.L38:
cmpb $92, -7(%bp)
jne .L39
movw $1, -4(%bp)
.L39:
jmp .L37
.L40:
cmpb $72, -7(%bp)
jne .L41
movw $2, -4(%bp)
jmp .L42
.L41:
movw $0, -4(%bp)
.L42:
jmp .L37
.L43:
cmpb $92, -7(%bp)
jne .L44
movw $3, -4(%bp)
jmp .L45
.L44:
incw -10(%bp)
movw -10(%bp), %ax
cmpw -6(%bp), %ax
jle .L46
movw $2, %ax
jmp .L25
.L46:
movb -7(%bp), %al
movb %al, (%si)
incw %si
movb -7(%bp), %al
addb %al, -8(%bp)
.L45:
jmp .L37
.L47:
cmpb $92, -7(%bp)
jne .L48
incw -10(%bp)
movw -10(%bp), %ax
cmpw -6(%bp), %ax
jle .L49
movw $2, %ax
jmp .L25
.L49:
movb -7(%bp), %al
addb %al, -8(%bp)
movb -7(%bp), %al
movb %al, (%si)
incw %si
movw $2, -4(%bp)
jmp .L50
.L48:
cmpb $84, -7(%bp)
jne .L51
movw -10(%bp), %ax
movw 6(%bp), %bx
movw %ax, (%bx)
leaw chbuffer, %ax
pushw %ax
call rcvbyte
popw %cx
orw %ax, %ax
je .L52
movw $2, %ax
jmp .L25
.L52:
movb -8(%bp), %al
cmpb chbuffer, %al
jne .L53
xorw %ax, %ax
jmp .L25
jmp .L54
sChecksumFailed:
.string "Checksum error!"
.L53:
leaw sChecksumFailed, %si
call prtstr
movw $2, %ax
jmp .L25
.L54:
jmp .L55
.L51:
movw $2, %ax
jmp .L25
.L55:
.L50:
.L56:
.L37:
.L28:
jmp .L30
.L29:
.L25:
popw %si
movw %bp, %sp
popw %bp
ret
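Everything is now in place. First, use the C code above to write the "server" program for the other computer (it also serves for testing). That computer runs DOS; compile and run the program with TURBO C 2.0:
To run it, copy initrd.img and the compiled kernel /usr/src/linux/arch/i386/boot/compressed/bvmlinux.out to c:\ on that computer, then start the program with the arguments s c:\bvmlinux.out c:\initrd.img.
The boot loader itself still needs a few changes before it can be burned into the boot ROM; see the notes further on.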
int main(int argc, char* argv[])
{
FILE* pFile;
int count = 2;
if (argc<3)
{
printf("Usage testspp [s | r] \n");
return 1;
}
while(count<argc)
{
if (argv[1][0] == 's')
pFile = fopen(argv[count], "rb");
else
pFile = fopen(argv[count], "wb");
if (pFile==NULL)
{
printf("Can't open/create file %s\n", argv[count]);
return 2;
}
if (argv[1][0]=='r')/*receive*/
{
unsigned long filesize;
char buffer[10244];
int length;
/*get filelength */
length = 10244;
printf("Receiving filesize package\n");
while( (rcvpack(buffer, &length)!=OK) && (length!=4))
length = 10244;
filesize = *(long*)buffer;
printf("file size is:%ld\n", filesize);
while (filesize>0)
{
length = 10244;
if (rcvpack(buffer, &length) != OK)
{
printf("Receive data package failed\n");
return 0;
}
if (length>0)
fwrite(buffer, 1, length, pFile);
filesize-=length;
printf("\r%ld Bytes Left ", filesize);
}
}
else/*send*/
{
unsigned long filesize;
/*send file length*/
unsigned long stemp;
int ret;
fseek(pFile, 0, 2);
filesize = ftell(pFile);
fseek(pFile, 0, 0);
printf("\nfile size is:%ld\n", filesize);
/*
while ((ret = sendpack((char *)&filesize, 4)) != OK)
{
printf("send file size failed(%d)\n", ret);
}
*/
while (filesize>0)
{
char buffer[10240];
long size;
int ret;
size = fread(buffer, 1, 10240, pFile);
if ((ret = sendpack(buffer, size)) != OK)
{
printf("Send data package failed(%d)\n", ret);
return 0;
}
filesize -= size;
printf("\r\t%ld Bytes Left", filesize);
}
}
fclose(pFile);
count++;
}/*while*/
return 0;
}
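5. Changes to bootsect.S
The main issue with the current bootsect.S is that it reads its data from floppy; replacing that code with calls to rcvpack takes care of it. It also cannot load an initrd, so code for that must be added. The catch is that bootsect.S has no room for the rcvpack code (it is only 512 bytes; that limit disappears once the code is burned into a boot ROM, but it does apply when debugging from floppy). So load_kernel and load_initrd are implemented in setup.S and simply called back from bootsect.S.
bootsect.S is changed as follows (only the modified part is shown):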
.....
.....
ok_load_setup:
call kill_motor
call print_nl
# Now we will load kernel and initrd
loading:
# first print the "Loading" message
movw $INITSEG, %ax
movw %ax, %es # set up es
movb $0x03, %ah # read cursor pos
xorb %bh, %bh
int $0x10
movw $22, %cx
movw $0x0007, %bx # page 0, attribute 7 (normal)
movw $msg1, %bp
movw $0x1301, %ax # write string, move cursor
int $0x10 # tell the user we're loading..
load_kernel_img:
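# The pointer to load_kernel is stored at 0x22C, so it only has to be called here. (When booting from floppy,
# setup.S has already been read from disk to just after bootsect.S, i.e. from 0x0200 on. Note that setup.S
# begins with a table, which is being consumed "early" here.)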
# 0x22C is the load kernel routine
bootsect_readimage = 0x22C
lcall bootsect_readimage
load_initrd_img:
# The pointer to load_initrd is stored at 0x220
# 0x220 is the load initrd routine
bootsect_readinitrd = 0x220
lcall bootsect_readinitrd
# After that (everything loaded), we jump to the setup-routine
# loaded directly after the bootblock:
ljmp $SETUPSEG, $0
......
......
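6. Changes to setup.S
The changes to setup.S are mainly these: modify the setup.S header and add the load_kernel and load_initrd functions, as detailed below.
The setup.S header becomes (some of the original comments were removed for readability):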
start:
jmp trampoline
.ascii "HdrS" # header signature
.word 0x0202 # header version number (>= 0x0105)
realmode_swtch: .word 0, 0 # default_switch, SETUPSEG
start_sys_seg: .word SYSSEG
.word kernel_version # pointing to kernel version string
type_of_loader: .byte 0
loadflags:
LOADED_HIGH = 1
.byte LOADED_HIGH # only bzImage is supported
setup_move_size: .word 0x8000
code32_start: # here loaders can put a different
.long 0x100000 # 0x100000 = default for big kernel
ramdisk_image: .long 0xB00000 # the ramdisk is loaded at 11M
ramdisk_size: .long 0 # the length is filled in by load_initrd
bootsect_kludge:
.word load_initrd, SETUPSEG # 0x220: pointer to the load_initrd function
heap_end_ptr: .word modelist+1024
pad1: .word 0
cmd_line_ptr: .long 0
load_kernel_call:
.word load_kernel, SETUPSEG
trampoline: call start_of_setup
.space 1024
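The fixed offsets 0x220 and 0x22C that bootsect.S calls through follow directly from the field sizes in this header. As a cross-check, the layout can be written as a packed C struct. This is a sketch of the modified header shown above, not a kernel definition: struct setup_header here, HDR_OFF and header_is_valid are hypothetical names, and load_kernel_call is the field this post adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
struct setup_header {            /* begins at offset 0x200 of the boot image */
    uint8_t  jump[2];            /* 0x200: jmp trampoline */
    uint8_t  signature[4];       /* 0x202: "HdrS" */
    uint16_t version;            /* 0x206 */
    uint32_t realmode_swtch;     /* 0x208 */
    uint16_t start_sys_seg;      /* 0x20c */
    uint16_t kernel_version;     /* 0x20e */
    uint8_t  type_of_loader;     /* 0x210 */
    uint8_t  loadflags;          /* 0x211 */
    uint16_t setup_move_size;    /* 0x212 */
    uint32_t code32_start;       /* 0x214 */
    uint32_t ramdisk_image;      /* 0x218 */
    uint32_t ramdisk_size;       /* 0x21c */
    uint32_t bootsect_kludge;    /* 0x220: offset and segment of load_initrd */
    uint16_t heap_end_ptr;       /* 0x224 */
    uint16_t pad1;               /* 0x226 */
    uint32_t cmd_line_ptr;       /* 0x228 */
    uint32_t load_kernel_call;   /* 0x22c: offset and segment of load_kernel */
};
#pragma pack(pop)

/* Offset of a header field from the start of the boot image. */
#define HDR_OFF(field) (0x200 + offsetof(struct setup_header, field))

/* True if the image is long enough and carries the "HdrS" signature. */
static int header_is_valid(const unsigned char *image, size_t len)
{
    assert(image != NULL);
    return len >= 0x200 + sizeof(struct setup_header) &&
           memcmp(image + HDR_OFF(signature), "HdrS", 4) == 0;
}
```

The short jump at the start is taken to be 2 bytes, which is what puts "HdrS" at 0x202 and the two far pointers at 0x220 and 0x22C.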
load_kernel and load_initrd:
load_imsg:
.byte 13, 10
.string "Load INITRD from PARPort(378)"
load_kmsg:
.byte 13, 10
.string "Load Kernel From PARPort(378)"
reading_suc:
.string "."
reading_failed:
.string " failed"
read_len:
.word 0, 0
read_total:
.word 0, 0
read_buffer:
# How does one write the Intel-syntax "db 1280 dup(0)" in AT&T syntax? Pointers welcome.
# And where can the AT&T assembler syntax be looked up?
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
.string "012345678901234567890123456789012345678901234567890123456789"
load_initrd:
pushw %ds
pushw %es
pushw %cs
popw %ds
pushw %cs
popw %es
cld
leaw load_imsg, %si
call prtstr # print the prompt
movw $0x1000, %ax
movw %ax, %es
xorw %bx, %bx
movw $0x00B0, %ax # initrd data is staged at 0x1000:0000 and,
# each time 64K is full, moved up to 11M (0xB00000)
movw %ax, %fs
movw $0, %cs:move_es
movl $0, %cs:read_total
call movetohigh # initialize the data mover
call .ld_img # read one file from the parallel port and move it into place
movl %cs:read_total, %eax
movl %eax, %cs:ramdisk_size # set ramdisk_size and ramdisk_image
movl $0x00B00000, %eax
movl %eax, %cs:ramdisk_image
popw %es
popw %ds
lret
load_kernel:
pushw %ds
pushw %es
pushw %cs
popw %ds
pushw %cs
popw %es
cld
leaw load_kmsg, %si
call prtstr
movw $0x1000, %ax
movw %ax, %es
xorw %bx, %bx
movw $0x0010, %ax
movw %ax, %fs
movw $0, %cs:move_es
movl $0, %cs:read_total
call movetohigh
call .ld_img
popw %es
popw %ds
lret
.ld_img:
.ld_nextpack:
pushw %bx
pushw %es
leaw read_len, %si
movw $1124, %ax
movw %ax, (%si)
pushw %si
leaw read_buffer, %ax
pushw %ax
movw %bx, %ax
call rcvpack # call rcvpack to receive one packet into read_buffer
popw %cx
popw %cx
popw %es
popw %bx
cmpw $0, %ax # success?
je .ld_suc
leaw reading_failed, %si
call prtstr
.ld_panic:
jmp .ld_panic # on failure, loop forever
.ld_suc:
leaw read_buffer, %si
movw %bx, %di
movw $256, %cx # move 1024 bytes
rep
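movsl # copy from read_buffer to es:bx; every packet is assumed to be
# exactly 1024 bytes long, except the last one.
addw $1024, %bx # advance bx; if bx wraps to zero, 64K is full and the call below
call movetohigh # performs the actual move
movw %ax, %dx #
cmpw $0, %ax # if a 64K block was moved, print a '.'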
je .ld_1
leaw reading_suc, %si
call prtstr
.ld_1:
leaw read_len, %si
xorl %eax, %eax
movw (%si), %ax
addl %eax, %cs:read_total
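cmpw $1024, %ax # the running total has been updated; a packet shorter than 1024 bytes
# is taken to be the last one. This is a gamble: what if the last packet
# happens to be exactly 1024 bytes? Let's take the bet!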
jb .ld_lastestpack
jmp .ld_nextpack # 接着接收下一个数据包
.ld_lastestpack:
# the last packet rarely completes a 64K block, so force a final move here
cmpw $0, %dx
jne .ld_exit
xorw %bx, %bx
call movetohigh
.ld_exit:
ret
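The gamble mentioned in .ld_img can be made precise. Assuming the sender cuts the file into packets of at most 1024 bytes, as .ld_img expects, the "a short packet is the last packet" rule only terminates when the file size is not a multiple of 1024. The helper names below are hypothetical; the sketch only states the condition.

```c
#include <assert.h>

#define PACKET_SIZE 1024L

/* Number of packets of at most PACKET_SIZE bytes needed for a file. */
static long packets_for(long file_size)
{
    assert(file_size >= 0);
    return (file_size + PACKET_SIZE - 1) / PACKET_SIZE;
}

/* Does the receiver ever see a packet shorter than PACKET_SIZE?
 * If not, the loop in .ld_img waits for a packet that never comes. */
static int short_packet_rule_terminates(long file_size)
{
    assert(file_size >= 0);
    return file_size % PACKET_SIZE != 0;
}
```

One way to remove the gamble would be to send the file size first, which is what the commented-out sendpack((char *)&filesize, 4) call in the server program appears to have been meant for. The receiver could then stop when the byte count reaches that size.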
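7. Debugging from floppy, and burning the boot loader into the boot ROM
That's it. Configure the kernel and run make bzImage, copy bvmlinux.out to the "server", and build an initrd and put it on the "server" too. Then put a floppy in the drive and run dd if=/usr/src/linux/arch/i386/boot/bzImage of=/dev/fd0 count=32 to copy the bootsect.S+setup.S part onto it, and reboot (connect the parallel cable first). Once the machine has started, launch the file "server" program on the "server", and Linux finally boots over the parallel port!
With a few adjustments (mainly removing the code that reads setup.S), the first 8K (16K?) of this bzImage can be written to a file, turned into a boot ROM image, burned into a boot ROM and plugged into the network card; then just start the machine. That is booting Linux over the parallel port from a network card.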
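Subject: Re: Booting Linux over the parallel port from a network card (I386) [re: raoxianhong]
Author: raoxianhong (journeyman)
Date: 10/09/01 11:30 AM
I have read online that a boot ROM can be written into the BIOS, but my attempts did not succeed. I don't know what the trick is; has anyone tried it?
Looking for the file cbrom.pdf
Subject: Two recommended articles on the boot process [re: feiyunw]
Author: raoxianhong (journeyman)
Date: 10/11/01 09:08 AM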
http://www.pcguide.com/ref/mbsys/bios/boot.htm
http://www2.csa.iisc.ernet.in/~kvs/LinuxBoot.html
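Subject: Re: 386 boot code analysis [re: feiyunw]
Author: raoxianhong (member)
Date: 10/25/01 05:09 PM
Attachment: 181431-bootrom.zip
Several people have mailed me about the boot code for network-card booting. Here is a summary:
1. After the self-test, the system scans the ROM address space (in 2K-byte steps, I believe). If the first two bytes of a block are 0x55 0xAA, the third byte gives the size of the ROM program in units of 512 bytes. All bytes in that region are then added up (a checksum); a result of zero means the ROM program is valid. The BIOS then makes a far call (lcall) to the fourth byte of the block (which should naturally return with lret).
2. One thing puzzled me for a long time: if a network card boots the system at that point while cards with ROMs further along (PCI cards, say) have not been visited yet, their ROM programs never get a chance to run. And if device expansion ROMs are not run, I don't know whether Linux will have trouble! Later reading showed that building a network-card boot program is in fact not that simple.
3. In fact, after the self-test and after all expansion hardware has been probed, the system starts the operating system through int 19h! So the expansion ROM should not boot the operating system directly. It should install the operating system's boot code as the int 19h handler (which does not even need to return, since the operating system has no reason to return).
Once this is understood, writing a network-card boot program is much easier. For details, look at the boot source code of some network card; one is in the attachment, though I no longer remember where I copied it from!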
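The validity test from point 1 is simple enough to write down. The sketch below checks an in-memory copy of a ROM image the way the post describes (signature bytes, length byte in 512-byte units, byte sum of zero). option_rom_length and option_rom_fix_checksum are hypothetical names, and this models only the check, not the BIOS scan itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the ROM length in bytes if the image looks valid, else 0. */
static size_t option_rom_length(const uint8_t *rom, size_t avail)
{
    assert(rom != NULL);
    if (avail < 3 || rom[0] != 0x55 || rom[1] != 0xAA)
        return 0;                            /* no signature */
    size_t len = (size_t)rom[2] * 512;       /* size byte counts 512-byte units */
    if (len == 0 || len > avail)
        return 0;
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + rom[i]);
    return sum == 0 ? len : 0;               /* all bytes must add up to zero */
}

/* Set the last byte so that the whole image sums to zero. */
static void option_rom_fix_checksum(uint8_t *rom, size_t len)
{
    assert(rom != NULL && len > 0);
    uint8_t sum = 0;
    for (size_t i = 0; i + 1 < len; i++)
        sum = (uint8_t)(sum + rom[i]);
    rom[len - 1] = (uint8_t)(0u - sum);
}
```

The entry point that receives the far call is the byte at offset 3, right after the length byte, which is where RomCode sits in the stub posted later in this thread.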
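Subject: A generic network-card bootrom processing program [re: feiyunw]
Author: raoxianhong (member)
Date: 12/06/01 08:05 PM
A bootrom has to be processed before it can be burned into an EPROM. Here is code that does this; the program for booting Linux over the parallel port described above was processed this way.
The basic idea is to write a generic boot code loader (stub) that installs bootsect.S+setup.S (the front part of bzImage) as the handler for interrupt 0x19. A separate program then merges the stub with the boot code into a valid bootrom image, which can be burned into a bootrom and booted from the network card.
Here is the generic boot code loader: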
.code16
RomHeader:
.byte 0x55, 0xaa # boot ROM signature
RomPageCount:
.byte 0x20 # assume the bootrom is 16K bytes
RomCode:
pushw %es
pushw %bx
pushw %ax
movb $0xc1, %al
call IntVectAddr
movw $0x6a6e, %ax
cmpw %es:(%bx), %ax
jz RomBootInit_x
movw %ax, %es:(%bx)
movw $0xc019, %ax
call MoveIntVector
movw $RomBootVect, %bx
pushw %cs
popw %es
call SetIntVector
RomBootInit_x:
popw %ax
popw %bx
popw %es
lret
IntVectAddr:
xorw %bx,%bx
movw %bx,%es
movb %al,%bl
addw %bx,%bx
addw %bx,%bx
ret
GetIntVector:
call IntVectAddr
GetIntVect_1:
les %es:(%bx), %bx
ret
SetIntVector:
pushf #; entry AL=vector to set, ES:BX=value
pushw %es #; exit: vector modified
pushw %bx #; all registers preserved
call IntVectAddr
cli
popw %es:(%bx)
addw $2, %bx
popw %es:(%bx)
subw $2, %bx
popf
jmp GetIntVect_1
MoveIntVector:
call GetIntVector #; entry AL=vect to get, AH=vect to set
xchgb %al,%ah #; exit: vector set, ES:BX=vector value
call SetIntVector #; other registers preserved
xchgb %al,%ah
ret
RomBootVect:
pushw %cs
popw %ds
movw $0x07c0, %ax
movw %ax, %es
movw $BootCode, %si
subw %di, %di
movw $8192, %cx
cld
rep
movsw
ljmp $0x07c0, $0
lret
.org 0x0200
BootCode:
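It is built on Linux the same way as bootsect.S; the resulting executable goes into a file, say bootx.
Compiling the kernel (make bzImage, with support for the boot method described above) yields the bzImage file.
The following program combines the two files into a bootrom image: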
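/* mkbtrom.c */
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
    char buf[16384];
    char ch;
    int i;
    FILE * pFile;
    if (argc<4)
    {
        printf("Usage: mkbtrom \n");
        return 1;
    }
    memset(buf, 0, sizeof(buf)); /* pad short input files with zeros */
    pFile = fopen(argv[1], "rb");
    if (pFile==NULL)
    {
        printf("File %s open failed\n", argv[1]);
        return 2;
    }
    fread(buf, 1, 512, pFile); /* the stub fills the first 512 bytes */
    fclose(pFile);
    pFile = fopen(argv[2], "rb");
    if (pFile==NULL)
    {
        printf("File %s open failed\n", argv[2]);
        return 2;
    }
    fread(&buf[512], 1, 16384-512-1, pFile); /* boot code follows the stub */
    fclose(pFile);
    ch = 0;
    for (i = 0;i<16383;i++) /* sum every byte except the last */
        ch += buf[ i ];
    buf[16383] = -ch; /* make the byte sum of the whole image zero */
    pFile = fopen(argv[3], "wb");
    if (pFile==NULL)
    {
        printf("File %s open failed\n", argv[3]);
        return 2;
    }
    fwrite(buf, 1, 16384, pFile);
    fclose(pFile);
    return 0;
}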
After compiling it, run mkbtrom bootx bzImage boot16k.bin; boot16k.bin can then be burned into an EPROM and booted from the network card.