Verifying ARM Instructions on the Raspberry Pi

made by Rk

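This article was written with strong support from the Embedded Systems course at Zhejiang University.

Thanks to the instructor, Weng Kai (@翁恺BA5AG).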

/*************************************************************/

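Using C code and a disassembler, this article investigates the following points:

Does gcc generate Thumb or ARM instructions, and how can compiler options change this?
For ARM instructions, can the compiler generate conditionally executed instructions?
Design a C code scenario and check whether shifted-register addressing is generated.
Design a C code scenario and observe how a complex 32-bit number is loaded into a register.
Write a C program with nested function calls, then observe and analyze:
    Where is the return address at the time of the call?
    Where are the passed arguments?
    How is stack space allocated for local variables?
    Are registers saved by the caller or by the callee? Are all of them saved, or only some?
MLA is multiply with accumulate. Find out how to write a C expression that compiles to an MLA instruction.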


To keep the experiment code complete and usable in one place, create a new folder named arm.



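Does gcc generate Thumb or ARM instructions, and how can compiler options change this?

1. Write the following code and save it as arm_1.c: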
#include <stdio.h>
int main(){
        int t = 0;
        t++;
        return 0;
}

2. Compile and disassemble:
gcc -c arm_1.c
objdump -d arm_1.o

The result is as follows:
pi@raspberrypi ~/code/arm $ objdump -d arm_1.o

arm_1.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <main>:
   0:	e52db004 	push	{fp}		; (str fp, [sp, #-4]!)
   4:	e28db000 	add	fp, sp, #0
   8:	e24dd00c 	sub	sp, sp, #12
   c:	e3a03000 	mov	r3, #0
  10:	e50b3008 	str	r3, [fp, #-8]
  14:	e51b3008 	ldr	r3, [fp, #-8]
  18:	e2833001 	add	r3, r3, #1
  1c:	e50b3008 	str	r3, [fp, #-8]
  20:	e3a03000 	mov	r3, #0
  24:	e1a00003 	mov	r0, r3
  28:	e28bd000 	add	sp, fp, #0
  2c:	e8bd0800 	pop	{fp}
  30:	e12fff1e 	bx	lr

Each instruction is 32 bits long, so gcc generates ARM instructions by default on this system.

3. Compile for Thumb:
gcc -c -mthumb -msoft-float arm_1.c 
Check the generated code:
pi@raspberrypi ~/code/arm $ objdump -d arm_1.o

arm_1.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <main>:
   0:	b580      	push	{r7, lr}
   2:	b082      	sub	sp, #8
   4:	af00      	add	r7, sp, #0
   6:	2300      	movs	r3, #0
   8:	607b      	str	r3, [r7, #4]
   a:	687b      	ldr	r3, [r7, #4]
   c:	3301      	adds	r3, #1
   e:	607b      	str	r3, [r7, #4]
  10:	2300      	movs	r3, #0
  12:	1c18      	adds	r0, r3, #0
  14:	46bd      	mov	sp, r7
  16:	b002      	add	sp, #8
  18:	bd80      	pop	{r7, pc}
  1a:	46c0      	nop			; (mov r8, r8)

Each instruction is now 16 bits long (the addresses in the leftmost column increase by 2 bytes).
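
As a cross-check that does not depend on reading the disassembly, the compiler's own predefined macros can report the mode. The sketch below assumes GCC, which defines __thumb__ when generating Thumb code and __arm__ on 32-bit ARM targets. The names isa_name and print_isa are made up for this example and do not appear in the original experiment.

```c
#include <stdio.h>

/* Returns the instruction set this file was compiled for.
 * Relies on GCC's predefined macros; isa_name is a hypothetical helper.
 * __thumb__ is tested first because __arm__ is also defined in Thumb mode. */
const char *isa_name(void) {
#if defined(__thumb__)
    return "Thumb";                    /* e.g. built with gcc -mthumb */
#elif defined(__arm__)
    return "ARM";                      /* 32-bit ARM target, ARM mode */
#else
    return "not a 32-bit ARM target";  /* e.g. built on a desktop PC */
#endif
}

/* Prints the result; call this from main(). */
void print_isa(void) {
    printf("compiled as: %s\n", isa_name());
}
```

Building the same file with and without -mthumb should switch between the first two strings, which matches the 16-bit and 32-bit listings above.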

For ARM instructions, can the compiler generate conditionally executed instructions?

1. Write the source file arm_branch.c:
#include <stdio.h>

int f(int a, int b) {
	int t;
	if (a > b)
		t = a - b--;
	if (a == b - 10)
		t = a + b++;
	return t;
}

int main() {
	f(10, 20);
	return 0;
}

2. Check the compiled result:
pi@raspberrypi ~/code/arm $ objdump -d arm_branch.o

arm_branch.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <f>:
   0:	e52db004 	push	{fp}		; (str fp, [sp, #-4]!)
   4:	e28db000 	add	fp, sp, #0
   8:	e24dd014 	sub	sp, sp, #20
   c:	e50b0010 	str	r0, [fp, #-16]
  10:	e50b1014 	str	r1, [fp, #-20]
  14:	e51b2010 	ldr	r2, [fp, #-16]
  18:	e51b3014 	ldr	r3, [fp, #-20]
  1c:	e1520003 	cmp	r2, r3
  20:	da000006 	ble	40 <f+0x40>
  24:	e51b2010 	ldr	r2, [fp, #-16]
  28:	e51b3014 	ldr	r3, [fp, #-20]
  2c:	e0633002 	rsb	r3, r3, r2
  30:	e50b3008 	str	r3, [fp, #-8]
  34:	e51b3014 	ldr	r3, [fp, #-20]
  38:	e2433001 	sub	r3, r3, #1
  3c:	e50b3014 	str	r3, [fp, #-20]
  40:	e51b3014 	ldr	r3, [fp, #-20]
  44:	e243200a 	sub	r2, r3, #10
  48:	e51b3010 	ldr	r3, [fp, #-16]
  4c:	e1520003 	cmp	r2, r3
  50:	1a000006 	bne	70 <f+0x70>
  54:	e51b2014 	ldr	r2, [fp, #-20]
  58:	e51b3010 	ldr	r3, [fp, #-16]
  5c:	e0823003 	add	r3, r2, r3
  60:	e50b3008 	str	r3, [fp, #-8]
  64:	e51b3014 	ldr	r3, [fp, #-20]
  68:	e2833001 	add	r3, r3, #1
  6c:	e50b3014 	str	r3, [fp, #-20]
  70:	e51b3008 	ldr	r3, [fp, #-8]
  74:	e1a00003 	mov	r0, r3
  78:	e28bd000 	add	sp, fp, #0
  7c:	e8bd0800 	pop	{fp}
  80:	e12fff1e 	bx	lr

00000084 <main>:
  84:	e92d4800 	push	{fp, lr}
  88:	e28db004 	add	fp, sp, #4
  8c:	e3a0000a 	mov	r0, #10
  90:	e3a01014 	mov	r1, #20
  94:	ebfffffe 	bl	0 <f>
  98:	e3a03000 	mov	r3, #0
  9c:	e1a00003 	mov	r0, r3
  a0:	e8bd8800 	pop	{fp, pc}

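The output contains conditional branches such as ble and bne, so the generated ARM code does make use of condition codes. The branch is the only conditional form that appears here, however: none of the data-processing instructions carries a condition suffix (for example movgt or addeq). The listing looks unoptimized (every variable is stored to the stack and reloaded), which is probably the reason; an optimizing build may replace such short branches with conditionally executed instructions.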
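
To look for conditionally executed data-processing instructions, it may help to enable optimization and keep the branch bodies tiny. The functions below are a minimal sketch (the names max_int and abs_diff are invented here). Compiled with something like gcc -O2 -marm -c, GCC often turns code of this shape into a cmp followed by instructions with a condition suffix instead of a branch, although the exact output depends on the compiler version and target options and was not verified on the toolchain used in this article.

```c
/* Selects the larger value. The two outcomes differ by a single move,
 * which makes this a candidate for a conditional mov after the cmp. */
int max_int(int a, int b) {
    return (a > b) ? a : b;
}

/* Absolute difference. Each arm of the if is one subtraction, so the
 * compiler may emit both as conditionally executed instructions. */
int abs_diff(int a, int b) {
    if (a > b)
        return a - b;
    return b - a;
}
```

Disassembling the object file with objdump -d, as in the steps above, shows whether the branches were replaced.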


Design a C code scenario and check whether shifted-register addressing is generated.

1. Write the test code (arm_shift.c):
#include <stdio.h>
int main() {
        int t[10];
        int i = 0;
        int j = 0;
        for(j=0; j<4; j++) {
                t[8] = t[i*2];
        }
        printf("%d", t[8]);
        
        return 0;
}

2. Check the compiled result:
pi@raspberrypi ~/code/arm $ objdump -d arm_shift.o

arm_shift.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <main>:
   0:	e92d4800 	push	{fp, lr}
   4:	e28db004 	add	fp, sp, #4
   8:	e24dd030 	sub	sp, sp, #48	; 0x30
   c:	e3a03000 	mov	r3, #0
  10:	e50b300c 	str	r3, [fp, #-12]
  14:	e3a03000 	mov	r3, #0
  18:	e50b3008 	str	r3, [fp, #-8]
  1c:	e3a03000 	mov	r3, #0
  20:	e50b3008 	str	r3, [fp, #-8]
  24:	ea00000b 	b	58 <main+0x58>
  28:	e51b300c 	ldr	r3, [fp, #-12]
  2c:	e1a02083 	lsl	r2, r3, #1
  30:	e3e0302f 	mvn	r3, #47	; 0x2f
  34:	e1a02102 	lsl	r2, r2, #2
  38:	e24b1004 	sub	r1, fp, #4
  3c:	e0812002 	add	r2, r1, r2
  40:	e0823003 	add	r3, r2, r3
  44:	e5933000 	ldr	r3, [r3]
  48:	e50b3014 	str	r3, [fp, #-20]
  4c:	e51b3008 	ldr	r3, [fp, #-8]
  50:	e2833001 	add	r3, r3, #1
  54:	e50b3008 	str	r3, [fp, #-8]
  58:	e51b3008 	ldr	r3, [fp, #-8]
  5c:	e3530003 	cmp	r3, #3
  60:	dafffff0 	ble	28 <main+0x28>
  64:	e59f201c 	ldr	r2, [pc, #28]	; 88 <main+0x88>
  68:	e51b3014 	ldr	r3, [fp, #-20]
  6c:	e1a00002 	mov	r0, r2
  70:	e1a01003 	mov	r1, r3
  74:	ebfffffe 	bl	0 <printf>
  78:	e3a03000 	mov	r3, #0
  7c:	e1a00003 	mov	r0, r3
  80:	e24bd004 	sub	sp, fp, #4
  84:	e8bd8800 	pop	{fp, pc}
  88:	00000000 	.word	0x00000000

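The output contains a left-shift instruction:
lsl	r2, r3, #1
In ARM mode this is an alias of mov r2, r3, lsl #1, that is, a mov whose second operand is a shifted register. The array index is scaled by separate lsl and add instructions and the load itself uses the plain address [r3], so this unoptimized build contains no load with a shifted-register offset such as ldr r3, [r1, r2, lsl #2].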
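
A smaller scenario may make the shifted-register operand easier to provoke. The sketch below uses two hypothetical functions, load_scaled and add_scaled, that are not part of the original test. With optimization enabled (for example gcc -O2 -marm -c), GCC commonly folds the multiplication by 4 into the instruction itself, in forms like ldr r0, [r0, r1, lsl #2] and add r0, r0, r1, lsl #2. That output is an expectation, not something measured on the author's Raspberry Pi.

```c
/* Reads t[i]. The address is t + i * 4, so the index has to be shifted
 * left by 2; an optimizing build can do that inside the ldr itself. */
int load_scaled(const int *t, int i) {
    return t[i];
}

/* Adds b * 4 to a. The shift can become part of the add's second operand. */
int add_scaled(int a, int b) {
    return a + (b << 2);
}
```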


Design a C code scenario and observe how a complex 32-bit number is loaded into a register.

1. Write the test code (arm_big_num.c):
#include <stdio.h>
int main() {
        unsigned int a = 0x12345678;
        a++;
        return 0;
}


2. Compile and check the result:
pi@raspberrypi ~/code/arm $ objdump -d arm_big_num.o

arm_big_num.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <main>:
   0:	e52db004 	push	{fp}		; (str fp, [sp, #-4]!)
   4:	e28db000 	add	fp, sp, #0
   8:	e24dd00c 	sub	sp, sp, #12
   c:	e59f3020 	ldr	r3, [pc, #32]	; 34 <main+0x34>
  10:	e50b3008 	str	r3, [fp, #-8]
  14:	e51b3008 	ldr	r3, [fp, #-8]
  18:	e2833001 	add	r3, r3, #1
  1c:	e50b3008 	str	r3, [fp, #-8]
  20:	e3a03000 	mov	r3, #0
  24:	e1a00003 	mov	r0, r3
  28:	e28bd000 	add	sp, fp, #0
  2c:	e8bd0800 	pop	{fp}
  30:	e12fff1e 	bx	lr
  34:	12345678 	.word	0x12345678

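The compiler stores the large constant in the code section (the .word at offset 0x34, right after the function) and loads it with a PC-relative ldr. This differs from instruction sets such as MIPS, which build a 32-bit constant with lui and ori.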
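
The reason 0x12345678 cannot simply go into a mov is that an ARM-mode data-processing instruction only embeds an 8-bit value rotated right by an even number of bits. The helper below is a small sketch of that rule; the names rotl32 and is_arm_immediate are invented for this example and are not part of any toolchain.

```c
#include <stdint.h>

/* Rotates a 32-bit value left by n bits (n is reduced to 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n) {
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* Returns 1 if v can be written as an 8-bit value rotated right by an
 * even amount, i.e. if it fits the ARM-mode immediate operand format.
 * Undoing the rotation (rotating left) must leave a value <= 0xFF. */
int is_arm_immediate(uint32_t v) {
    unsigned rot;
    for (rot = 0; rot < 32; rot += 2) {
        if (rotl32(v, rot) <= 0xFFu)
            return 1;
    }
    return 0;
}
```

Under this rule 0 and 0xFF000000 are encodable, while 0x12345678 is not, which is consistent with the literal-pool load in the listing above.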

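Write a C program with nested function calls, then observe and analyze:

Where is the return address at the time of the call?

Where are the passed arguments?

How is stack space allocated for local variables?

Are registers saved by the caller or by the callee? Are all of them saved, or only some?

1. Write the test code (arm_multi_call.c):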
#include <stdio.h>
int f2(int a, int b, int c, int x, int y, int z, int p, int q, int r) {
        int temp = 0;
        temp = a*x + b*y + c*z + p*q +r;
        return temp;
}


int f3(int a,int b,int c,int d,int e,int f,int g,int h,int i,int j,int k)  
{  
    int t1=a+b;  
    int t2=c+d;  
  
    int t3=e+f;  
    int t4=g+h;  
    int t5=i+j;  
  
    f2(1,2,3,4,5,6,7,8,9);  
    int t6=t1*t2;  
    int t7=t3*t4;  
    int t8=t6-t7;  
    int t9=t8*t5*k;  
  
    return t9;
}  


int f1(int a, int b, int c) {
        int temp = 0;
        temp = f2(a, b, c, 4, 5, 6, 7, 8, 9);
        return temp;
}


int main() {
        int a,b,c;
        a = 0;
        b = 1;
        c = 2;
        printf("%d", f1(a,b,c));
        f3(11,10,9,8,7,6,5,4,3,2,1);
        return 0;
}

2. Compile and check the result:
pi@raspberrypi ~/code/arm $ gcc -c arm_multi_call.c
pi@raspberrypi ~/code/arm $ objdump -d arm_multi_call.o


arm_multi_call.o:     file format elf32-littlearm




Disassembly of section .text:


00000000 <f2>:
   0:	e52db004 	push	{fp}		; (str fp, [sp, #-4]!)
   4:	e28db000 	add	fp, sp, #0
   8:	e24dd01c 	sub	sp, sp, #28
   c:	e50b0010 	str	r0, [fp, #-16]
  10:	e50b1014 	str	r1, [fp, #-20]
  14:	e50b2018 	str	r2, [fp, #-24]
  18:	e50b301c 	str	r3, [fp, #-28]
  1c:	e3a03000 	mov	r3, #0
  20:	e50b3008 	str	r3, [fp, #-8]
  24:	e51b3010 	ldr	r3, [fp, #-16]
  28:	e51b201c 	ldr	r2, [fp, #-28]
  2c:	e0020392 	mul	r2, r2, r3
  30:	e51b3014 	ldr	r3, [fp, #-20]
  34:	e59b1004 	ldr	r1, [fp, #4]
  38:	e0030391 	mul	r3, r1, r3
  3c:	e0822003 	add	r2, r2, r3
  40:	e59b300c 	ldr	r3, [fp, #12]
  44:	e59b1010 	ldr	r1, [fp, #16]
  48:	e0010391 	mul	r1, r1, r3
  4c:	e51b3018 	ldr	r3, [fp, #-24]
  50:	e59b0008 	ldr	r0, [fp, #8]
  54:	e0030390 	mul	r3, r0, r3
  58:	e0813003 	add	r3, r1, r3
  5c:	e0822003 	add	r2, r2, r3
  60:	e59b3014 	ldr	r3, [fp, #20]
  64:	e0823003 	add	r3, r2, r3
  68:	e50b3008 	str	r3, [fp, #-8]
  6c:	e51b3008 	ldr	r3, [fp, #-8]
  70:	e1a00003 	mov	r0, r3
  74:	e28bd000 	add	sp, fp, #0
  78:	e8bd0800 	pop	{fp}
  7c:	e12fff1e 	bx	lr


00000080 <f3>:
  80:	e92d4800 	push	{fp, lr}
  84:	e28db004 	add	fp, sp, #4
  88:	e24dd050 	sub	sp, sp, #80	; 0x50
  8c:	e50b0030 	str	r0, [fp, #-48]	; 0x30
  90:	e50b1034 	str	r1, [fp, #-52]	; 0x34
  94:	e50b2038 	str	r2, [fp, #-56]	; 0x38
  98:	e50b303c 	str	r3, [fp, #-60]	; 0x3c
  9c:	e51b2030 	ldr	r2, [fp, #-48]	; 0x30
  a0:	e51b3034 	ldr	r3, [fp, #-52]	; 0x34
  a4:	e0823003 	add	r3, r2, r3
  a8:	e50b3008 	str	r3, [fp, #-8]
  ac:	e51b2038 	ldr	r2, [fp, #-56]	; 0x38
  b0:	e51b303c 	ldr	r3, [fp, #-60]	; 0x3c
  b4:	e0823003 	add	r3, r2, r3
  b8:	e50b300c 	str	r3, [fp, #-12]
  bc:	e59b2004 	ldr	r2, [fp, #4]
  c0:	e59b3008 	ldr	r3, [fp, #8]
  c4:	e0823003 	add	r3, r2, r3
  c8:	e50b3010 	str	r3, [fp, #-16]
  cc:	e59b200c 	ldr	r2, [fp, #12]
  d0:	e59b3010 	ldr	r3, [fp, #16]
  d4:	e0823003 	add	r3, r2, r3
  d8:	e50b3014 	str	r3, [fp, #-20]
  dc:	e59b2014 	ldr	r2, [fp, #20]
  e0:	e59b3018 	ldr	r3, [fp, #24]
  e4:	e0823003 	add	r3, r2, r3
  e8:	e50b3018 	str	r3, [fp, #-24]
  ec:	e3a03005 	mov	r3, #5
  f0:	e58d3000 	str	r3, [sp]
  f4:	e3a03006 	mov	r3, #6
  f8:	e58d3004 	str	r3, [sp, #4]
  fc:	e3a03007 	mov	r3, #7
 100:	e58d3008 	str	r3, [sp, #8]
 104:	e3a03008 	mov	r3, #8
 108:	e58d300c 	str	r3, [sp, #12]
 10c:	e3a03009 	mov	r3, #9
 110:	e58d3010 	str	r3, [sp, #16]
 114:	e3a00001 	mov	r0, #1
 118:	e3a01002 	mov	r1, #2
 11c:	e3a02003 	mov	r2, #3
 120:	e3a03004 	mov	r3, #4
 124:	ebfffffe 	bl	0 <f2>
 128:	e51b3008 	ldr	r3, [fp, #-8]
 12c:	e51b200c 	ldr	r2, [fp, #-12]
 130:	e0030392 	mul	r3, r2, r3
 134:	e50b301c 	str	r3, [fp, #-28]
 138:	e51b3010 	ldr	r3, [fp, #-16]
 13c:	e51b2014 	ldr	r2, [fp, #-20]
 140:	e0030392 	mul	r3, r2, r3
 144:	e50b3020 	str	r3, [fp, #-32]
 148:	e51b201c 	ldr	r2, [fp, #-28]
 14c:	e51b3020 	ldr	r3, [fp, #-32]
 150:	e0633002 	rsb	r3, r3, r2
 154:	e50b3024 	str	r3, [fp, #-36]	; 0x24
 158:	e51b3024 	ldr	r3, [fp, #-36]	; 0x24
 15c:	e51b2018 	ldr	r2, [fp, #-24]
 160:	e0030392 	mul	r3, r2, r3
 164:	e59b201c 	ldr	r2, [fp, #28]
 168:	e0030392 	mul	r3, r2, r3
 16c:	e50b3028 	str	r3, [fp, #-40]	; 0x28
 170:	e51b3028 	ldr	r3, [fp, #-40]	; 0x28
 174:	e1a00003 	mov	r0, r3
 178:	e24bd004 	sub	sp, fp, #4
 17c:	e8bd8800 	pop	{fp, pc}


00000180 <f1>:
 180:	e92d4800 	push	{fp, lr}
 184:	e28db004 	add	fp, sp, #4
 188:	e24dd030 	sub	sp, sp, #48	; 0x30
 18c:	e50b0010 	str	r0, [fp, #-16]
 190:	e50b1014 	str	r1, [fp, #-20]
 194:	e50b2018 	str	r2, [fp, #-24]
 198:	e3a03000 	mov	r3, #0
 19c:	e50b3008 	str	r3, [fp, #-8]
 1a0:	e3a03005 	mov	r3, #5
 1a4:	e58d3000 	str	r3, [sp]
 1a8:	e3a03006 	mov	r3, #6
 1ac:	e58d3004 	str	r3, [sp, #4]
 1b0:	e3a03007 	mov	r3, #7
 1b4:	e58d3008 	str	r3, [sp, #8]
 1b8:	e3a03008 	mov	r3, #8
 1bc:	e58d300c 	str	r3, [sp, #12]
 1c0:	e3a03009 	mov	r3, #9
 1c4:	e58d3010 	str	r3, [sp, #16]
 1c8:	e51b0010 	ldr	r0, [fp, #-16]
 1cc:	e51b1014 	ldr	r1, [fp, #-20]
 1d0:	e51b2018 	ldr	r2, [fp, #-24]
 1d4:	e3a03004 	mov	r3, #4
 1d8:	ebfffffe 	bl	0 <f2>
 1dc:	e50b0008 	str	r0, [fp, #-8]
 1e0:	e51b3008 	ldr	r3, [fp, #-8]
 1e4:	e1a00003 	mov	r0, r3
 1e8:	e24bd004 	sub	sp, fp, #4
 1ec:	e8bd8800 	pop	{fp, pc}


000001f0 <main>:
 1f0:	e92d4810 	push	{r4, fp, lr}
 1f4:	e28db008 	add	fp, sp, #8
 1f8:	e24dd034 	sub	sp, sp, #52	; 0x34
 1fc:	e3a03000 	mov	r3, #0
 200:	e50b3010 	str	r3, [fp, #-16]
 204:	e3a03001 	mov	r3, #1
 208:	e50b3014 	str	r3, [fp, #-20]
 20c:	e3a03002 	mov	r3, #2
 210:	e50b3018 	str	r3, [fp, #-24]
 214:	e59f4078 	ldr	r4, [pc, #120]	; 294 <main+0xa4>
 218:	e51b0010 	ldr	r0, [fp, #-16]
 21c:	e51b1014 	ldr	r1, [fp, #-20]
 220:	e51b2018 	ldr	r2, [fp, #-24]
 224:	ebfffffe 	bl	180 <f1>
 228:	e1a03000 	mov	r3, r0
 22c:	e1a00004 	mov	r0, r4
 230:	e1a01003 	mov	r1, r3
 234:	ebfffffe 	bl	0 <printf>
 238:	e3a03007 	mov	r3, #7
 23c:	e58d3000 	str	r3, [sp]
 240:	e3a03006 	mov	r3, #6
 244:	e58d3004 	str	r3, [sp, #4]
 248:	e3a03005 	mov	r3, #5
 24c:	e58d3008 	str	r3, [sp, #8]
 250:	e3a03004 	mov	r3, #4
 254:	e58d300c 	str	r3, [sp, #12]
 258:	e3a03003 	mov	r3, #3
 25c:	e58d3010 	str	r3, [sp, #16]
 260:	e3a03002 	mov	r3, #2
 264:	e58d3014 	str	r3, [sp, #20]
 268:	e3a03001 	mov	r3, #1
 26c:	e58d3018 	str	r3, [sp, #24]
 270:	e3a0000b 	mov	r0, #11
 274:	e3a0100a 	mov	r1, #10
 278:	e3a02009 	mov	r2, #9
 27c:	e3a03008 	mov	r3, #8
 280:	ebfffffe 	bl	80 <f3>
 284:	e3a03000 	mov	r3, #0
 288:	e1a00003 	mov	r0, r3
 28c:	e24bd008 	sub	sp, fp, #8
 290:	e8bd8810 	pop	{r4, fp, pc}
 294:	00000000 	.word	0x00000000



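Observations:
At the call, bl places the return address in the LR register. A function that makes further calls (f1, f3, main) pushes lr on entry and returns by popping it into pc; the leaf function f2 returns with bx lr.
The first four arguments are passed in R0, R1, R2 and R3; the remaining arguments are passed on the stack.
Local variables are allocated by subtracting from sp in the prologue and are addressed at negative offsets from fp. In f2 the local temp is at [fp, #-8], the copies of the register arguments sit below it at [fp, #-16] to [fp, #-28], and the stack-passed arguments are read from the caller's frame at [fp, #4] and above.
R0 to R3 are caller-saved; R4 to R11 are callee-saved. Saving is partial: a function pushes only the registers it actually changes (main pushes r4, fp and lr, while f2 pushes only fp).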
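
The same observations can be reproduced with a much smaller pair of functions, which may be easier to read in the disassembly. This is a sketch with invented names (sum6 and call_sum6). The comments describe what the ARM procedure call standard prescribes, not output captured from the author's machine.

```c
/* Six int arguments: the first four (a, b, c, d) arrive in r0-r3,
 * the last two (e, f) are passed on the stack by the caller. */
int sum6(int a, int b, int c, int d, int e, int f) {
    return a + b + c + d + e + f;
}

/* A non-leaf function: the bl to sum6 overwrites lr, so the compiler
 * has to save lr before the call and restore it when returning. */
int call_sum6(int base) {
    /* base arrives in r0, which sum6 is free to overwrite, so it has to
     * be kept in a stack slot or a callee-saved register across the call. */
    int r = sum6(base, 2, 3, 4, 5, 6);
    return r + base;
}
```

In the disassembly of call_sum6, the stores relative to sp before the bl correspond to the fifth and sixth arguments, in the same way as the str r3, [sp] sequences in f1 and f3 above.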

MLA is multiply with accumulate. Find out how to write a C expression that compiles to an MLA instruction.

1. Write the following functions:
#include <stdio.h>

int f1(int a, int b, int c) {
        return a*b + c;
}

int main() {
        f1(1,2,3);
        return 0;
}

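2. Compile and check the result:
[Image 1: screenshot of the disassembly output, not reproduced here]

The output shows that the mla instruction is used.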
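
The expression a*b + c is the most direct form, but any multiply whose result feeds straight into an addition is a candidate. The sketch below shows two other shapes that an optimizing GCC build (for example gcc -O2 -marm -c) often maps to mla; the function names mac and dot are made up for this example, and the resulting instructions were not checked on the toolchain used here.

```c
#include <stddef.h>

/* Multiply-accumulate written as a compound assignment. */
int mac(int acc, int a, int b) {
    acc += a * b;
    return acc;
}

/* Dot product: every loop iteration is one multiply-accumulate step. */
int dot(const int *a, const int *b, size_t n) {
    int acc = 0;
    size_t i;
    for (i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}
```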
