bus error的解决方法

bus error的解决方法

在x86+Linux上写的程序,在PC机上运行得很好。可是使用ARM的gcc进行交叉编译,再送到DaVinci目标板上运行的时候,出现了Bus error。
出现的位置如下(其中Debug的内容是我在程序中添加的调试信息):
[email protected]:~# arm_v5t_le-gcc -g shit.c
[email protected]:~# ./a.out
Debug: malloc space for the actual data: temp_buf = 0x13118
Debug: in my_recvn()
Debug: nleft = 52
Bus error
打开调试器进行调试:
[email protected]:~# gdb a.out
GNU gdb 6.3 (MontaVista 6.3-20.0.22.0501131 2005-07-22)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "armv5tl-montavista-linuxeabi"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) run // 运行程序
Starting program: /home/zpf/a.out
Debug: in get_program_info()
Debug: in conn_server(char *err_buf_ptr)
Debug: gonna create a socket
Debug: gonna fill in the serv_addr structure
Debug: gonna connect to a server
Debug: gonna send LIN_RST
Debug: in my_sendn()
Debug: send 4 bytes to server: 
Debug: gonna receive LIN_RSP
Debug: in my_recvn()
Debug: nleft = 3
Debug: received first 3 bytes from server: 7
Debug: gonna check if 3rd byte is the package type
Debug: received package length = 55
Debug: malloc space for the actual data: temp_buf = 0x13118
Debug: in my_recvn()
Debug: nleft = 52

Program received signal SIGBUS, Bus error. // 在这里出现了错误
0x00009624 in alloc_prog_mem (detail_buf=0x13118 "\001\002",
err_buf_ptr=0xbefffc40 "") at shit.c:631
631 g_data_ptr->progtype_num = *(short *)ptr ;
(gdb) print ptr // 查看一下ptr的值
$1 = 0x13119 "\002" // 地址起始是奇数!!!
(gdb) set ptr=0x1311a // 想改一下
(gdb) continue
Continuing.

Program terminated with signal SIGBUS, Bus error.
The program no longer exists. // 可惜程序已经退出
(gdb) quit

其中,g_data_ptr->progtype_num是一个short类型的值。
把强制类型转换改为用memcpy()写值之后,再调试
[email protected]:~# gdb test
GNU gdb 6.3 (MontaVista 6.3-20.0.22.0501131 2005-07-22)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "armv5tl-montavista-linuxeabi"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) break 626 // 把刚刚的那句强制类型转换变成内存拷贝
Breakpoint 1 at 0x9630: file test.c, line 626.
(gdb) run
Starting program: /home/zpf/test
Debug: in get_program_info()
Debug: in conn_server(char *err_buf_ptr)
Debug: gonna create a socket
Debug: gonna fill in the serv_addr structure
Debug: gonna connect to a server
Debug: gonna send LIN_RST
Debug: in my_sendn()
Debug: send 4 bytes to server: 
Debug: gonna receive LIN_RSP
Debug: in my_recvn()
Debug: nleft = 3
Debug: received first 3 bytes from server: 7
Debug: gonna check if 3rd byte is the package type
Debug: received package length = 55
Debug: malloc space for the actual data: temp_buf = 0x13118
Debug: in my_recvn()
Debug: nleft = 52

Breakpoint 1, alloc_prog_mem (detail_buf=0x13118 "\001\002",
err_buf_ptr=0xbefffc40 "") at test.c:626
warning: Source file is more recent than executable.

626 memcpy(&(g_data_ptr->prog_num), ptr, 2) ; // 在这一句中断
(gdb) print ptr // 再看看ptr
$1 = 0x1311b "\003" // 还是奇数地址
(gdb) continue // 继续执行
Continuing.
Debug: sum_progtype = 2 , sum_prog = 3
Debug: gonna malloc space for progtype_ptr
Debug: gonna malloc space for prog_ptr
Debug: in mv_pkg2prog_list()
Debug: gonna set ProgramType program_type_name
Debug: ProgramType program_type_name set OK
Debug: in $ == *ptr, j = 0
Debug: g_data_ptr->progtype_ptr[j].prog_ptr = temp_prog_ptr
Debug: in @ == *ptr, ptr = 0x13126
Debug: temp_prog_ptr->format = *ptr
Debug: temp_prog_ptr->program_id = *(int *)ptr
Debug: gonna set Program program_name
Debug: Program program_name set OK
Debug: finished one loop of while
Debug: in @ == *ptr, ptr = 0x1312f
Debug: temp_prog_ptr->format = *ptr
Debug: temp_prog_ptr->program_id = *(int *)ptr
Debug: gonna set Program program_name
Debug: Program program_name set OK
Debug: finished one loop of while
Debug: gonna set ProgramType program_type_name
Debug: ProgramType program_type_name set OK
Debug: in $ == *ptr, j = 1
Debug: g_data_ptr->progtype_ptr[j].prog_ptr = temp_prog_ptr
Debug: in @ == *ptr, ptr = 0x13142
Debug: temp_prog_ptr->format = *ptr
Debug: temp_prog_ptr->program_id = *(int *)ptr
Debug: gonna set Program program_name
Debug: Program program_name set OK
Debug: finished one loop of while
program type[0]
program_type_id = 1
program_type_name = love
program_num = 2
prog_ptr = 0x131d8
program[0]
program_id = 1001
program_name = you
format = 1
program[1]
program_id = 1002
program_name = me
format = 2
program type[1]
program_type_id = 2
program_type_name = hatred
program_num = 1
prog_ptr = 0x13248
program[0]
program_id = 2005
program_name = kill
format = 5
Debug: gonna return an OK
Debug: Entering send_exit_requstion()
Debug: in conn_server(char *err_buf_ptr)
Debug: gonna create a socket
Debug: gonna fill in the serv_addr structure
Debug: gonna connect to a server
Debug: gonna send EXIT_RST
Debug: in my_sendn()
Debug: send 4 bytes to server: 
Debug: in my_recvn()
Debug: nleft = 4
Debug: gonna return an OK

Program exited normally. // 执行通过了!!!!
(gdb)

总结:
问题总算找到了,就是我企图在一个奇数地址起始的地方强制类型转换得到一个short值。
在Intel系列处理器上,可以在任一奇数内存地址储存任何变量或数组,不会导致任何致命的错误影响,只是效率可能会降低。但在DaVinci上,这一点不行。所以必须对大于一个字节的数据类型小心谨慎,比较安全的方法是使用内存拷贝函数memcpy(),或者使用下面的代替方法:
// 先定义一个联合体
union {
short short_val ;
char short_byte[2] ;
} myshort ;
// 然后,把程序中本来应该是
// g_data_ptr->progtype_num = *(short *)ptr ;
// ptr += 2 ;
// 这两句的地方换成下面五句:
myshort.short_byte[0] = *ptr ;
ptr++ ;
myshort.short_byte[1] = *ptr ;
ptr++ ;
g_data_ptr->progtype_num = myshort.short_val ;
// 当然,最简单的方法是换成下面两句:
// memcpy(&(g_data_ptr->progtype_num), ptr, 2) ;
// ptr += 2 ;

对于这个问题的进一步探讨:

在DaVinci上应该注意内存编址模式的问题。
struct {
char struc_char ;
int struc_int ;
short struc_short ;
long struct_long ;
} struc_val ;
在宽松模式下,尽管struc_char只有1个字节,struc_short占2个字节,但编译器可能给这两个变量分别分配了4个字节,结果整个结构的大小变成了16个字节,而在编译器设为紧凑模式时,则正好是11个字节。根据计算机数据总线的位数,不同的编址模式存取数据的速度不一样。我认为在符合总线字长的情况下,效率是最高的,因为只需进行一次总线操作。
内存编址模式会影响字节对齐方式,字节对齐操作可以解决以下两个主要的问题:
1.访存效率问题;一般的编译器要对内存进行对齐,在处理变量时,编译器会根据一定的设置将长短不同的变量的数据长度进行对齐以加快内存处理速度。
2.强制类型转换问题:在x86上,字节不对齐的操作只会影响效率,但是在DaVinci上,可能就是一个Bus error, 因为它要求必须字节对齐。
字节对齐的准则
1.数据类型自身的对齐值:对于char型数据,其自身对齐值为1,对于short型为2,对于int,long,float,double类型,其自身对齐值为4字节。
2.结构体的自身对齐值:其成员中自身对齐值最大的那个值。
3.指定对齐值:#pragma pack (value)时的指定对齐值value。
4.数据成员、结构体和类的有效对齐值:自身对齐值和指定对齐值中小的那个值。
对于平时定义变量,尽可能先定义长度为4的倍数的变量,然后是长度是2的变量,最后是长度为1的变量。

通过测试,GCC编译器是按照4字节对齐存放于内存的。而我还没有发现更改编址模式的参数。程序如下:
#include

int main()
{
struct {
char struc_char ;
int struc_int ;
short struc_short ;
long struct_long ;
} struc_val ;
char c_char ;
int i_int ;
short s_short ;
long l_long ;

printf("sizeof(struc_val) = %d\n", sizeof(struc_val));
printf("sizeof(c_char) = %d\n", sizeof(c_char));
printf("sizeof(i_int) = %d\n", sizeof(i_int));
printf("sizeof(s_short) = %d\n", sizeof(s_short));
printf("sizeof(l_long) = %d\n", sizeof(l_long));

printf("address of struc_val = %p\n", &struc_val);
printf("address of struc_char = %p\n", &(struc_val.struc_char));
printf("address of struc_int = %p\n", &(struc_val.struc_int));
printf("address of struc_short = %p\n", &(struc_val.struc_short));
printf("address of struct_long = %p\n", &(struc_val.struct_long));
printf("address of c_char = %p\n", &c_char);
printf("address of i_int = %p\n", &i_int);
printf("address of s_short = %p\n", &s_short);
printf("address of l_long = %p\n", &l_long);

return 0 ;
}
测试结果:
sizeof(struc_val) = 16
sizeof(c_char) = 1
sizeof(i_int) = 4
sizeof(s_short) = 2
sizeof(l_long) = 4
address of struc_val = 0xbf885278
address of struc_char = 0xbf885278
address of struc_int = 0xbf88527c
address of struc_short = 0xbf885280
address of struct_long = 0xbf885284
address of c_char = 0xbf885277
address of i_int = 0xbf885270
address of s_short = 0xbf88526e
address of l_long = 0xbf885268

所以对于一个32位的数据来讲,如果其没有在4字节整除的内存地址处存放,那么处理器就需要2个总线周期对其进行访问。
0x08 | byte8 | byte9 | byteA | byteB |
0x04 | byte4 | byte5 | byte6 | byte7 |
0x00 | byte0 | byte1 | byte2 | byte3 |
对于我刚刚的那个出现Bus error的程序,假设指针ptr刚好是指向了byte3(地址是0x0),然后想进行short强制类型转换,使用byte3,byte4来构成一个short类型的值,由于第一次总线的数据只有byte0,byte1,byte2,byte3,取不到byte4,这在DaVinci板子上,就是一个Bus error了,因为没有达到边界对齐。如果ptr指的是byte2(地址0x02),就没有问题了。因为0x02地址值是sizeof(short)的整数倍。

你可能感兴趣的:(bus error的解决方法)