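For a long time I have wanted to work on protecting (encrypting) my own app. I have read many articles by experts and know there are plenty of pitfalls here that have to be worked through one by one. Consider this post a beginner's diary.
As we know, an apk is really just a zip archive; unpacking it yields the dex file.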
https://source.android.com/devices/tech/dalvik/dex-format.html
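Everything here comes from the official documentation, but that documentation is not easy to follow, so I wrote this diary as a reading aid. Experts can skip it.
Type | Description |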
---|---|
byte | 8-bit signed int |
ubyte | 8-bit unsigned int |
short | 16-bit signed int, little-endian |
ushort | 16-bit unsigned int, little-endian |
int | 32-bit signed int, little-endian |
uint | 32-bit unsigned int, little-endian |
long | 64-bit signed int, little-endian |
ulong | 64-bit unsigned int, little-endian |
sleb128 | signed LEB128, variable-length (see below) |
uleb128 | unsigned LEB128, variable-length (see below) |
uleb128p1 | unsigned LEB128 plus 1, variable-length (see below) |
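The pattern in the table above: a type whose name starts with u is unsigned. For example, byte occupies 8 bits and is signed, so its range is -128 to 127, while ubyte ranges from 0 to 255.
Converting a byte to a ubyte (Java has no unsigned byte type, so the result is held in an int):
byte a = (byte) 0x80; // -128 as a signed byte
int unsigned = a & 0xff; // the corresponding ubyte value, 128
Instead of always spending 4 bytes on a 32-bit value, the dex format encodes many of them as LEB128, which saves space in most cases because most values are small.
Each LEB128 value consists of 1 to 5 bytes, which together represent a single 32-bit value.
The encoding rule: the most significant bit of each byte says whether another byte follows (1 = more bytes, 0 = last byte), and the remaining 7 bits carry the payload.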
Bitwise layout of a two-byte LEB128 value:
Byte | Bit 7 (continuation flag) | Bits 6 to 0 (payload) |
---|---|---|
First byte | 1 | bit6 bit5 bit4 bit3 bit2 bit1 bit0 |
Second byte | 0 | bit13 bit12 bit11 bit10 bit9 bit8 bit7 |
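The higher-order bytes come later in the file.
So the formula is:
(byte1 & 0x7f) + ((byte2 & 0x7f) << 7) + ((byte3 & 0x7f) << 14) + ((byte4 & 0x7f) << 21) + ((byte5 & 0x7f) << 28)
This is the formula for uleb128, which is the variant most dex fields use.
Some conversion examples:
Encoded bytes (hex) | As sleb128 | As uleb128 | As uleb128p1 |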
---|---|---|---|
00 | 0 | 0 | -1 |
01 | 1 | 1 | 0 |
7f | -1 | 127 | 126 |
80 7f | -128 | 16256 | 16255 |
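Code that decodes a uleb128 and returns the value as a hex string:
public static String getUleb128Int(byte[] byteAry) {
    int result = 0;
    for (int index = 0; index < byteAry.length && index < 5; index++) {
        int b = byteAry[index] & 0xff; // Java bytes are signed, so convert to an unsigned value first
        result |= (b & 0x7f) << (7 * index); // & 0x7f keeps the low 7 payload bits
        if ((b & 0x80) == 0) {
            break; // high bit clear: this was the last byte of the value
        }
    }
    return Integer.toHexString(result);
}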
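The sleb128 and uleb128p1 columns of the example table can be reproduced with a small sketch. The class and method names here (Leb128Sketch, readUleb128, readUleb128p1, readSleb128) are made up for illustration, and the code assumes the value starts at index 0 of the array and does no validation.

```java
final class Leb128Sketch {
    /** Decodes an unsigned LEB128 value starting at data[0]. */
    static int readUleb128(byte[] data) {
        int result = 0;
        for (int i = 0; i < data.length && i < 5; i++) {
            int b = data[i] & 0xff;
            result |= (b & 0x7f) << (7 * i);
            if ((b & 0x80) == 0) {
                break; // continuation bit clear: last byte
            }
        }
        return result;
    }

    /** uleb128p1 stores (value + 1), so -1 fits into the single byte 00. */
    static int readUleb128p1(byte[] data) {
        return readUleb128(data) - 1;
    }

    /** Decodes a signed LEB128 value: same payload bits, then sign-extend. */
    static int readSleb128(byte[] data) {
        int result = 0;
        int shift = 0;
        for (int i = 0; i < data.length && i < 5; i++) {
            int b = data[i] & 0xff;
            result |= (b & 0x7f) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                // Bit 6 of the last byte is the sign bit of the payload.
                if (shift < 32 && (b & 0x40) != 0) {
                    result |= -1 << shift;
                }
                break;
            }
        }
        return result;
    }
}
```

The only difference between the unsigned and signed decoders is the final sign extension, which is why 7f reads as 127 in one column and -1 in the other.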
File layout
Name | Format | Description | Notes |
---|---|---|---|
header | header_item | the header | the file header: file verification data plus an index of the file's contents |
string_ids | string_id_item[] | string identifiers list. These are identifiers for all the strings used by this file, either for internal naming (e.g., type descriptors) or as constant objects referred to by code. This list must be sorted by string contents, using UTF-16 code point values (not in a locale-sensitive manner), and it must not contain any duplicate entries. | all strings used in the dex file |
type_ids | type_id_item[] | type identifiers list. These are identifiers for all types (classes, arrays, or primitive types) referred to by this file, whether defined in the file or not. This list must be sorted by string_id index, and it must not contain any duplicate entries. | all types in the dex file, including classes, arrays, and primitive types |
proto_ids | proto_id_item[] | method prototype identifiers list. These are identifiers for all prototypes referred to by this file. This list must be sorted in return-type (by type_id index) major order, and then by argument list (lexicographic ordering, individual arguments ordered by type_id index). The list must not contain any duplicate entries. | method declarations made up of a return type and a parameter list, without the method name; the declarations use the short form |
field_ids | field_id_item[] | field identifiers list. These are identifiers for all fields referred to by this file, whether defined in the file or not. This list must be sorted, where the defining type (by type_id index) is the major order, field name (by string_id index) is the intermediate order, and type (by type_id index) is the minor order. The list must not contain any duplicate entries. | fields |
method_ids | method_id_item[] | method identifiers list. These are identifiers for all methods referred to by this file, whether defined in the file or not. This list must be sorted, where the defining type (by type_id index) is the major order, method name (by string_id index) is the intermediate order, and method prototype (by proto_id index) is the minor order. The list must not contain any duplicate entries. | methods |
class_defs | class_def_item[] | class definitions list. The classes must be ordered such that a given class’s superclass and implemented interfaces appear in the list earlier than the referring class. Furthermore, it is invalid for a definition for the same-named class to appear more than once in the list. | class definitions |
data | ubyte[] | data area, containing all the support data for the tables listed above. Different items have different alignment requirements, and padding bytes are inserted before each item if necessary to achieve proper alignment. | data area |
link_data | ubyte[] | data used in statically linked files. The format of the data in this section is left unspecified by this document. This section is empty in unlinked files, and runtime implementations may use it as they see fit. | link data area |
header_item format
Name | Format | Description | Notes |
---|---|---|---|
magic | ubyte[8] = DEX_FILE_MAGIC | magic value. | |
checksum | uint | adler32 checksum of the rest of the file (everything but magic and this field); used to detect file corruption | |
signature | ubyte[20] | SHA-1 signature (hash) of the rest of the file (everything but magic, checksum, and this field); used to uniquely identify files | |
file_size | uint | size of the entire file (including the header), in bytes | |
header_size | uint = 0x70 | size of the header (this entire section), in bytes. This allows for at least a limited amount of backwards/forwards compatibility without invalidating the format. | usually a fixed value |
endian_tag | uint = ENDIAN_CONSTANT | endianness tag. | fixed value |
link_size | uint | size of the link section, or 0 if this file isn’t statically linked | |
link_off | uint | offset from the start of the file to the link section, or 0 if link_size == 0. The offset, if non-zero, should be to an offset into the link_data section. The format of the data pointed at is left unspecified by this document; this header field (and the previous) are left as hooks for use by runtime implementations. | |
map_off | uint | offset from the start of the file to the map item. The offset, which must be non-zero, should be to an offset into the data section, and the data should be in the format specified by “map_list” below. | |
string_ids_size | uint | count of strings in the string identifiers list | number of strings |
string_ids_off | uint | offset from the start of the file to the string identifiers list, or 0 if string_ids_size == 0 (admittedly a strange edge case). The offset, if non-zero, should be to the start of the string_ids section. | offset of the string identifiers list within the dex file; every field ending in _off is an offset |
type_ids_size | uint | count of elements in the type identifiers list, at most 65535 | number of types, at most 65535 |
type_ids_off | uint | offset from the start of the file to the type identifiers list, or 0 if type_ids_size == 0 (admittedly a strange edge case). The offset, if non-zero, should be to the start of the type_ids section. | |
proto_ids_size | uint | count of elements in the prototype identifiers list, at most 65535 | number of method prototypes (not the number of methods) |
proto_ids_off | uint | offset from the start of the file to the prototype identifiers list, or 0 if proto_ids_size == 0 (admittedly a strange edge case). The offset, if non-zero, should be to the start of the proto_ids section. | offset of the prototype list within the dex file |
field_ids_size | uint | count of elements in the field identifiers list | |
field_ids_off | uint | offset from the start of the file to the field identifiers list, or 0 if field_ids_size == 0. The offset, if non-zero, should be to the start of the field_ids section. | |
method_ids_size | uint | count of elements in the method identifiers list | |
method_ids_off | uint | offset from the start of the file to the method identifiers list, or 0 if method_ids_size == 0. The offset, if non-zero, should be to the start of the method_ids section. | |
class_defs_size | uint | count of elements in the class definitions list | |
class_defs_off | uint | offset from the start of the file to the class definitions list, or 0 if class_defs_size == 0 (admittedly a strange edge case). The offset, if non-zero, should be to the start of the class_defs section. | |
data_size | uint | Size of data section in bytes. Must be an even multiple of sizeof(uint). | |
data_off | uint | offset from the start of the file to the start of the data section. |
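The screenshot below shows the header of a dex file opened in 010 Editor.
The analysis is in the table below. (Note that multi-byte values in a dex file, such as ushort, uint, and ulong, are stored little-endian, so their bytes appear reversed in a hex dump. For example, a ushort stored in the file as the bytes 87 56 has the actual value 0x5687. Keep this in mind or you will misread the file. It is the same idea as uleb128 putting its low-order bits first, as described above. See the table for details.)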
Header analysis
Field | Format | Value | Converted value | Start offset (hex) | Length (hex) | Notes |
---|---|---|---|---|---|---|
magic | ubyte[8] | 0x6465 780A 3033 3500 | dex\n035 | 0 | 8 | |
checksum | uint | 0x753F 7C5B | | 8 | 4 | |
signature | ubyte[20] | 0xD67CB4279A72DB6F68C3CB6DC51A86276215B47B | | C | 14 | |
file_size | uint | 0x0000 0A14 | 2580 | 20 | 4 | |
header_size | uint | 0x0000 0070 | | 24 | 4 | |
endian_tag | uint | 0x7856 3412 | | | | shown as the raw file bytes 78 56 34 12, not reversed like the other values; read as a little-endian uint this is 0x12345678 (ENDIAN_CONSTANT), see the endian_tag notes further down |
link_size | uint | 0x0000 0000 | | | | |
link_off | uint | 0x0000 0000 | | | | 0 means there is no link data |
map_off | uint | 0x0000 0944 | | | | the map starts at offset 0x0944 in the dex file |
string_ids_size | uint | 0x0000 0032 | 50 | | | there are 50 strings |
string_ids_off | uint | 0x0000 0070 | | | | usually fixed as well: the header is always 0x70 bytes long and the string_ids list follows it immediately, so this offset is 0x70 |
type_ids_size | uint | 0x14 | | | | |
type_ids_off | uint | 0x138 | | | | |
proto_ids_size | uint | 0x4 | | | | |
proto_ids_off | uint | 0x188 | | | | |
field_ids_size | uint | 0x9 | | | | |
field_ids_off | uint | 0x1B8 | | | | |
method_ids_size | uint | 0xF | | | | |
method_ids_off | uint | 0x0200 | | | | |
class_defs_size | uint | 0x8 | | | | |
class_defs_off | uint | 0x278 | | | | |
data_size | uint | 0x69C | | | | |
data_off | uint | 0x378 | | | | |
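In the table above, a _size field gives the number of items of that kind (file_size, header_size, link_size, and data_size are byte counts instead), and an _off field gives where that data starts in the dex file.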
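Reading these header fields by hand is error-prone because of the byte order, so here is a minimal sketch that lets java.nio.ByteBuffer do the little-endian decoding. The class name DexHeaderSketch and its constants are hypothetical; the offsets are simply the running sum of the field sizes in the header_item table (8 + 4 + 20 bytes before file_size, and so on).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DexHeaderSketch {
    // Byte offsets of a few header_item fields, derived from the field sizes.
    static final int FILE_SIZE_OFF = 0x20;
    static final int HEADER_SIZE_OFF = 0x24;
    static final int ENDIAN_TAG_OFF = 0x28;
    static final int MAP_OFF_OFF = 0x34;
    static final int STRING_IDS_SIZE_OFF = 0x38;
    static final int STRING_IDS_OFF_OFF = 0x3C;

    /** Reads a little-endian uint; the result is widened to long so it stays non-negative. */
    static long readUint(byte[] dex, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(dex).order(ByteOrder.LITTLE_ENDIAN);
        return buf.getInt(offset) & 0xffffffffL;
    }

    /** Reads a little-endian ushort as a non-negative int. */
    static int readUshort(byte[] dex, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(dex).order(ByteOrder.LITTLE_ENDIAN);
        return buf.getShort(offset) & 0xffff;
    }
}
```

With the file loaded into a byte array (for example via java.nio.file.Files.readAllBytes), `DexHeaderSketch.readUint(dex, DexHeaderSketch.MAP_OFF_OFF)` would return the map offset without any manual byte swapping.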
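This magic number is the magic field of the header. It marks the file as a dex file and carries the format version (the dex format version, not the Android build version). In this file the version is 035.
{ 0x64 0x65 0x78 0x0a 0x30 0x33 0x35 0x00 } = "dex\n035\0"
It is a constant and sits at the very beginning of the dex file.
Version 037 corresponds to Android 7.0. Version 036 was skipped, so a dex file that claims version 036 is invalid, which strongly suggests the file has been tampered with.
checksum is the Adler-32 checksum (a checksum, not an encryption algorithm) of everything in the dex file except magic and the 4 bytes of checksum itself.
signature is the SHA-1 hash of everything in the file except magic, checksum, and signature itself.
Both are used to validate the file: checksum detects corruption, and signature uniquely identifies the file.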
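Both values can be recomputed with the standard library, which is a quick way to tell whether a dex file was modified without its header being fixed up. This is a sketch under the assumption that the whole file is already in a byte array; DexIntegritySketch and its method names are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.zip.Adler32;

final class DexIntegritySketch {
    /** Adler-32 over everything after magic (8 bytes) and checksum (4 bytes). */
    static long computeChecksum(byte[] dex) {
        Adler32 adler = new Adler32();
        adler.update(dex, 12, dex.length - 12);
        return adler.getValue();
    }

    /** SHA-1 over everything after magic, checksum, and signature (8 + 4 + 20 bytes). */
    static byte[] computeSignature(byte[] dex) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(dex, 32, dex.length - 32);
            return sha1.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** Compares the stored header values against freshly computed ones. */
    static boolean isIntact(byte[] dex) {
        long storedChecksum =
                ByteBuffer.wrap(dex).order(ByteOrder.LITTLE_ENDIAN).getInt(8) & 0xffffffffL;
        byte[] storedSignature = Arrays.copyOfRange(dex, 12, 32);
        return storedChecksum == computeChecksum(dex)
                && Arrays.equals(storedSignature, computeSignature(dex));
    }
}
```

Because the checksum covers the signature bytes, anyone patching a dex file has to rewrite the signature first and the checksum second.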
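file_size: the size of the dex file, in bytes.
header_size: the size of the header, normally the fixed value 0x70.
endian_tag: indicates whether the file's bytes have been swapped. There are two possible values: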
uint ENDIAN_CONSTANT = 0x12345678;
uint REVERSE_ENDIAN_CONSTANT = 0x78563412;
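If endian_tag reads as ENDIAN_CONSTANT, the file is in the standard little-endian layout. This is the usual case, and it is why the bytes appear in the file as 78 56 34 12. If it reads as REVERSE_ENDIAN_CONSTANT, the file has been byte-swapped.
link_size and link_off: the size and offset of the link data.
map_off: the offset of the map.
The map lists every kind of item that appears in the dex file, including header_item, string_id_item, and so on. According to the official documentation, the entries must be ordered by their offset in the file.
The map format is as follows: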
Name | Format | Description |
---|---|---|
size | uint | size of the list, in entries |
list | map_item[size] | elements of the list |
map_item format
Name | Format | Description | Notes |
---|---|---|---|
type | ushort | type of the items; | the type, see the table below |
unused | ushort | (unused) | not used |
size | uint | count of the number of items to be found at the indicated offset | item count |
offset | uint | offset from the start of the file to the items in question | file offset |
Values of the map_item type field
Item Type | Constant | Value | Item Size In Bytes |
---|---|---|---|
header_item | TYPE_HEADER_ITEM | 0x0000 | 0x70 |
string_id_item | TYPE_STRING_ID_ITEM | 0x0001 | 0x04 |
type_id_item | TYPE_TYPE_ID_ITEM | 0x0002 | 0x04 |
proto_id_item | TYPE_PROTO_ID_ITEM | 0x0003 | 0x0c |
field_id_item | TYPE_FIELD_ID_ITEM | 0x0004 | 0x08 |
method_id_item | TYPE_METHOD_ID_ITEM | 0x0005 | 0x08 |
class_def_item | TYPE_CLASS_DEF_ITEM | 0x0006 | 0x20 |
map_list | TYPE_MAP_LIST | 0x1000 | 4 + (item.size * 12) |
type_list | TYPE_TYPE_LIST | 0x1001 | 4 + (item.size * 2) |
annotation_set_ref_list | TYPE_ANNOTATION_SET_REF_LIST | 0x1002 | 4 + (item.size * 4) |
annotation_set_item | TYPE_ANNOTATION_SET_ITEM | 0x1003 | 4 + (item.size * 4) |
class_data_item | TYPE_CLASS_DATA_ITEM | 0x2000 | implicit; must parse |
code_item | TYPE_CODE_ITEM | 0x2001 | implicit; must parse |
string_data_item | TYPE_STRING_DATA_ITEM | 0x2002 | implicit; must parse |
debug_info_item | TYPE_DEBUG_INFO_ITEM | 0x2003 | implicit; must parse |
annotation_item | TYPE_ANNOTATION_ITEM | 0x2004 | implicit; must parse |
encoded_array_item | TYPE_ENCODED_ARRAY_ITEM | 0x2005 | implicit; must parse |
annotations_directory_item | TYPE_ANNOTATIONS_DIRECTORY_ITEM | 0x2006 | implicit; must parse |
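Each map_item is ushort + ushort + uint + uint, that is 12 bytes.
Add the 4 bytes of the map size field,
and the map data is 4 + size * 12 bytes long,
which matches the map_list row in the type table.
From the header analysis we know map_off is 0x0944, so go to that position in the dex file (the map_list data usually sits at the end of the dex file); see the screenshot below.
The map has 0x11 (17) entries, so the map ends at 0x944 + 4 + 0x11 * 12 = 0xA14 (the 12 is decimal), which is exactly the end of the file.
The 010 Editor analysis result is shown in the figure below.
A brief explanation:
Take the last entry of the map as an example; it can be read from the last 12 bytes of the file.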
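Field | Raw bytes (file order) | Actual value | Notes |
---|---|---|---|
type | 00 10 | 0x1000 | looking up 0x1000 in the type table gives map_list |
unused | 00 00 | | |
size | 01 00 00 00 | 0x0000 0001 | there is exactly one map_list |
offset | 44 09 00 00 | 0x0944 | the start of the map itself |
In fact, part of the data in map_list duplicates what is already in the header.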
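The map walk above can be sketched in a few lines. This is only an illustration under the assumption that the file is in a byte array; MapListSketch, MapItem, readMapList, and mapEnd are made-up names, and no bounds checking is done.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MapListSketch {
    /** One map_item: type (ushort), unused (ushort), size (uint), offset (uint) = 12 bytes. */
    static final class MapItem {
        final int type;
        final long size;
        final long offset;

        MapItem(int type, long size, long offset) {
            this.type = type;
            this.size = size;
            this.offset = offset;
        }
    }

    /** Reads the map_list that starts at mapOff: a uint count followed by count map_items. */
    static MapItem[] readMapList(byte[] dex, int mapOff) {
        ByteBuffer buf = ByteBuffer.wrap(dex).order(ByteOrder.LITTLE_ENDIAN);
        int count = buf.getInt(mapOff);
        MapItem[] items = new MapItem[count];
        for (int i = 0; i < count; i++) {
            int pos = mapOff + 4 + i * 12;
            int type = buf.getShort(pos) & 0xffff;          // skip 2 unused bytes after this
            long size = buf.getInt(pos + 4) & 0xffffffffL;
            long offset = buf.getInt(pos + 8) & 0xffffffffL;
            items[i] = new MapItem(type, size, offset);
        }
        return items;
    }

    /** End offset of a map_list: 4 bytes for the count plus 12 bytes per entry. */
    static int mapEnd(int mapOff, int count) {
        return mapOff + 4 + count * 12;
    }
}
```

For the file analysed here, `mapEnd(0x944, 0x11)` gives 0xA14, the file size from the header.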
string_id_items
string_ids_size and string_ids_off in the header tell us where the string_id table is.
string_id_item format
Name | Format | Description |
---|---|---|
string_data_off | uint | offset from the start of the file to the string data for this item. The offset should be to a location in the data section, and the data should be in the format specified by “string_data_item” below. There is no alignment requirement for the offset. |
Each item in the string_id table is a single uint: the offset of that string within the dex file, pointing to a string_data_item.
string_data_item format
Name | Format | Description |
---|---|---|
utf16_size | uleb128 | size of this string, in UTF-16 code units (which is the “string length” in many systems). That is, this is the decoded length of the string. (The encoded length is implied by the position of the 0 byte.) |
data | ubyte[] | a series of MUTF-8 code units (a.k.a. octets, a.k.a. bytes) followed by a byte of value 0. See “MUTF-8 (Modified UTF-8) Encoding” above for details and discussion about the data format. |
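The string_data_item format is also simple: a length followed by the data.
Note that the length is encoded as uleb128. The data uses MUTF-8, which is very similar to UTF-8 except for the following four points: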
1.Only the one-, two-, and three-byte encodings are used.
2.Code points in the range U+10000 … U+10ffff are encoded as a surrogate pair, each of which is represented as a three-byte encoded value.
3.The code point U+0000 is encoded in two-byte form.
4.A plain null byte (value 0) indicates the end of a string, as is the standard C language interpretation.
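In short, the data is followed by one extra null byte that the length does not count. Also note that the length is measured in UTF-16 code units, not in bytes.
Let's look at the first entry in the string id table:
Its value is 0x0000 053A, so the offset is 0x053A. Looking at that position in the dex file:
06: the length is 6
3C 69 6E 69 74 3E: the string data, which decodes to <init>
00: the terminator
I won't go through the other strings one by one. Note that some of them look rather odd, for example:
02 56 4C 00
which decodes to the string VL.
This is actually the short-form (shorty) descriptor of a method: the return type is void and there is one parameter of a reference type (in the short form, L stands for any reference type, not only Object).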
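The byte-by-byte reading above can be turned into a small decoder. This is a sketch, not a validating parser: it assumes well-formed input, and StringDataSketch and readStringData are hypothetical names. Each one-, two-, or three-byte MUTF-8 sequence yields exactly one UTF-16 code unit, which is why the uleb128 length can drive the loop.

```java
final class StringDataSketch {
    /** Reads a string_data_item at dataOff: uleb128 utf16_size, MUTF-8 bytes, then a 0 byte. */
    static String readStringData(byte[] dex, int dataOff) {
        int pos = dataOff;
        int utf16Size = 0;
        int shift = 0;
        while (true) { // decode the uleb128 length
            int b = dex[pos++] & 0xff;
            utf16Size |= (b & 0x7f) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break;
            }
        }
        char[] chars = new char[utf16Size];
        for (int i = 0; i < utf16Size; i++) {
            int a = dex[pos++] & 0xff;
            if (a < 0x80) {
                chars[i] = (char) a; // one-byte form
            } else if ((a & 0xe0) == 0xc0) {
                int b = dex[pos++] & 0x3f; // two-byte form (also used for U+0000)
                chars[i] = (char) (((a & 0x1f) << 6) | b);
            } else {
                int b = dex[pos++] & 0x3f; // three-byte form
                int c = dex[pos++] & 0x3f;
                chars[i] = (char) (((a & 0x0f) << 12) | (b << 6) | c);
            }
        }
        return new String(chars);
    }
}
```

The trailing 0 byte is never read here because the length already says where the string ends; a stricter reader could check that it is present.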
The table below maps each descriptor character to its type
Symbol | Meaning |
---|---|
V | void; only valid for return types |
Z | boolean |
B | byte |
S | short |
C | char |
I | int |
J | long |
F | float |
D | double |
Lfully/qualified/Name; | the class fully.qualified.Name. An ordinary class is written as L + the fully qualified name + ; |
[descriptor | array of descriptor, usable recursively for arrays-of-arrays, though it is invalid to have more than 255 dimensions. Arrays; a multi-dimensional array has one [ per dimension |
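To make the descriptor table concrete, here is a small sketch that turns a type descriptor into the Java-style name a reader would expect. DescriptorSketch and toJavaName are names invented for this example, and the code assumes a well-formed descriptor.

```java
final class DescriptorSketch {
    /** Converts a type descriptor such as "[[I" or "Ljava/lang/String;" to a readable name. */
    static String toJavaName(String descriptor) {
        int dims = 0;
        while (descriptor.charAt(dims) == '[') {
            dims++; // one '[' per array dimension
        }
        String element = descriptor.substring(dims);
        String name;
        switch (element.charAt(0)) {
            case 'V': name = "void"; break;
            case 'Z': name = "boolean"; break;
            case 'B': name = "byte"; break;
            case 'S': name = "short"; break;
            case 'C': name = "char"; break;
            case 'I': name = "int"; break;
            case 'J': name = "long"; break;
            case 'F': name = "float"; break;
            case 'D': name = "double"; break;
            case 'L': // strip the leading 'L' and trailing ';', then use dots
                name = element.substring(1, element.length() - 1).replace('/', '.');
                break;
            default:
                throw new IllegalArgumentException("unknown descriptor: " + descriptor);
        }
        StringBuilder sb = new StringBuilder(name);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }
}
```

Note that this handles full type descriptors; the shorty form seen earlier (such as VL) uses a bare L with no class name, so it cannot be expanded this way.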
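type_ids lists the types: the eight primitive types (plus void) and object types.
The header shows there are 20 type_ids (20 * 4 = 0x50 bytes) at offset 0x0138,
so the type_id list occupies 0x0138 to 0x0188 (= 0x0138 + 0x50) in the dex file.
Each entry in type_ids is an index into string_ids and represents one type.
proto_ids mainly holds method prototypes.
First, the structure of proto_id_item: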
Name | Format | Description |
---|---|---|
shorty_idx | uint | index into the string_ids list for the short-form descriptor string of this prototype. The string must conform to the syntax for ShortyDescriptor, defined above, and must correspond to the return type and parameters of this item. This is the short-form method declaration, pointing into string_ids. |
return_type_idx | uint | index into the type_ids list for the return type of this prototype. This is the return type, pointing into the type_ids table. |
parameters_off | uint | offset from the start of the file to the list of parameter types for this prototype, or 0 if this prototype has no parameters. This offset, if non-zero, should be in the data section, and the data there should be in the format specified by “type_list” below. Additionally, there should be no reference to the type void in the list. This is the offset of the parameter list data within the dex file (every field with the _off suffix is an offset within the dex file). |
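A concrete example:
Here we look at the second proto_id (because the first one has no parameter list).
Field | Value | Notes |
---|---|---|
shorty_idx | 0x1B | index 27 into the string_ids table; the string there is VI, meaning the return type is void and there is one parameter of type I, that is int |
return_type_idx | 0x12 | index 18 into the type_ids table, which turns out to be void |
parameters_off | 0x0524 | the parameter list starts at offset 0x0524 |
The table below is the type_list format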
Name | Format | Description |
---|---|---|
size | uint | size of the list, in entries |
list | type_item[size] | elements of the list |
type_item format
Name | Format | Description |
---|---|---|
type_idx | ushort | index into the type_ids list |
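parameters is a type_list,
so starting at 0x0524 we can read this method's parameters.
From the figure above and the type_list format table, the parameter list has one entry, which points to index 0 of the type_ids table, that is the first entry, which turns out to be int.
That settles both the return type and the parameters of this method.
One thing seems odd: shorty_idx already describes the return type and the parameters, so why record the return type and parameter list again separately? The likely reason is that the short form is lossy: every reference type collapses to L, so the full types have to be stored separately.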
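Reading the parameter list of a prototype comes down to reading one type_list. A minimal sketch, assuming the file is in a byte array; TypeListSketch and readTypeList are made-up names, and the returned values are indexes into type_ids that still need to be resolved to descriptors.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class TypeListSketch {
    /** Reads the type_list at off: a uint size followed by size ushort type_idx values. */
    static int[] readTypeList(byte[] dex, int off) {
        if (off == 0) {
            return new int[0]; // parameters_off == 0 means the prototype has no parameters
        }
        ByteBuffer buf = ByteBuffer.wrap(dex).order(ByteOrder.LITTLE_ENDIAN);
        int size = buf.getInt(off);
        int[] typeIdx = new int[size];
        for (int i = 0; i < size; i++) {
            typeIdx[i] = buf.getShort(off + 4 + i * 2) & 0xffff;
        }
        return typeIdx;
    }
}
```

For the prototype analysed above, calling this with offset 0x0524 would be expected to return a single index, 0, which is the int entry in type_ids.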
The field id list; not explained in detail.
field_id_item: appears in the field_ids section
Name | Format | Description |
---|---|---|
class_idx | ushort | index into the type_ids list for the definer of this field. This must be a class type, and not an array or primitive type. |
type_idx | ushort | index into the type_ids list for the type of this field |
name_idx | uint | index into the string_ids list for the name of this field. The string must conform to the syntax for MemberName, defined above. |
method_id_item: appears in the method_ids section
Name | Format | Description |
---|---|---|
class_idx | ushort | index into the type_ids list for the definer of this method. This must be a class or array type, and not a primitive type. |
proto_idx | ushort | index into the proto_ids list for the prototype of this method |
name_idx | uint | index into the string_ids list for the name of this method. The string must conform to the syntax for MemberName, defined above. |
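This may look a bit odd at this point: fields and methods both have access flags, that is modifiers such as private, protected, final, and static. None of these are stored in field_ids or method_ids; they live in the class data that each class_def points to.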
class_def_item
appears in the class_defs section
Space is limited, so there is no detailed explanation here; only the format table is kept.
Name | Format | Description |
---|---|---|
class_idx | uint | index into the type_ids list for this class. This must be a class type, and not an array or primitive type. |
access_flags | uint | access flags for the class (public, final, etc.). See “access_flags Definitions” for details. |
superclass_idx | uint | index into the type_ids list for the superclass, or the constant value NO_INDEX if this class has no superclass (i.e., it is a root class such as Object). If present, this must be a class type, and not an array or primitive type. |
interfaces_off | uint | offset from the start of the file to the list of interfaces, or 0 if there are none. This offset should be in the data section, and the data there should be in the format specified by “type_list” below. Each of the elements of the list must be a class type (not an array or primitive type), and there must not be any duplicates. |
source_file_idx | uint | index into the string_ids list for the name of the file containing the original source for (at least most of) this class, or the special value NO_INDEX to represent a lack of this information. The debug_info_item of any given method may override this source file, but the expectation is that most classes will only come from one source file. |
annotations_off | uint | offset from the start of the file to the annotations structure for this class, or 0 if there are no annotations on this class. This offset, if non-zero, should be in the data section, and the data there should be in the format specified by “annotations_directory_item” below, with all items referring to this class as the definer. |
class_data_off | uint | offset from the start of the file to the associated class data for this item, or 0 if there is no class data for this class. (This may be the case, for example, if this class is a marker interface.) The offset, if non-zero, should be in the data section, and the data there should be in the format specified by “class_data_item” below, with all items referring to this class as the definer. |
static_values_off | uint | offset from the start of the file to the list of initial values for static fields, or 0 if there are none (and all static fields are to be initialized with 0 or null). This offset should be in the data section, and the data there should be in the format specified by “encoded_array_item” below. The size of the array must be no larger than the number of static fields declared by this class, and the elements correspond to the static fields in the same order as declared in the corresponding field_list. The type of each array element must match the declared type of its corresponding field. If there are fewer elements in the array than there are static fields, then the leftover fields are initialized with a type-appropriate 0 or null. |
class_data_off: the data of the class. Its methods and fields are all defined here. The format is as follows:
class_data_item
Name | Format | Description |
---|---|---|
static_fields_size | uleb128 | the number of static fields defined in this item |
instance_fields_size | uleb128 | the number of instance fields defined in this item |
direct_methods_size | uleb128 | the number of direct methods defined in this item |
virtual_methods_size | uleb128 | the number of virtual methods defined in this item |
static_fields | encoded_field[static_fields_size] | the defined static fields, represented as a sequence of encoded elements. The fields must be sorted by field_idx in increasing order. |
instance_fields | encoded_field[instance_fields_size] | the defined instance fields, represented as a sequence of encoded elements. The fields must be sorted by field_idx in increasing order. |
direct_methods | encoded_method[direct_methods_size] | the defined direct (any of static, private, or constructor) methods, represented as a sequence of encoded elements. The methods must be sorted by method_idx in increasing order. |
virtual_methods | encoded_method[virtual_methods_size] | the defined virtual (none of static, private, or constructor) methods, represented as a sequence of encoded elements. This list should not include inherited methods unless overridden by the class that this item represents. The methods must be sorted by method_idx in increasing order. The method_idx of a virtual method must not be the same as any direct method. |
The detailed field definition format
encoded_field format
Name | Format | Description |
---|---|---|
field_idx_diff | uleb128 | index into the field_ids list for the identity of this field (includes the name and descriptor), represented as a difference from the index of previous element in the list. The index of the first element in a list is represented directly. |
access_flags | uleb128 | access flags for the field (public, final, etc.). See “access_flags Definitions” for details. |
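The access_flags value is a bit mask. As a rough sketch, the five most common bits can be decoded as below; the constant values are the ones listed in the "access_flags Definitions" part of the official documentation, while AccessFlagsSketch and describe are hypothetical names, and the many other flags (synchronized, abstract, and so on) are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

final class AccessFlagsSketch {
    static final int ACC_PUBLIC = 0x1;
    static final int ACC_PRIVATE = 0x2;
    static final int ACC_PROTECTED = 0x4;
    static final int ACC_STATIC = 0x8;
    static final int ACC_FINAL = 0x10;

    /** Lists the modifier keywords for the subset of flags handled by this sketch. */
    static List<String> describe(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & ACC_PUBLIC) != 0) names.add("public");
        if ((flags & ACC_PRIVATE) != 0) names.add("private");
        if ((flags & ACC_PROTECTED) != 0) names.add("protected");
        if ((flags & ACC_STATIC) != 0) names.add("static");
        if ((flags & ACC_FINAL) != 0) names.add("final");
        return names;
    }
}
```

For example, a field with access_flags 0x19 would be described as public static final.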
The detailed method definition format
encoded_method format
Name | Format | Description |
---|---|---|
method_idx_diff | uleb128 | index into the method_ids list for the identity of this method (includes the name and descriptor), represented as a difference from the index of previous element in the list. The index of the first element in a list is represented directly. |
access_flags | uleb128 | access flags for the method (public, final, etc.). See “access_flags Definitions” for details. |
code_off | uleb128 | offset from the start of the file to the code structure for this method, or 0 if this method is either abstract or native. The offset should be to a location in the data section. The format of the data is specified by “code_item” below. |
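The field_idx_diff and method_idx_diff values are deltas, not absolute indexes, which is easy to miss. A minimal sketch of turning the deltas back into indexes (IdxDiffSketch and resolveIndices are names made up for this example, and the input is assumed to be the already-decoded uleb128 values of one list):

```java
final class IdxDiffSketch {
    /** Converts idx_diff values into absolute indexes; the first value is taken as-is. */
    static int[] resolveIndices(int[] diffs) {
        int[] indices = new int[diffs.length];
        int current = 0;
        for (int i = 0; i < diffs.length; i++) {
            current += diffs[i]; // each entry is relative to the previous one
            indices[i] = current;
        }
        return indices;
    }
}
```

The running total has to be reset for each of the four lists in a class_data_item, since the first element of every list is stored directly.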
Of course, a method has one more important definition: its code, stored in a code_item
code_item
Name | Format | Description |
---|---|---|
registers_size | ushort | the number of registers used by this code |
ins_size | ushort | the number of words of incoming arguments to the method that this code is for |
outs_size | ushort | the number of words of outgoing argument space required by this code for method invocation |
tries_size | ushort | the number of try_items for this instance. If non-zero, then these appear as the tries array just after the insns in this instance. |
debug_info_off | uint | offset from the start of the file to the debug info (line numbers + local variable info) sequence for this code, or 0 if there simply is no information. The offset, if non-zero, should be to a location in the data section. The format of the data is specified by “debug_info_item” below. |
insns_size | uint | size of the instructions list, in 16-bit code units |
insns | ushort[insns_size] | actual array of bytecode. The format of code in an insns array is specified by the companion document Dalvik bytecode. Note that though this is defined as an array of ushort, there are some internal structures that prefer four-byte alignment. Also, if this happens to be in an endian-swapped file, then the swapping is only done on individual ushorts and not on the larger internal structures. |
padding | ushort (optional) = 0 | two bytes of padding to make tries four-byte aligned. This element is only present if tries_size is non-zero and insns_size is odd. |
tries | try_item[tries_size] (optional) | array indicating where in the code exceptions are caught and how to handle them. Elements of the array must be non-overlapping in range and in order from low to high address. This element is only present if tries_size is non-zero. |
handlers | encoded_catch_handler_list (optional) | bytes representing a list of lists of catch types and associated handler addresses. Each try_item has a byte-wise offset into this structure. This element is only present if tries_size is non-zero. |
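The optional padding field makes the position of the tries array slightly tricky to compute. The sketch below works it out from the table: four ushort fields plus two uint fields put insns at 16 bytes into the code_item, and the padding exists only when tries_size is non-zero and insns_size is odd. CodeItemSketch and triesOffset are hypothetical names.

```java
final class CodeItemSketch {
    /** Fixed part of a code_item before insns: 4 ushorts + 2 uints. */
    static final int INSNS_START = 4 * 2 + 2 * 4;

    /**
     * Returns the file offset of the tries array of the code_item at codeOff,
     * or -1 if the code_item has no tries.
     */
    static long triesOffset(long codeOff, int triesSize, long insnsSize) {
        if (triesSize == 0) {
            return -1;
        }
        long afterInsns = codeOff + INSNS_START + insnsSize * 2; // insns are 16-bit units
        long padding = (insnsSize % 2 == 1) ? 2 : 0;             // keeps tries four-byte aligned
        return afterInsns + padding;
    }
}
```

With a code_item at 0x100, both insns_size 3 and insns_size 4 put tries at 0x118, which shows the padding doing its alignment job.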
The try-catch structures
The try part
try_item format
Name | Format | Description |
---|---|---|
start_addr | uint | start address of the block of code covered by this entry. The address is a count of 16-bit code units to the start of the first covered instruction. |
insn_count | ushort | number of 16-bit code units covered by this entry. The last code unit covered (inclusive) is start_addr + insn_count - 1. |
handler_off | ushort | offset in bytes from the start of the associated encoded_catch_hander_list to the encoded_catch_handler for this entry. This must be an offset to the start of an encoded_catch_handler. |
The handler part (catch)
encoded_catch_handler_list format
Name | Format | Description |
---|---|---|
size | uleb128 | size of this list, in entries |
list | encoded_catch_handler[handlers_size] | actual list of handler lists, represented directly (not as offsets), and concatenated sequentially |
encoded_catch_handler format
Name | Format | Description |
---|---|---|
size | sleb128 | number of catch types in this list. If non-positive, then this is the negative of the number of catch types, and the catches are followed by a catch-all handler. For example: A size of 0 means that there is a catch-all but no explicitly typed catches. A size of 2 means that there are two explicitly typed catches and no catch-all. And a size of -1 means that there is one typed catch along with a catch-all. |
handlers | encoded_type_addr_pair[abs(size)] | stream of abs(size) encoded items, one for each caught type, in the order that the types should be tested. |
catch_all_addr | uleb128 (optional) | bytecode address of the catch-all handler. This element is only present if size is non-positive. |
encoded_type_addr_pair format
Name | Format | Description |
---|---|---|
type_idx | uleb128 | index into the type_ids list for the type of the exception to catch |
addr | uleb128 | bytecode address of the associated exception handler |
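The signed size field of encoded_catch_handler packs two facts into one number, which the following sketch separates. It only interprets an already-decoded sleb128 value; CatchHandlerSizeSketch and its method names are made up for illustration.

```java
final class CatchHandlerSizeSketch {
    /** Number of explicitly typed catches: the absolute value of size. */
    static int typedCatchCount(int size) {
        return Math.abs(size);
    }

    /** A catch-all handler (and its catch_all_addr) is present when size is zero or negative. */
    static boolean hasCatchAll(int size) {
        return size <= 0;
    }
}
```

This matches the three cases in the description: 0 is a lone catch-all, 2 is two typed catches, and -1 is one typed catch plus a catch-all.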
There are also the debug info and annotation parts. If you are interested, see the official documentation: https://source.android.com/devices/tech/dalvik/dex-format.html