Lua 5.1 bytecode file analysis

http://www.doc88.com/p-2853755460794.html
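The document linked above already describes Lua 5.1 bytecode in detail, so I will not repeat it here. This post only walks through one worked example of bytecode analysis.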

Overview

[Figure 1: Lua 5.1 bytecode file analysis]

[Figure 2: Lua 5.1 bytecode file analysis]

Lua source file

sample.lua

local a = 0; 
local c = 1.1
b = "stringType"
d = false
e = {}
f ={1,2,3}

function HaveParameter(x,y)

end

function NoParameter()
    local function NoParameter1()
        local m=4
    end 
    function NoParameter2()
        local m=5
    end  
    return 
end

for i = 1, 10 do 
    a = a + 2
 end

for k,v in pairs(f) do
    print(k)
end

Compiled Lua file

luac.out

[Figure 3: Lua 5.1 bytecode file analysis]

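Reading note: multi-byte fields are stored little-endian, so their bytes must be read from back to front. For example, 01000000 represents 0x00000001.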
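The byte-order rule above can be sketched in a few lines of C. This is only an illustration: the helper name read_u32_le is my own and is not part of Lua or ChunkSpy, and it assumes 4-byte fields as reported in the header of this particular file.

```c
#include <stdint.h>

/* Hypothetical helper (not from the Lua sources): assemble a 32-bit value
 * from four bytes stored least-significant byte first. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Feeding it the bytes 81 C0 00 00 from the listing yields 0x0000C081, which is the instruction word that the later decoding steps work on.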

Pos   Hex Data           Description or Code
----------------------------------------------------------------------
0000                     ** source chunk: sample.lua
                         ** global header start **
0000  1B4C7561           header signature: "\27Lua"
0004  51                 version (major:minor hex digits)
0005  00                 format (0=official)
0006  01                 endianness (1=little endian)
0007  04                 size of int (bytes)
0008  04                 size of size_t (bytes)
0009  04                 size of Instruction (bytes)
000A  08                 size of number (bytes)
000B  00                 integral (1=integral)
                         * number type: double
                         * x86 standard (32-bit, little endian, double
                         ** global header end **

000C                     ** function [0] definition (level 1)
                         ** start of function **
000C  0B000000           string size (11)
0010  73616D706C652E6C+  "sample.l"
0018  756100             "ua\0"
                         source name: sample.lua
001B  00000000           line defined (0)
001F  00000000           last line defined (0)
0023  00                 nups (0)
0024  00                 numparams (0)
0025  02                 is_vararg (2)
0026  09                 maxstacksize (9)
                         * code:
0027  22000000           sizecode (34)
002B  01000000           [01] loadk      0   0        ; 0
002F  41400000           [02] loadk      1   1        ; 1.1
0033  81C00000           [03] loadk      2   3        ; "stringType"
0037  87800000           [04] setglobal  2   2        ; b
003B  82000000           [05] loadbool   2   0   0    ; false
003F  87000100           [06] setglobal  2   4        ; d
0043  8A000000           [07] newtable   2   0   0    ; array=0, hash=
0047  87400100           [08] setglobal  2   5        ; e
004B  8A008001           [09] newtable   2   3   0    ; array=3, hash=
004F  C1C00100           [10] loadk      3   7        ; 1
0053  01010200           [11] loadk      4   8        ; 2
0057  41410200           [12] loadk      5   9        ; 3
005B  A2408001           [13] setlist    2   3   1    ; index 1 to 3
005F  87800100           [14] setglobal  2   6        ; f
0063  A4000000           [15] closure    2   0        ; 0 upvalues
0067  87800200           [16] setglobal  2   10       ; HaveParameter
006B  A4400000           [17] closure    2   1        ; 0 upvalues
006F  87C00200           [18] setglobal  2   11       ; NoParameter
0073  81C00100           [19] loadk      2   7        ; 1
0077  C1000300           [20] loadk      3   12       ; 10
007B  01C10100           [21] loadk      4   7        ; 1
007F  A0000080           [22] forprep    2   1        ; to [24]
0083  0C004200           [23] add        0   0   264  ; 2
0087  9F40FF7F           [24] forloop    2   -2       ; to [23] if loo
008B  85400300           [25] getglobal  2   13       ; pairs
008F  C5800100           [26] getglobal  3   6        ; f
0093  9C000101           [27] call       2   2   4
0097  16800080           [28] jmp        3            ; to [32]
009B  C5810300           [29] getglobal  7   14       ; print
009F  00028002           [30] move       8   5
00A3  DC410001           [31] call       7   2   1
00A7  A1800000           [32] tforloop   2       2    ; to [34] if exi
00AB  1680FE7F           [33] jmp        -5           ; to [29]
00AF  1E008000           [34] return     0   1
                         * constants:
00B3  0F000000           sizek (15)
00B7  03                 const type 3
00B8  0000000000000000   const [0]: (0)
00C0  03                 const type 3
00C1  9A9999999999F13F   const [1]: (1.1)
00C9  04                 const type 4
00CA  02000000           string size (2)
00CE  6200               "b\0"
                         const [2]: "b"
00D0  04                 const type 4
00D1  0B000000           string size (11)
00D5  737472696E675479+  "stringTy"
00DD  706500             "pe\0"
                         const [3]: "stringType"
00E0  04                 const type 4
00E1  02000000           string size (2)
00E5  6400               "d\0"
                         const [4]: "d"
00E7  04                 const type 4
00E8  02000000           string size (2)
00EC  6500               "e\0"
                         const [5]: "e"
00EE  04                 const type 4
00EF  02000000           string size (2)
00F3  6600               "f\0"
                         const [6]: "f"
00F5  03                 const type 3
00F6  000000000000F03F   const [7]: (1)
00FE  03                 const type 3
00FF  0000000000000040   const [8]: (2)
0107  03                 const type 3
0108  0000000000000840   const [9]: (3)
0110  04                 const type 4
0111  0E000000           string size (14)
0115  4861766550617261+  "HavePara"
011D  6D6574657200       "meter\0"
                         const [10]: "HaveParameter"
0123  04                 const type 4
0124  0C000000           string size (12)
0128  4E6F506172616D65+  "NoParame"
0130  74657200           "ter\0"
                         const [11]: "NoParameter"
0134  03                 const type 3
0135  0000000000002440   const [12]: (10)
013D  04                 const type 4
013E  06000000           string size (6)
0142  706169727300       "pairs\0"
                         const [13]: "pairs"
0148  04                 const type 4
0149  06000000           string size (6)
014D  7072696E7400       "print\0"
                         const [14]: "print"
                         * functions:
0153  02000000           sizep (2)

0157                     ** function [0] definition (level 2)
                         ** start of function **
0157  00000000           string size (0)
                         source name: (none)
015B  08000000           line defined (8)
015F  0A000000           last line defined (10)
0163  00                 nups (0)
0164  02                 numparams (2)
0165  00                 is_vararg (0)
0166  02                 maxstacksize (2)
                         * code:
0167  01000000           sizecode (1)
016B  1E008000           [1] return     0   1
                         * constants:
016F  00000000           sizek (0)
                         * functions:
0173  00000000           sizep (0)
                         * lines:
0177  01000000           sizelineinfo (1)
                         [pc] (line)
017B  0A000000           [1] (10)
                         * locals:
017F  02000000           sizelocvars (2)
0183  02000000           string size (2)
0187  7800               "x\0"
                         local [0]: x
0189  00000000             startpc (0)
018D  00000000             endpc   (0)
0191  02000000           string size (2)
0195  7900               "y\0"
                         local [1]: y
0197  00000000             startpc (0)
019B  00000000             endpc   (0)
                         * upvalues:
019F  00000000           sizeupvalues (0)
                         ** end of function **


01A3                     ** function [1] definition (level 2)
                         ** start of function **
01A3  00000000           string size (0)
                         source name: (none)
01A7  0C000000           line defined (12)
01AB  14000000           last line defined (20)
01AF  00                 nups (0)
01B0  00                 numparams (0)
01B1  00                 is_vararg (0)
01B2  02                 maxstacksize (2)
                         * code:
01B3  05000000           sizecode (5)
01B7  24000000           [1] closure    0   0        ; 0 upvalues
01BB  64400000           [2] closure    1   1        ; 0 upvalues
01BF  47000000           [3] setglobal  1   0        ; NoParameter2
01C3  1E008000           [4] return     0   1
01C7  1E008000           [5] return     0   1
                         * constants:
01CB  01000000           sizek (1)
01CF  04                 const type 4
01D0  0D000000           string size (13)
01D4  4E6F506172616D65+  "NoParame"
01DC  7465723200         "ter2\0"
                         const [0]: "NoParameter2"
                         * functions:
01E1  02000000           sizep (2)

01E5                     ** function [0] definition (level 3)
                         ** start of function **
01E5  00000000           string size (0)
                         source name: (none)
01E9  0D000000           line defined (13)
01ED  0F000000           last line defined (15)
01F1  00                 nups (0)
01F2  00                 numparams (0)
01F3  00                 is_vararg (0)
01F4  02                 maxstacksize (2)
                         * code:
01F5  02000000           sizecode (2)
01F9  01000000           [1] loadk      0   0        ; 4
01FD  1E008000           [2] return     0   1
                         * constants:
0201  01000000           sizek (1)
0205  03                 const type 3
0206  0000000000001040   const [0]: (4)
                         * functions:
020E  00000000           sizep (0)
                         * lines:
0212  02000000           sizelineinfo (2)
                         [pc] (line)
0216  0E000000           [1] (14)
021A  0F000000           [2] (15)
                         * locals:
021E  01000000           sizelocvars (1)
0222  02000000           string size (2)
0226  6D00               "m\0"
                         local [0]: m
0228  01000000             startpc (1)
022C  01000000             endpc   (1)
                         * upvalues:
0230  00000000           sizeupvalues (0)
                         ** end of function **


0234                     ** function [1] definition (level 3)
                         ** start of function **
0234  00000000           string size (0)
                         source name: (none)
0238  10000000           line defined (16)
023C  12000000           last line defined (18)
0240  00                 nups (0)
0241  00                 numparams (0)
0242  00                 is_vararg (0)
0243  02                 maxstacksize (2)
                         * code:
0244  02000000           sizecode (2)
0248  01000000           [1] loadk      0   0        ; 5
024C  1E008000           [2] return     0   1
                         * constants:
0250  01000000           sizek (1)
0254  03                 const type 3
0255  0000000000001440   const [0]: (5)
                         * functions:
025D  00000000           sizep (0)
                         * lines:
0261  02000000           sizelineinfo (2)
                         [pc] (line)
0265  11000000           [1] (17)
0269  12000000           [2] (18)
                         * locals:
026D  01000000           sizelocvars (1)
0271  02000000           string size (2)
0275  6D00               "m\0"
                         local [0]: m
0277  01000000             startpc (1)
027B  01000000             endpc   (1)
                         * upvalues:
027F  00000000           sizeupvalues (0)
                         ** end of function **

                         * lines:
0283  05000000           sizelineinfo (5)
                         [pc] (line)
0287  0F000000           [1] (15)
028B  12000000           [2] (18)
028F  10000000           [3] (16)
0293  13000000           [4] (19)
0297  14000000           [5] (20)
                         * locals:
029B  01000000           sizelocvars (1)
029F  0D000000           string size (13)
02A3  4E6F506172616D65+  "NoParame"
02AB  7465723100         "ter1\0"
                         local [0]: NoParameter1
02B0  01000000             startpc (1)
02B4  04000000             endpc   (4)
                         * upvalues:
02B8  00000000           sizeupvalues (0)
                         ** end of function **

                         * lines:
02BC  22000000           sizelineinfo (34)
                         [pc] (line)
02C0  01000000           [01] (1)
02C4  02000000           [02] (2)
02C8  03000000           [03] (3)
02CC  03000000           [04] (3)
02D0  04000000           [05] (4)
02D4  04000000           [06] (4)
02D8  05000000           [07] (5)
02DC  05000000           [08] (5)
02E0  06000000           [09] (6)
02E4  06000000           [10] (6)
02E8  06000000           [11] (6)
02EC  06000000           [12] (6)
02F0  06000000           [13] (6)
02F4  06000000           [14] (6)
02F8  0A000000           [15] (10)
02FC  08000000           [16] (8)
0300  14000000           [17] (20)
0304  0C000000           [18] (12)
0308  16000000           [19] (22)
030C  16000000           [20] (22)
0310  16000000           [21] (22)
0314  16000000           [22] (22)
0318  17000000           [23] (23)
031C  16000000           [24] (22)
0320  1A000000           [25] (26)
0324  1A000000           [26] (26)
0328  1A000000           [27] (26)
032C  1A000000           [28] (26)
0330  1B000000           [29] (27)
0334  1B000000           [30] (27)
0338  1B000000           [31] (27)
033C  1A000000           [32] (26)
0340  1B000000           [33] (27)
0344  1C000000           [34] (28)
                         * locals:
0348  0B000000           sizelocvars (11)
034C  02000000           string size (2)
0350  6100               "a\0"
                         local [0]: a
0352  01000000             startpc (1)
0356  21000000             endpc   (33)
035A  02000000           string size (2)
035E  6300               "c\0"
                         local [1]: c
0360  02000000             startpc (2)
0364  21000000             endpc   (33)
0368  0C000000           string size (12)
036C  28666F7220696E64+  "(for ind"
0374  65782900           "ex)\0"
                         local [2]: (for index)
0378  15000000             startpc (21)
037C  18000000             endpc   (24)
0380  0C000000           string size (12)
0384  28666F72206C696D+  "(for lim"
038C  69742900           "it)\0"
                         local [3]: (for limit)
0390  15000000             startpc (21)
0394  18000000             endpc   (24)
0398  0B000000           string size (11)
039C  28666F7220737465+  "(for ste"
03A4  702900             "p)\0"
                         local [4]: (for step)
03A7  15000000             startpc (21)
03AB  18000000             endpc   (24)
03AF  02000000           string size (2)
03B3  6900               "i\0"
                         local [5]: i
03B5  16000000             startpc (22)
03B9  17000000             endpc   (23)
03BD  10000000           string size (16)
03C1  28666F722067656E+  "(for gen"
03C9  657261746F722900   "erator)\0"
                         local [6]: (for generator)
03D1  1B000000             startpc (27)
03D5  21000000             endpc   (33)
03D9  0C000000           string size (12)
03DD  28666F7220737461+  "(for sta"
03E5  74652900           "te)\0"
                         local [7]: (for state)
03E9  1B000000             startpc (27)
03ED  21000000             endpc   (33)
03F1  0E000000           string size (14)
03F5  28666F7220636F6E+  "(for con"
03FD  74726F6C2900       "trol)\0"
                         local [8]: (for control)
0403  1B000000             startpc (27)
0407  21000000             endpc   (33)
040B  02000000           string size (2)
040F  6B00               "k\0"
                         local [9]: k
0411  1C000000             startpc (28)
0415  1F000000             endpc   (31)
0419  02000000           string size (2)
041D  7600               "v\0"
                         local [10]: v
041F  1C000000             startpc (28)
0423  1F000000             endpc   (31)
                         * upvalues:
0427  00000000           sizeupvalues (0)
                         ** end of function **

042B                     ** end of chunk **
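As a rough sketch of how the 12-byte global header at the top of the listing could be read, the following C fragment copies each byte into a named field. The struct and function names are hypothetical (they do not come from the Lua sources), and the layout is taken only from the annotations in the listing above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct mirroring the header fields annotated in the listing. */
struct lua51_header {
    unsigned char version;          /* 0x51 for Lua 5.1 */
    unsigned char format;           /* 0 = official */
    unsigned char little_endian;    /* 1 = little endian */
    unsigned char size_int;         /* bytes */
    unsigned char size_size_t;      /* bytes */
    unsigned char size_instruction; /* bytes */
    unsigned char size_number;      /* bytes */
    unsigned char integral;         /* 1 = numbers are integral */
};

/* Returns 0 on success, -1 if the buffer is too short or the
 * "\27Lua" signature does not match. */
static int parse_lua51_header(const unsigned char *buf, size_t len,
                              struct lua51_header *out)
{
    static const unsigned char signature[4] = {0x1B, 'L', 'u', 'a'};
    if (len < 12 || memcmp(buf, signature, 4) != 0)
        return -1;
    out->version          = buf[4];
    out->format           = buf[5];
    out->little_endian    = buf[6];
    out->size_int         = buf[7];
    out->size_size_t      = buf[8];
    out->size_instruction = buf[9];
    out->size_number      = buf[10];
    out->integral         = buf[11];
    return 0;
}
```

For luac.out as dumped above, this would report version 0x51, little endian, 4-byte int, size_t and Instruction, and an 8-byte non-integral number type.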

Here is a function that prints the mnemonic and operands of a 32-bit instruction:


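extern "C"
{
    #include "lua.h"
    #include "lualib.h"
    #include "lauxlib.h"
    #include "lopcodes.h"  // internal header from the Lua 5.1 src directory
}

#include <stdio.h>   // printf
#include <stdlib.h>

// Prints one instruction in the same format as `luac -l -l sample.lua`.
// luaP_opnames and luaP_opmodes are defined in lopcodes.c, so this must be
// linked together with the Lua 5.1 sources or static library.
// BITRK, ISK and INDEXK are already provided by lopcodes.h.
static void PrintInstruction(Instruction i)
{
    OpCode o = GET_OPCODE(i);
    int a = GETARG_A(i);
    int b = GETARG_B(i);
    int c = GETARG_C(i);
    int bx = GETARG_Bx(i);
    int sbx = GETARG_sBx(i);
    // Print the name that corresponds to the opcode
    printf("%-9s\t", luaP_opnames[o]);
    // There are three basic instruction formats: enum OpMode {iABC, iABx, iAsBx};
    // An operand is unused, used, a register, or a constant:
    // enum OpArgMask {OpArgN, OpArgU, OpArgR, OpArgK};
    switch (getOpMode(o))
    {
    case iABC:
        // Three operands.
        // For a constant, clear the top bit, then print -1 - index;
        // negative numbers distinguish constants from registers.
        printf("%d", a);
        if (getBMode(o) != OpArgN) printf(" %d", ISK(b) ? (-1 - INDEXK(b)) : b);
        if (getCMode(o) != OpArgN) printf(" %d", ISK(c) ? (-1 - INDEXK(c)) : c);
        break;
    case iABx:
        // Two operands.
        // A constant index is printed directly as -1 - index.
        if (getBMode(o) == OpArgK) printf("%d %d", a, -1 - bx); else printf("%d %d", a, bx);
        break;
    case iAsBx:
        // Two operands, the second one signed; JMP does not use A, so only sBx is printed.
        if (o == OP_JMP) printf("%d", sbx); else printf("%d %d", a, sbx);
        break;
    default:
        break;
    }

    printf("\n");
}

int main(int argc, char* argv[])
{
    // 0x00000001 prints as LOADK 0 -1. The listing earlier shows the same
    // instruction as loadk 0 0 because it prints the raw constant index, while
    // this output uses negative numbers for constants and non-negative numbers
    // for registers, which is easier to read and matches the format produced
    // by luac -l -l sample.lua.
    Instruction i = 0x00000001;
    PrintInstruction(i);
    return 0;
}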
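The program above depends on the macros in lopcodes.h. For readers who want to check the listing by hand, here is a minimal standalone sketch of the same field extraction. It assumes the Lua 5.1 instruction layout (6-bit opcode, 8-bit A, 9-bit C, 9-bit B, with Bx spanning the B and C bits and sBx stored with a bias of 131071); the struct and function names are my own, not Lua's.

```c
#include <stdint.h>

/* Hypothetical container for the fields of one Lua 5.1 instruction. */
struct lua51_fields {
    unsigned op;  /* bits 0..5   */
    unsigned a;   /* bits 6..13  */
    unsigned c;   /* bits 14..22 */
    unsigned b;   /* bits 23..31 */
    unsigned bx;  /* bits 14..31 (B and C taken together) */
    int sbx;      /* bx minus the bias 131071 */
};

/* Split a 32-bit instruction word into all of its possible fields.
 * Which fields are meaningful depends on the opcode's format. */
static struct lua51_fields decode_lua51(uint32_t i)
{
    struct lua51_fields f;
    f.op  = i & 0x3Fu;
    f.a   = (i >> 6) & 0xFFu;
    f.c   = (i >> 14) & 0x1FFu;
    f.b   = (i >> 23) & 0x1FFu;
    f.bx  = (i >> 14) & 0x3FFFFu;
    f.sbx = (int)f.bx - 131071;
    return f;
}
```

For example, the bytes 9F 40 FF 7F from the listing form the word 0x7FFF409F, which decodes to opcode 31 (FORLOOP), A = 2 and sBx = -2, in agreement with line [24] of the disassembly.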

Other reading notes, based mainly on lopcodes.h, lopcodes.c and print.c

#define BITRK       (1 << (8))
/* test whether value is a constant */
/* this bit 1 means constant (0 means register) */
#define ISK(x)      ((x) & BITRK)
The B and C operands are 9 bits wide. A top bit of 1 marks the operand as a constant; clearing that bit leaves an 8-bit index, so a register or constant index is at most 255.
#define INDEXK(r)  ((int)(r) & ~BITRK)
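To make the register/constant flag concrete, here is a small sketch that mirrors ISK and INDEXK without including the Lua headers. The names RK_BIT, rk_is_constant, rk_index and rk_display are hypothetical; only the bit value 256 and the -1 - index display convention come from the material above.

```c
/* Same value as BITRK in lopcodes.h: bit 8 of a 9-bit B or C operand. */
#define RK_BIT 256

/* Nonzero when the operand refers to the constant table. */
static int rk_is_constant(int x)
{
    return (x & RK_BIT) != 0;
}

/* Index into the constant table (or the register number if the flag is clear). */
static int rk_index(int x)
{
    return x & ~RK_BIT;
}

/* Value as shown in luac -l style output: constants are printed as -1 - index. */
static int rk_display(int x)
{
    return rk_is_constant(x) ? -1 - rk_index(x) : x;
}
```

With the operand 264 from `add 0 0 264` in the listing, rk_index gives 8, which is const [8] (the number 2), and rk_display gives -9.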

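Summary:

Lua constants such as strings and numbers are stored directly in the bytecode file. All numbers, integers included, are encoded as floating-point values (8-byte doubles in this file).
Each instruction is 32 bits long and contains an opcode and its operands.
Every function ends with a default return 0 1, even when the source already ends with an explicit return (see instructions [4] and [5] of NoParameter in the listing).
The bytecode file format changed in Lua 5.2. Because reference material for it is scarce, Lua 5.1 is used here as a foundation.
For the overall process, see lparser.c (parses the source), lcode.c (generates the bytecode) and lvm.c (executes the bytecode).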
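The first point, that constants are stored directly in the file, can be illustrated with the string layout seen in the listing: a 4-byte little-endian size that counts the trailing NUL, followed by the bytes themselves, with size 0 meaning "no string". The sketch below assumes the 4-byte size_t reported by this file's header, and read_lua51_string is a hypothetical name of my own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader for one dumped string. Copies the string into `out`
 * and returns the number of bytes consumed from `buf`, or 0 on error
 * (truncated input or an output buffer that is too small). */
static size_t read_lua51_string(const unsigned char *buf, size_t len,
                                char *out, size_t out_cap)
{
    uint32_t size;
    if (len < 4 || out_cap == 0)
        return 0;
    size = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    if (size == 0) {            /* size 0: no string stored */
        out[0] = '\0';
        return 4;
    }
    if (size > len - 4 || size > out_cap)
        return 0;
    memcpy(out, buf + 4, size); /* size already includes the trailing '\0' */
    out[size - 1] = '\0';       /* guard against malformed input */
    return 4 + (size_t)size;
}
```

Applied to the bytes at offset 000C, it consumes 15 bytes (4 for the size plus 11 for "sample.lua" and its NUL), which is why the next field starts at 001B.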

Appendix

How floating-point values are computed: http://blog.csdn.net/chenyujing1234/article/details/7683635
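As a companion to the link above, here is a hedged sketch of how one 8-byte number constant from the listing could be turned back into a value. It assumes the host uses IEEE 754 binary64 for double with the same byte order as its 64-bit integers, which holds on common platforms but is not guaranteed by the C standard. The names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical view of the three IEEE 754 binary64 fields. */
struct double_parts {
    unsigned sign;            /* 1 bit   */
    unsigned biased_exponent; /* 11 bits, bias 1023 */
    uint64_t fraction;        /* 52 bits */
};

/* Assemble the 64-bit pattern from eight little-endian bytes. */
static uint64_t read_u64_le(const unsigned char *p)
{
    uint64_t bits = 0;
    int k;
    for (k = 7; k >= 0; k--)
        bits = (bits << 8) | p[k];
    return bits;
}

/* Split the bit pattern into sign, exponent and fraction. */
static struct double_parts split_double_le(const unsigned char *p)
{
    uint64_t bits = read_u64_le(p);
    struct double_parts d;
    d.sign = (unsigned)(bits >> 63);
    d.biased_exponent = (unsigned)((bits >> 52) & 0x7FFu);
    d.fraction = bits & 0xFFFFFFFFFFFFFULL;
    return d;
}

/* Reinterpret the bit pattern as a double (assumes an IEEE 754 host). */
static double decode_double_le(const unsigned char *p)
{
    uint64_t bits = read_u64_le(p);
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For const [12], the bytes 00 00 00 00 00 00 24 40 give a biased exponent of 1026 (so 2^3) and a fraction of 0.25, that is 1.25 × 8 = 10.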
