Writing a Simple Interpreter (1): An Overview of the Compilation Process

The compilation process

Generally speaking, compiling a piece of source code into an executable file involves the following key steps:
source file ⇒ build the token stream ⇒ build the compilation tree ⇒ generate the executable file
Take the following code as an example:

namespace space1 {
    class A {
        var int a, b;
        func __init__(int a, int b) { this.a = a, this.b = b; }
        public func int sum() { return a + b; }
    }
    func calc(int a, int b, int c, int d) {
        var A v1 = A(a, b), v2 = A(c, d);
        return math::max(v1.sum(), v2.sum());
    }
}

Token stream

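Building the token stream means splitting the code into its smallest meaningful units. Each unit is one token, and tokens come in several types: keywords, identifiers, operators, the various brackets, semicolons, and so on.
For example, line 5 of the code above becomes the following token stream: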

[Keyword public] [Keyword func] [Identifier int] [Identifier sum] [SmallBracketL] [SmallBracketR] 
[LargeBracketL] [Keyword return] [Identifier a] [Operator +] [Identifier b] [ExpressionEnd] [LargeBracketR]
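
The splitting step described above can be sketched as a small hand-written scanner. This is only a minimal illustration, not the article's actual lexer: the names `TokenType`, `Token`, `isKeyword`, and `tokenize` are hypothetical, the keyword set is assumed from the sample code, and every unrecognized character is treated as a one-character operator (so numbers, strings, and multi-character operators such as `::` are not handled).

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Token kinds, named after the labels used in the token dump above.
enum class TokenType {
    Keyword, Identifier, Operator,
    SmallBracketL, SmallBracketR,
    LargeBracketL, LargeBracketR,
    ExpressionEnd
};

struct Token {
    TokenType type;
    std::string text;
};

// Assumed keyword set: only the words the sample code appears to use as keywords.
bool isKeyword(const std::string& word) {
    static const std::set<std::string> keywords = {
        "namespace", "class", "var", "func", "public", "return"
    };
    return keywords.count(word) > 0;
}

// Splits src into tokens. Whitespace separates tokens and is otherwise dropped.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c) || c == '_') {
            // Read a whole word, then classify it as keyword or identifier.
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            tokens.push_back({isKeyword(word) ? TokenType::Keyword
                                              : TokenType::Identifier, word});
            continue;
        }
        switch (c) {
            case '(': tokens.push_back({TokenType::SmallBracketL, "("}); break;
            case ')': tokens.push_back({TokenType::SmallBracketR, ")"}); break;
            case '{': tokens.push_back({TokenType::LargeBracketL, "{"}); break;
            case '}': tokens.push_back({TokenType::LargeBracketR, "}"}); break;
            case ';': tokens.push_back({TokenType::ExpressionEnd, ";"}); break;
            // Simplification: anything else is a single-character operator.
            default:  tokens.push_back({TokenType::Operator, std::string(1, src[i])}); break;
        }
        ++i;
    }
    return tokens;
}
```

Calling `tokenize("public func int sum() { return a + b; }")` yields 13 tokens in the same order as the dump above. Note that `int` comes out as an identifier here, as it does in that dump; deciding that it names a type is left to a later stage.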

Compilation tree

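The token stream is processed further and organized into a tree (what is usually called an abstract syntax tree), which captures the parent-child relationships between the constructs in the code.
For example, the code above becomes the following compilation tree: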

[Namespace space1] 
    [Class A] 
        [VariableDefinition]
            [Type int]
            [Identifier a]
            [Identifier b]
        [FunctionDefinition]
            [Accessibility public]
            [Type A]
            [Identifier A]
            [Argument]
                [Type int]
                [Identifier a]
            [Argument]
                [Type int]
                [Identifier b]
            [Block]
                [Expression]
                    [Comma]
                        [Assign]
                            [CallMember]
                                [Identifier this]
                                [Identifier a]
                            [Identifier a]
                        [Assign]
                            [CallMember]
                                [Identifier this]
                                [Identifier b]
                            [Identifier b]
        [FunctionDefinition]
            [Accessibility public]
            [Type int]
            [Identifier sum]
            [Block]
                [Return]
                    [Expression]
                        [Add]
                            [Identifier a]
                            [Identifier b]
        ...
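
One possible in-memory shape for such a tree is a node with a kind, an optional value, and a list of children. The sketch below is an assumption made for illustration, not the article's data structure: `Node`, `dump`, and `makeSumBody` are hypothetical names, and the tree for the body of `sum` is built by hand rather than by a parser. It relies on `std::vector` accepting an incomplete element type, which is guaranteed from C++17 onward.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A tree node: `kind` is the label such as "Add", `value` is an optional
// payload such as an identifier name, `children` are the nested nodes.
struct Node {
    std::string kind;
    std::string value;
    std::vector<Node> children;
};

// Renders the tree in the bracketed, 4-space-indented style used above.
std::string dump(const Node& node, int depth = 0) {
    std::string out(static_cast<std::size_t>(depth) * 4, ' ');
    out += "[" + node.kind;
    if (!node.value.empty()) {
        out += " " + node.value;
    }
    out += "]\n";
    for (const Node& child : node.children) {
        out += dump(child, depth + 1);
    }
    return out;
}

// Hand-built tree for the body of `sum`: { return a + b; }
Node makeSumBody() {
    return Node{"Block", "", {
        Node{"Return", "", {
            Node{"Expression", "", {
                Node{"Add", "", {
                    Node{"Identifier", "a", {}},
                    Node{"Identifier", "b", {}}
                }}
            }}
        }}
    }};
}
```

`dump(makeSumBody())` produces the same nesting as the `[Block]` subtree of `sum` shown above, starting at indentation level zero.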

Generating the executable file

The corresponding bytecode is generated from the compilation tree. The instruction set used here was designed with Java's as a reference.

enum class CommandID {
    label,
    vbmov, vi32mov, vi64mov, vfmov, vomov, mbmov, mi32mov, mi64mov, mfmov, momov,
    add, sub, mul, _div, mod, ladd, lsub, lmul, _ldiv, lmod, fadd, fsub, fmul, fdiv, uadd, usub, umul, udiv, umod, badd, bsub, bmul, bdiv, bmod,
    eq, ne, gt, ge, ls, le, feq, fne, fgt, fge, fls, fle,
    _and, _or, _xor, _not, lmv, rmv, land, lor, lxor, lnot, llmv, lrmv, uand, uor, uxor, unot, ulmv, urmv, band, bor, bxor, bnot, blmv, brmv,
    ret, opop, pop,
    vbgvl, vi32gvl, vi64gvl, vfgvl, vogvl, mbgvl, mi32gvl, mi64gvl, mfgvl, mogvl,
    push0, push1,
    pvar0, pvar1, pvar2, pvar3, povar0, povar1, povar2, povar3,
    arrmem1, arromem1,
    pack, unpack, 
    _new, 
    jmp, jz, jp,
    setvar,
    poparg, 
    push,
    pvar, povar, pglo, poglo, pstr,
    mem, omem, 
    sys,
    arrnew, arrmem, arromem,
    call, ecall
};

The meaning of each instruction is not explained for now.
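
Without relying on the meaning of the instructions listed above, the general idea of turning a tree into stack-style bytecode can still be sketched. Everything in the block below is an assumption made for illustration: the opcodes `LoadVar`, `Add`, and `Ret` are hypothetical and are not members of `CommandID`, and `Expr`, `Instr`, `emitExpr`, and `emitReturn` are invented names. The sketch walks an expression tree in post-order, so operands are emitted before the operator that consumes them.

```cpp
#include <string>
#include <vector>

// Minimal expression tree: kind is "Identifier" (with a name) or "Add" (two children).
struct Expr {
    std::string kind;
    std::string name;
    std::vector<Expr> children;
};

// Hypothetical opcodes for this sketch only; NOT the CommandID set above.
enum class Op { LoadVar, Add, Ret };

struct Instr {
    Op op;
    std::string arg;  // variable name for LoadVar, empty otherwise
};

// Post-order walk: emit code for both operands, then the operator itself.
void emitExpr(const Expr& e, std::vector<Instr>& out) {
    if (e.kind == "Identifier") {
        out.push_back({Op::LoadVar, e.name});
        return;
    }
    if (e.kind == "Add") {
        emitExpr(e.children[0], out);
        emitExpr(e.children[1], out);
        out.push_back({Op::Add, ""});
    }
}

// Code for `return <value>;`: evaluate the value, then return it.
std::vector<Instr> emitReturn(const Expr& value) {
    std::vector<Instr> out;
    emitExpr(value, out);
    out.push_back({Op::Ret, ""});
    return out;
}
```

For the body of `sum`, that is `return a + b;`, this emits four instructions in order: load `a`, load `b`, add, return. A real code generator would also need to resolve which variable each name refers to and to choose typed variants of each operation.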
