Understanding Go Compilation (3): A First Look at go compile

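Preface

The previous article, Understanding Go Compilation (2): go build, traced what go build does. The build source does not compile code itself; it hands that work to dedicated tools. A complete build uses the following tools: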

cover-cgo-compile-asm-pack-buildid-link

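Depending on the file types involved and the build flags that are set, some of these tools may not be invoked.

.go files are compiled by the compile tool. This article takes a first, high-level look at how compile works.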

compile

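compile here refers to the compile tool in the installation directory ($/go/pkg/tool/$GOOS_$GOARCH). During a build, go build invokes it to process the source files.

Running the compile command directly prints the following usage message: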

usage: compile [options] file.go...
  -%    debug non-static initializers
  -+    compiling runtime
  -B    disable bounds checking
  -C    disable printing of columns in error messages
  -D path
        set relative path for local imports
 ...
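
The usage message above is the kind of output Go's standard flag package produces from registered flags. As a minimal sketch (not the compiler's actual code), the program below registers the "+" and "D" flags from the listing, plus an "I" flag used here only to illustrate binding a flag to a func. The options struct and the parseArgs helper are hypothetical names, and flag.Func requires Go 1.16 or later.

```go
package main

import (
	"flag"
	"fmt"
)

// options is a hypothetical container for the parsed flag values.
type options struct {
	compilingRuntime bool
	localImportPath  string
	importDirs       []string
}

// parseArgs registers a few flags on a private FlagSet and parses args.
// It returns the parsed options and the remaining non-flag arguments.
func parseArgs(args []string) (options, []string, error) {
	var opts options
	fs := flag.NewFlagSet("compile", flag.ContinueOnError)

	// Flag names do not have to be letters: "+" is a valid name.
	fs.BoolVar(&opts.compilingRuntime, "+", false, "compiling runtime")
	fs.StringVar(&opts.localImportPath, "D", "", "set relative `path` for local imports")

	// A flag bound to a func: the callback runs once per occurrence.
	// The "I" flag is used here for illustration only.
	fs.Func("I", "add `directory` to import search path", func(dir string) error {
		opts.importDirs = append(opts.importDirs, dir)
		return nil
	})

	if err := fs.Parse(args); err != nil {
		return options{}, nil, err
	}
	return opts, fs.Args(), nil
}

func main() {
	opts, files, err := parseArgs([]string{"-+", "-I", "a", "-I", "b", "main.go"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(opts.compilingRuntime, opts.importDirs, files)
}
```

Parsing stops at the first argument that does not start with "-", so everything from main.go onward is treated as a source file.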

compile is itself written in Go; its source lives under src/cmd/compile (the trick for locating source code was covered in an earlier article).

Compilation entry point: compile/main

var archInits = map[string]func(*gc.Arch){
    "386":      x86.Init,
    "amd64":    amd64.Init,
    "arm":      arm.Init,
    "arm64":    arm64.Init,
    "mips":     mips.Init,
    "mipsle":   mips.Init,
    "mips64":   mips64.Init,
    "mips64le": mips64.Init,
    "ppc64":    ppc64.Init,
    "ppc64le":  ppc64.Init,
    "riscv64":  riscv64.Init,
    "s390x":    s390x.Init,
    "wasm":     wasm.Init,
}

func main() {
    // disable timestamps for reproducible output
    log.SetFlags(0)
    log.SetPrefix("compile: ")

    archInit, ok := archInits[objabi.GOARCH]
    if !ok {
        fmt.Fprintf(os.Stderr, "compile: unknown architecture %q\n", objabi.GOARCH)
        os.Exit(2)
    }

    gc.Main(archInit)
    gc.Exit(0)
}

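archInits maps each supported processor architecture to its initialization function. main looks up the current architecture (objabi.GOARCH) in archInits to get archInit, exits with status 2 if the architecture is unknown, and otherwise calls gc.Main followed by gc.Exit.

main is essentially just an entry point into gc.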
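
The lookup-then-dispatch pattern used by main can be tried in isolation. The following is a minimal sketch under stated assumptions: the arch struct is a hypothetical stand-in for gc.Arch (which carries far more than two fields), the init functions are invented for illustration, and runtime.GOARCH is used in place of objabi.GOARCH.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// arch is a hypothetical, heavily simplified stand-in for gc.Arch.
type arch struct {
	name    string
	regSize int // register size in bytes (illustrative field)
}

// archInits maps an architecture name to the function that fills in arch.
var archInits = map[string]func(*arch){
	"386":   func(a *arch) { a.name, a.regSize = "386", 4 },
	"amd64": func(a *arch) { a.name, a.regSize = "amd64", 8 },
	"arm64": func(a *arch) { a.name, a.regSize = "arm64", 8 },
}

// initArch looks up goarch and runs the matching init function.
func initArch(goarch string) (*arch, error) {
	initFn, ok := archInits[goarch]
	if !ok {
		return nil, fmt.Errorf("unknown architecture %q", goarch)
	}
	a := new(arch)
	initFn(a)
	return a, nil
}

func main() {
	a, err := initArch(runtime.GOARCH)
	if err != nil {
		fmt.Fprintln(os.Stderr, "compile:", err)
		os.Exit(2)
	}
	fmt.Println(a.name, a.regSize)
}
```

Keeping the table as data means that supporting another architecture only requires adding a map entry; the dispatch code itself stays unchanged.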

gc

Here gc stands for Go compiler, not the more familiar garbage collector.

Main

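According to its doc comment, Main parses the command-line flags and the Go source files, type-checks the parsed package, compiles the functions to machine code, and finally writes the compiled package definition to disk.

Main is far too long to show in full, so only the parts of interest are kept below.

parseFiles is the most important step: it reads the files, parses their syntax, and converts the result into Nodes, which are appended to xtop. The phases that follow then check and process the contents of xtop.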
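
To get a feel for what "parsing a file into top-level declarations" means, here is a minimal sketch that uses the standard library's go/parser and go/ast packages. This is an assumption made for illustration: the compiler has its own internal parser and Node type, so the code below only mirrors the idea and is not how parseFiles is implemented. The topLevelKinds helper and the demo source are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a small demo file containing one declaration of each kind.
const src = `package demo

import "fmt"

const answer = 42

type point struct{ x, y int }

var origin point

func hello() { fmt.Println("hi", answer, origin) }
`

// topLevelKinds parses src and returns one label per top-level
// declaration, in source order.
func topLevelKinds(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	var kinds []string
	for _, decl := range file.Decls {
		switch d := decl.(type) {
		case *ast.GenDecl:
			// Tok is one of import, const, type or var.
			kinds = append(kinds, d.Tok.String())
		case *ast.FuncDecl:
			kinds = append(kinds, "func "+d.Name.Name)
		}
	}
	return kinds, nil
}

func main() {
	kinds, err := topLevelKinds(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, k := range kinds {
		fmt.Println(k)
	}
}
```

The resulting list plays a role loosely comparable to xtop: a flat sequence of top-level declarations that later passes can walk.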

// Main parses flags and Go source files specified in the command-line
// arguments, type-checks the parsed Go package, compiles functions to machine
// code, and finally writes the compiled package definition to disk.
func Main(archInit func(*Arch)) {
    ...
    // pseudo-package, for scoping
    builtinpkg = types.NewPkg("go.builtin", "") // TODO(gri) name this package go.builtin?
    builtinpkg.Prefix = "go.builtin"            // not go%2ebuiltin

    // pseudo-package, accessed by import "unsafe"
    unsafepkg = types.NewPkg("unsafe", "unsafe")
    ...
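    // The start of Main declares and parses a long list of flags; these correspond to the usage message shown earlier.
    // It also shows some less common ways of using flags, such as binding a flag to a func; worth a look if you are interested.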
    flag.BoolVar(&compiling_runtime, "+", false, "compiling runtime")
    flag.BoolVar(&compiling_std, "std", false, "compiling standard library")
    ...
    lines := parseFiles(flag.Args()) // parse the files: wrap import, var, const, type and func declarations into Nodes and append them to xtop
    ...

    // Process top-level declarations in phases.

    // Phase 1: const, type, and names and types of funcs.
    //   This will gather all the information about types
    //   and methods but doesn't depend on any of it.
    //
    //   We also defer type alias declarations until phase 2
    //   to avoid cycles like #18640.
    //   TODO(gri) Remove this again once we have a fix for #25838.

    // Don't use range--typecheck can add closures to xtop.
    timings.Start("fe", "typecheck", "top1")
    for i := 0; i < len(xtop); i++ {
        n := xtop[i]
        if op := n.Op; op != ODCL && op != OAS && op != OAS2 && (op != ODCLTYPE || !n.Left.Name.Param.Alias) {
            xtop[i] = typecheck(n, ctxStmt)
        }
    }

    // Phase 2: Variable assignments.
    //   To check interface assignments, depends on phase 1.

    // Don't use range--typecheck can add closures to xtop.
    timings.Start("fe", "typecheck", "top2")
    for i := 0; i < len(xtop); i++ {
        n := xtop[i]
        if op := n.Op; op == ODCL || op == OAS || op == OAS2 || op == ODCLTYPE && n.Left.Name.Param.Alias {
            xtop[i] = typecheck(n, ctxStmt)
        }
    }

    // Phase 3: Type check function bodies.
    // Don't use range--typecheck can add closures to xtop.
    timings.Start("fe", "typecheck", "func")
    var fcount int64
    for i := 0; i < len(xtop); i++ {
        n := xtop[i]
        if op := n.Op; op == ODCLFUNC || op == OCLOSURE {
            Curfn = n
            decldepth = 1
            saveerrors()
            typecheckslice(Curfn.Nbody.Slice(), ctxStmt)
            checkreturn(Curfn)
            if nerrors != 0 {
                Curfn.Nbody.Set(nil) // type errors; do not compile
            }
            // Now that we've checked whether n terminates,
            // we can eliminate some obviously dead code.
            deadcode(Curfn)
            fcount++
        }
    }
    // With all types checked, it's now safe to verify map keys. One single
    // check past phase 9 isn't sufficient, as we may exit with other errors
    // before then, thus skipping map key errors.
    checkMapKeys()
    timings.AddEvent(fcount, "funcs")

    if nsavederrors+nerrors != 0 {
        errorexit()
    }

    // Phase 4: Decide how to capture closed variables.
    // This needs to run before escape analysis,
    // because variables captured by value do not escape.
    timings.Start("fe", "capturevars")
    for _, n := range xtop {
        if n.Op == ODCLFUNC && n.Func.Closure != nil {
            Curfn = n
            capturevars(n)
        }
    }
    capturevarscomplete = true

    Curfn = nil

    if nsavederrors+nerrors != 0 {
        errorexit()
    }

    // Phase 5: Inlining
    timings.Start("fe", "inlining")
    if Debug_typecheckinl != 0 {
        // Typecheck imported function bodies if debug['l'] > 1,
        // otherwise lazily when used or re-exported.
        for _, n := range importlist {
            if n.Func.Inl != nil {
                saveerrors()
                typecheckinl(n)
            }
        }

        if nsavederrors+nerrors != 0 {
            errorexit()
        }
    }

    if Debug['l'] != 0 {
        // Find functions that can be inlined and clone them before walk expands them.
        visitBottomUp(xtop, func(list []*Node, recursive bool) {
            for _, n := range list {
                if !recursive {
                    caninl(n)
                } else {
                    if Debug['m'] > 1 {
                        fmt.Printf("%v: cannot inline %v: recursive\n", n.Line(), n.Func.Nname)
                    }
                }
                inlcalls(n)
            }
        })
    }

    // Phase 6: Escape analysis.
    // Required for moving heap allocations onto stack,
    // which in turn is required by the closure implementation,
    // which stores the addresses of stack variables into the closure.
    // If the closure does not escape, it needs to be on the stack
    // or else the stack copier will not update it.
    // Large values are also moved off stack in escape analysis;
    // because large values may contain pointers, it must happen early.
    timings.Start("fe", "escapes")
    escapes(xtop)

    // Collect information for go:nowritebarrierrec
    // checking. This must happen before transformclosure.
    // We'll do the final check after write barriers are
    // inserted.
    if compiling_runtime {
        nowritebarrierrecCheck = newNowritebarrierrecChecker()
    }

    // Phase 7: Transform closure bodies to properly reference captured variables.
    // This needs to happen before walk, because closures must be transformed
    // before walk reaches a call of a closure.
    timings.Start("fe", "xclosures")
    for _, n := range xtop {
        if n.Op == ODCLFUNC && n.Func.Closure != nil {
            Curfn = n
            transformclosure(n)
        }
    }

    // Prepare for SSA compilation.
    // This must be before peekitabs, because peekitabs
    // can trigger function compilation.
    initssaconfig()

    // Just before compilation, compile itabs found on
    // the right side of OCONVIFACE so that methods
    // can be de-virtualized during compilation.
    Curfn = nil
    peekitabs()

    // Phase 8: Compile top level functions.
    // Don't use range--walk can add functions to xtop.
    timings.Start("be", "compilefuncs")
    fcount = 0
    for i := 0; i < len(xtop); i++ {
        n := xtop[i]
        if n.Op == ODCLFUNC {
            funccompile(n)
            fcount++
        }
    }
    timings.AddEvent(fcount, "funcs")

    if nsavederrors+nerrors == 0 {
        fninit(xtop)
    }

    compileFunctions()

    if nowritebarrierrecCheck != nil {
        // Write barriers are now known. Check the
        // call graph.
        nowritebarrierrecCheck.check()
        nowritebarrierrecCheck = nil
    }

    // Finalize DWARF inline routine DIEs, then explicitly turn off
    // DWARF inlining gen so as to avoid problems with generated
    // method wrappers.
    if Ctxt.DwFixups != nil {
        Ctxt.DwFixups.Finalize(myimportpath, Debug_gendwarfinl != 0)
        Ctxt.DwFixups = nil
        genDwarfInline = 0
    }

    // Phase 9: Check external declarations.
    timings.Start("be", "externaldcls")
    for i, n := range externdcl {
        if n.Op == ONAME {
            externdcl[i] = typecheck(externdcl[i], ctxExpr)
        }
    }
    // Check the map keys again, since we typechecked the external
    // declarations.
    checkMapKeys()

    if nerrors+nsavederrors != 0 {
        errorexit()
    }

    // Write object data to disk.
    timings.Start("be", "dumpobj")
    dumpdata()
    Ctxt.NumberSyms(false)
    dumpobj()
    if asmhdr != "" {
        dumpasmhdr()
    }

    // Check whether any of the functions we have compiled have gigantic stack frames.
    sort.Slice(largeStackFrames, func(i, j int) bool {
        return largeStackFrames[i].pos.Before(largeStackFrames[j].pos)
    })
    for _, large := range largeStackFrames {
        if large.callee != 0 {
            yyerrorl(large.pos, "stack frame too large (>1GB): %d MB locals + %d MB args + %d MB callee", large.locals>>20, large.args>>20, large.callee>>20)
        } else {
            yyerrorl(large.pos, "stack frame too large (>1GB): %d MB locals + %d MB args", large.locals>>20, large.args>>20)
        }
    }

    if len(compilequeue) != 0 {
        Fatalf("%d uncompiled functions", len(compilequeue))
    }

    logopt.FlushLoggedOpts(Ctxt, myimportpath)

    if nerrors+nsavederrors != 0 {
        errorexit()
    }

    flusherrors()
    timings.Stop()

    if benchfile != "" {
        if err := writebench(benchfile); err != nil {
            log.Fatalf("cannot write benchmark data: %v", err)
        }
    }
}
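
Several loops in Main carry the comment "Don't use range" because the work list can grow while it is being processed. The sketch below shows the difference in isolation; the string items and the expand callback are hypothetical stand-ins for Nodes and for typecheck discovering a closure.

```go
package main

import "fmt"

// visitRange iterates with range. The range expression is evaluated once,
// so items appended during the loop are never visited.
func visitRange(work []string, expand func(string) []string) int {
	visited := 0
	for _, item := range work {
		work = append(work, expand(item)...)
		visited++
	}
	return visited
}

// visitIndex re-reads len(work) on every iteration, so items appended
// during the loop are visited as well.
func visitIndex(work []string, expand func(string) []string) int {
	visited := 0
	for i := 0; i < len(work); i++ {
		work = append(work, expand(work[i])...)
		visited++
	}
	return visited
}

func main() {
	// expand pretends that processing "f" discovers one closure.
	expand := func(s string) []string {
		if s == "f" {
			return []string{"f.closure"}
		}
		return nil
	}
	fmt.Println(visitRange([]string{"f", "g"}, expand)) // 2
	fmt.Println(visitIndex([]string{"f", "g"}, expand)) // 3
}
```

The loops in Phases 4 and 7 do use range, which fits this picture as long as those phases do not append to xtop.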

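Broadly, Main breaks down into the following parts:

  1. Initialize the processor architecture and associate it with the context
  2. Build the pseudo-packages so that the funcs in those packages can be used
  3. Declare and parse the flags; this is the core of command-line argument handling
  4. Check the arguments to confirm that compilation can proceed
  5. Parse the files, convert them to syntax trees, build Nodes, and store them in xtop (parsing is done concurrently)
  6. Process the data in xtop step by step
  7. Write the compiled output to disk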

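The processing of xtop can be divided into the following phases.

  1. Type-check const and type declarations and the names and types of funcs. Variable declarations, assignments, and type aliases are not handled in this phase.
  2. Building on phase 1, type-check variable declarations and assignments, along with type aliases.
  3. Type-check func bodies.
    Once these three phases are done, map keys are checked.
  4. Decide how to capture closed-over variables.
  5. Inlining.
  6. Escape analysis.
  7. Transform closure bodies so that they reference captured variables correctly.
    Prepare for SSA compilation.
  8. Compile top-level functions.
    Calls between funcs are handled here.
  9. Check external declarations.
    Check map keys again.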
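
Phases 1 and 2 walk the same list twice with complementary conditions, so that variable assignments and type aliases are checked only after everything else. A minimal sketch of that two-pass ordering follows; the op constants and the node struct are hypothetical simplifications, not the compiler's ODCL, OAS or Node definitions.

```go
package main

import "fmt"

// op is a hypothetical, simplified declaration kind.
type op int

const (
	opConst op = iota
	opType
	opAlias
	opVar
	opAssign
	opFunc
)

// node is a hypothetical stand-in for a top-level Node in xtop.
type node struct {
	op   op
	name string
}

// deferred reports whether a node is left for the second pass.
func deferred(n node) bool {
	return n.op == opVar || n.op == opAssign || n.op == opAlias
}

// checkOrder returns the names in the order a two-pass check visits them.
func checkOrder(top []node) []string {
	var order []string
	// Pass 1: everything except variables, assignments and aliases.
	for _, n := range top {
		if !deferred(n) {
			order = append(order, n.name)
		}
	}
	// Pass 2: the nodes skipped by pass 1.
	for _, n := range top {
		if deferred(n) {
			order = append(order, n.name)
		}
	}
	return order
}

func main() {
	top := []node{
		{opVar, "v"},
		{opType, "T"},
		{opFunc, "f"},
		{opAlias, "A"},
		{opConst, "c"},
	}
	fmt.Println(checkOrder(top)) // [T f c v A]
}
```

Because the second condition is the exact negation of the first, every node is visited once, and source order is preserved within each pass.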

That is the rough outline of compilation; later articles will look at the details.

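Summary

This article took a first look at compile through its source code and comments; later articles will dig into the details of each step. In short, compile works as follows:

parse files -> parse syntax -> type-check -> compile -> write output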
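
The first stages of this pipeline can be tried out with the standard library. The sketch below is an approximation under stated assumptions: it uses go/parser and go/types, not the compiler's internal packages, and the check helper is hypothetical. It covers only parsing and type checking, and because no importer is configured, the input must not import anything.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// check parses and type-checks a single import-free source file.
func check(src string) error {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return fmt.Errorf("parse: %w", err)
	}
	// The zero Config has no Importer, so src must not import packages.
	var conf types.Config
	if _, err := conf.Check("demo", fset, []*ast.File{file}, nil); err != nil {
		return fmt.Errorf("typecheck: %w", err)
	}
	return nil
}

func main() {
	good := "package demo\n\nfunc add(a, b int) int { return a + b }\n"
	bad := "package demo\n\nfunc add(a int, b string) int { return a + b }\n"

	fmt.Println("good:", check(good))
	fmt.Println("bad: ", check(bad))
}
```

The second input parses without problems but fails type checking because it adds an int to a string, which shows that the two stages catch different classes of errors.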
