Go Concurrency from the Source Code [1]: M, P, and G Definitions, State Transitions, and Some Odds and Ends


Reposted from my own Zhihu article. Likes and bookmarks are appreciated!


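Starting with this article, I will try to explain Go concurrency from the source code. This source walkthrough is considerably harder than my earlier Python source walkthrough. I am no expert, so corrections are welcome.
One more thing: please read my earlier articles first for the basics of coroutines and the Go concurrency model.
Go Concurrency Principles and Mechanisms [1]
Go Concurrency Principles and Mechanisms [2]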

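1. Go program entry point: m0 and g0

Most of the scheduler source for Go's concurrency model lives under the /runtime/ directory. The directory holds many files, both .s assembly files and .go Go source files.

Execution starts in rt0_linux_arm64.s, the entry point that the toolchain wires into the binary and that eventually brings up the Go scheduler. The second half of the file name identifies the operating system and architecture.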


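Most of these files do initialization work. Here I chose to study the linux_arm64 version. I know a little assembly, though not enough to read production-grade assembly comfortably; fortunately the code is commented.

(1) A quick look at a few pieces of code

The startup code first sets up an empty g0. It does not run user code; it is used for scheduling goroutines across Ms and Ps.

Next, a new thread is created to do the runtime initialization and return. This is m0: one thread corresponds to one M, and a Go process always has at least one thread whether or not any goroutine is created. (In rt0_linux_arm64.s this thread-creation code sits on the path used when the runtime is loaded as a library; in an ordinary executable the process's initial thread plays the role of m0.)

The first instruction, MOVD _cgo_sys_thread_create(SB), R4, loads the address of the thread-creation function into R4. It is not itself a jump, and R4 is not an argument: the function is called through R4 afterwards, once R4 has been checked to be non-zero.

Finally the code calls runtime.rt0_go (for arm64 it lives in runtime/asm_arm64.s), which initializes g0 and m0 and makes them reference each other.

The assembly above calls many functions in os_linux_arm64.go and proc.go. The Go scheduler itself lives in proc.go, and that is where our focus lies, so I will not go further here.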

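(2) So what are the m0 and g0 created above for?

To summarize:

g0 and m0 are two global variables in proc.go. m0 represents the initial thread once the process has started, and g0 represents that initial thread's stack. The first thread set up in the assembly above is m0; because it is a global variable it needs no heap allocation and lives outside Go's own memory allocator. m0's g0 is also a global variable, and runtime.rt0_go, mentioned above, sets many of its fields.

PS: in fact every M has its own g0.

Every M created later also has its own g0, which is responsible for scheduling rather than for running functions from the user program.
Each M can run many goroutines, and the M struct holds one rather special goroutine called g0. What makes g0 special is that it is the goroutine with the scheduling stack, which I will call "the M's g0 stack" below. Whenever Go executes scheduling code, it does so on the M's g0 stack: a G that needs to run scheduling code does not run it on its own stack but first switches to the M's g0 stack.
The M's g0 stack is special. g0 is allocated differently from an ordinary goroutine: it is created when the M is created, with a comparatively large stack that can be assumed big enough never to need segmented-stack growth. An ordinary goroutine is created in runtime.newproc (explained later) with a very small initial stack (4 KB according to the reference below; current releases start at 2 KB) that grows on demand. In addition, the M's g0 stack is also the stack of the OS thread that the M corresponds to.

Reference: https://www.w3cschool.cn/go_internals/go_internals-419t283o.html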


Before going on, here is an overview of the whole model. The figure comes from the article linked below.

http://lessisbetter.site/2019/03/26/golang-scheduler-2-macro-view/

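2. M, P, G

The previous article said that the Go scheduler has three main data structures.

  1. M: an operating-system thread, a native thread managed by the OS.
  2. G: a goroutine, a lightweight thread managed by the Go runtime itself; the struct holds some instruction and scheduling information.
  3. P: the scheduling context, which can be viewed as a scheduler that runs on an M.

Their data structures are all defined in runtime2.go:

/src/runtime/runtime2.go (github.com)

[Below I show the definitions and summarize what they abstract; the focus, though, is on the state definitions. A line-by-line analysis of the source will come when I explain the concrete scheduling rules.]

(1) G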

type g struct {
    // Stack parameters.
    // stack describes the actual stack memory: [stack.lo, stack.hi).
    // stackguard0 is the stack pointer compared in the Go stack growth prologue.
    // It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption.
    // stackguard1 is the stack pointer compared in the C stack growth prologue.
    // It is stack.lo+StackGuard on g0 and gsignal stacks.
    // It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash).
    stack       stack   // offset known to runtime/cgo
    stackguard0 uintptr // offset known to liblink
    stackguard1 uintptr // offset known to liblink

    _panic         *_panic // innermost panic - offset known to liblink
    _defer         *_defer // innermost defer
    m              *m      // current m; offset known to arm liblink
    sched          gobuf
    syscallsp      uintptr        // if status==Gsyscall, syscallsp = sched.sp to use during gc
    syscallpc      uintptr        // if status==Gsyscall, syscallpc = sched.pc to use during gc
    stktopsp       uintptr        // expected sp at top of stack, to check in traceback
    param          unsafe.Pointer // passed parameter on wakeup
    atomicstatus   uint32
    stackLock      uint32 // sigprof/scang lock; TODO: fold in to atomicstatus
    goid           int64
    schedlink      guintptr
    waitsince      int64      // approx time when the g become blocked
    waitreason     waitReason // if status==Gwaiting
    preempt        bool       // preemption signal, duplicates stackguard0 = stackpreempt
    paniconfault   bool       // panic (instead of crash) on unexpected fault address
    preemptscan    bool       // preempted g does scan for gc
    gcscandone     bool       // g has scanned stack; protected by _Gscan bit in status
    gcscanvalid    bool       // false at start of gc cycle, true if G has not run since last scan; TODO: remove?
    throwsplit     bool       // must not split stack
    raceignore     int8       // ignore race detection events
    sysblocktraced bool       // StartTrace has emitted EvGoInSyscall about this goroutine
    sysexitticks   int64      // cputicks when syscall has returned (for tracing)
    traceseq       uint64     // trace event sequencer
    tracelastp     puintptr   // last P emitted an event for this goroutine
    lockedm        muintptr
    sig            uint32
    writebuf       []byte
    sigcode0       uintptr
    sigcode1       uintptr
    sigpc          uintptr
    gopc           uintptr         // pc of go statement that created this goroutine
    ancestors      *[]ancestorInfo // ancestor information goroutine(s) that created this goroutine (only used if debug.tracebackancestors)
    startpc        uintptr         // pc of goroutine function
    racectx        uintptr
    waiting        *sudog         // sudog structures this g is waiting on (that have a valid elem ptr); in lock order
    cgoCtxt        []uintptr      // cgo traceback context
    labels         unsafe.Pointer // profiler labels
    timer          *timer         // cached timer for time.Sleep
    selectDone     uint32         // are we participating in a select and did someone win the race?

    // Per-G GC state

    // gcAssistBytes is this G's GC assist credit in terms of
    // bytes allocated. If this is positive, then the G has credit
    // to allocate gcAssistBytes bytes without assisting. If this
    // is negative, then the G must correct this by performing
    // scan work. We track this in bytes to make it fast to update
    // and check for debt in the malloc hot path. The assist ratio
    // determines how this corresponds to scan work debt.
    gcAssistBytes int64
}

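G defines one particularly important field, atomicstatus, which holds the current state of this G.

The main states are _Gidle, _Grunnable, _Grunning, _Gsyscall, and _Gwaiting.

_Gidle is defined as iota. The file builtin.go documents iota as the untyped integer ordinal of the current const specification; it starts at 0: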

// iota is a predeclared identifier representing the untyped integer ordinal
// number of the current const specification in a (usually parenthesized)
// const declaration. It is zero-indexed.
const iota = 0 // Untyped int
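
To make the numbering concrete, here is a minimal sketch of how iota assigns consecutive values inside one const block. The names stateIdle, stateRunnable, stateRunning, and stateName are made up for this illustration; they are not runtime identifiers.

```go
package main

import "fmt"

// Hypothetical state names; they only mirror the way the runtime
// numbers _Gidle, _Grunnable, ... with iota.
const (
	stateIdle     = iota // 0: iota starts at zero in each const block
	stateRunnable        // 1: the expression "iota" is repeated implicitly
	stateRunning         // 2
)

// stateName maps a state value to a readable name.
func stateName(s int) string {
	switch s {
	case stateIdle:
		return "idle"
	case stateRunnable:
		return "runnable"
	case stateRunning:
		return "running"
	}
	return "unknown"
}

func main() {
	fmt.Println(stateIdle, stateRunnable, stateRunning) // prints "0 1 2"
	fmt.Println(stateName(stateRunning))                // prints "running"
}
```

The declaration in builtin.go exists for documentation; the compiler itself supplies the per-line value of iota.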

The remaining G states are declared in the source below (the original post also summarized them in a diagram):

const (
    // _Gidle means this goroutine was just allocated and has not
    // yet been initialized.
    _Gidle = iota // 0

    // _Grunnable means this goroutine is on a run queue. It is
    // not currently executing user code. The stack is not owned.
    _Grunnable // 1

    // _Grunning means this goroutine may execute user code. The
    // stack is owned by this goroutine. It is not on a run queue.
    // It is assigned an M and a P.
    _Grunning // 2

    // _Gsyscall means this goroutine is executing a system call.
    // It is not executing user code. The stack is owned by this
    // goroutine. It is not on a run queue. It is assigned an M.
    _Gsyscall // 3

    // _Gwaiting means this goroutine is blocked in the runtime.
    // It is not executing user code. It is not on a run queue,
    // but should be recorded somewhere (e.g., a channel wait
    // queue) so it can be ready()d when necessary. The stack is
    // not owned *except* that a channel operation may read or
    // write parts of the stack under the appropriate channel
    // lock. Otherwise, it is not safe to access the stack after a
    // goroutine enters _Gwaiting (e.g., it may get moved).
    _Gwaiting // 4

    // _Gmoribund_unused is currently unused, but hardcoded in gdb
    // scripts.
    _Gmoribund_unused // 5

    // _Gdead means this goroutine is currently unused. It may be
    // just exited, on a free list, or just being initialized. It
    // is not executing user code. It may or may not have a stack
    // allocated. The G and its stack (if any) are owned by the M
    // that is exiting the G or that obtained the G from the free
    // list.
    _Gdead // 6

    // _Genqueue_unused is currently unused.
    _Genqueue_unused // 7

    // _Gcopystack means this goroutine's stack is being moved. It
    // is not executing user code and is not on a run queue. The
    // stack is owned by the goroutine that put it in _Gcopystack.
    _Gcopystack // 8

    // _Gscan combined with one of the above states other than
    // _Grunning indicates that GC is scanning the stack. The
    // goroutine is not executing user code and the stack is owned
    // by the goroutine that set the _Gscan bit.
    //
    // _Gscanrunning is different: it is used to briefly block
    // state transitions while GC signals the G to scan its own
    // stack. This is otherwise like _Grunning.
    //
    // atomicstatus&~Gscan gives the state the goroutine will
    // return to when the scan completes.
    _Gscan         = 0x1000
    _Gscanrunnable = _Gscan + _Grunnable // 0x1001
    _Gscanrunning  = _Gscan + _Grunning  // 0x1002
    _Gscansyscall  = _Gscan + _Gsyscall  // 0x1003
    _Gscanwaiting  = _Gscan + _Gwaiting  // 0x1004
)

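_Gscan is combined with one of the above states other than _Grunning to indicate that the GC is scanning the stack. State transitions always involve acquiring or releasing ownership of the stack, and the _Gscan bit is set before the stack is taken; a _GscanXXX state means a scan is in progress, so the bit works much like a mutex.

While the bit is set, the goroutine is not executing user code and its stack is owned by the goroutine that set the _Gscan bit. As noted above, _Gscanrunning is different: it is used to briefly block state transitions while the GC signals the G to scan its own stack. In every other respect it is the same as _Grunning.

atomicstatus&~Gscan (written atomicstatus &^ _Gscan in Go: AND atomicstatus with the bitwise complement of _Gscan, which clears the 0x1000 bit and keeps the low bits) gives the state the goroutine will return to when the scan completes.

So besides recording the G's state, atomicstatus behaves like a lock on the goroutine's stack, and thereby also controls whether user code may run.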
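
The bit manipulation described above can be sketched as follows. The constant values are copied from the const block; the names gRunnable, gWaiting, gScan and the helpers isScanning and baseState are made up for this sketch and do not exist in the runtime.

```go
package main

import "fmt"

// Values copied from the runtime's const block; the names drop the
// leading underscore only to keep this sketch separate from the runtime.
const (
	gRunnable = 1
	gWaiting  = 4
	gScan     = 0x1000
)

// isScanning reports whether the scan bit is set in status.
func isScanning(status uint32) bool { return status&gScan != 0 }

// baseState clears the scan bit, giving the state the goroutine
// returns to once the scan completes (atomicstatus &^ _Gscan).
func baseState(status uint32) uint32 { return status &^ gScan }

func main() {
	status := uint32(gScan + gWaiting) // same value as _Gscanwaiting
	// prints "0x1004 scanning=true base=4"
	fmt.Printf("%#x scanning=%v base=%d\n", status, isScanning(status), baseState(status))
}
```

Because the scan flag occupies a bit that no plain state uses, setting and clearing it never disturbs the underlying state value.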

(2) P

type p struct {
    id          int32
    status      uint32 // one of pidle/prunning/... (the P's current status)
    link        puintptr
    schedtick   uint32     // incremented on every scheduler call
    syscalltick uint32     // incremented on every system call
    sysmontick  sysmontick // last tick observed by sysmon
    m           muintptr   // back-link to associated m (nil if idle)
    mcache      *mcache
    raceprocctx uintptr

    deferpool    [5][]*_defer // pool of available defer structs of different sizes (see panic.go)
    deferpoolbuf [5][32]*_defer

    // Cache of goroutine ids, amortizes accesses to runtime·sched.goidgen.
    goidcache    uint64
    goidcacheend uint64

    // Queue of runnable goroutines. Accessed without lock.
    runqhead uint32
    runqtail uint32
    runq     [256]guintptr
    // runnext, if non-nil, is a runnable G that was ready'd by
    // the current G and should be run next instead of what's in
    // runq if there's time remaining in the running G's time
    // slice. It will inherit the time left in the current time
    // slice. If a set of goroutines is locked in a
    // communicate-and-wait pattern, this schedules that set as a
    // unit and eliminates the (potentially large) scheduling
    // latency that otherwise arises from adding the ready'd
    // goroutines to the end of the run queue.
    runnext guintptr

    // Available G's (status == Gdead)
    gFree struct {
        gList
        n int32
    }

    sudogcache []*sudog
    sudogbuf   [128]*sudog

    tracebuf traceBufPtr

    // traceSweep indicates the sweep events should be traced.
    // This is used to defer the sweep start event until a span
    // has actually been swept.
    traceSweep bool
    // traceSwept and traceReclaimed track the number of bytes
    // swept and reclaimed by sweeping in the current sweep loop.
    traceSwept, traceReclaimed uintptr

    palloc persistentAlloc // per-P to avoid mutex

    _ uint32 // Alignment for atomic fields below

    // Per-P GC state
    gcAssistTime         int64    // Nanoseconds in assistAlloc
    gcFractionalMarkTime int64    // Nanoseconds in fractional mark worker (atomic)
    gcBgMarkWorker       guintptr // (atomic)
    gcMarkWorkerMode     gcMarkWorkerMode

    // gcMarkWorkerStartTime is the nanotime() at which this mark
    // worker started.
    gcMarkWorkerStartTime int64

    // gcw is this P's GC work buffer cache. The work buffer is
    // filled by write barriers, drained by mutator assists, and
    // disposed on certain GC state transitions.
    gcw gcWork

    // wbBuf is this P's GC write barrier buffer.
    //
    // TODO: Consider caching this in the running G.
    wbBuf wbBuf

    runSafePointFn uint32 // if 1, run sched.safePointFn at next safe point

    pad cpu.CacheLinePad
}

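As you can see, P defines fields that describe the scheduling context's own bookkeeping (id, status, schedtick, syscalltick), the M associated with the P (m, a muintptr, i.e. a pointer to an m), and the Gs associated with the P (the run queue), as well as pointers and cache fields related to stacks and to the underlying thread.

PS: the run queue attached to a P is called the local list; there is also a global list. See the overview figure referenced at the start of section 2.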
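
As a rough illustration of the local list and the global list, here is a simplified sketch of a fixed-size ring buffer indexed by head and tail counters, with overflow into a global queue. Everything here is an assumption made for illustration: the types g and p are stripped-down stand-ins, globalQ is a hypothetical plain slice, and the runtime's real runqput and runqget are lock-free and differ in detail.

```go
package main

import "fmt"

const localCap = 256 // matches len(p.runq) in the struct above

// g and p are stripped-down stand-ins for the runtime types.
type g struct{ id int }

type p struct {
	head, tail uint32 // tail-head is the number of queued Gs
	runq       [localCap]*g
}

// globalQ stands in for the scheduler-wide run queue (no locking here).
var globalQ []*g

// runqput appends gp to the local queue. If the queue is full it first
// moves the older half of the local Gs to the global queue.
func (pp *p) runqput(gp *g) {
	if pp.tail-pp.head == localCap {
		for i := 0; i < localCap/2; i++ {
			globalQ = append(globalQ, pp.runq[pp.head%localCap])
			pp.head++
		}
	}
	pp.runq[pp.tail%localCap] = gp
	pp.tail++
}

// runqget pops the oldest local G, or returns nil if the queue is empty.
func (pp *p) runqget() *g {
	if pp.head == pp.tail {
		return nil
	}
	gp := pp.runq[pp.head%localCap]
	pp.head++
	return gp
}

func main() {
	pp := &p{}
	for i := 0; i < 300; i++ {
		pp.runqput(&g{id: i})
	}
	local := pp.tail - pp.head
	next := pp.runqget()
	fmt.Println("local:", local, "global:", len(globalQ), "next id:", next.id)
}
```

The head and tail counters only ever increase; taking them modulo the array length is what turns the array into a ring.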

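Among these, status changes frequently during scheduling, so let's look at it. The values are Pidle, Prunning, Psyscall, Pgcstop, and Pdead (I omit the leading underscores).

Pidle means:

  • An idle P: it is not being used to run user code or the scheduler. Typically it is on the idle P list and available to the scheduler. Its run queue is empty.
  • It is owned by the idle list, or by whatever is currently transitioning its state.

Prunning means:

  • Running: the P is being used to run user code or the scheduler.
  • It is owned by the M it is associated with.
  • Only that M may change its status: to Pidle when there is no more work, to Psyscall when entering a system call, or to Pgcstop to halt for the GC.
  • The M may also hand ownership of the P directly to another M.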

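Psyscall means:

  • System-call state: the P is not running user code because the G's code has gone into a system call; in effect the M is tied directly to the G.
  • The P still has affinity to that M but is not owned by it; it is in limbo and may be stolen by another M.
  • It is similar to Pidle but not the same: the P is in a lightweight transitional state and keeps its affinity to the M.
  • Leaving Psyscall must be done with a CAS: either the M returning from the system call retakes the P, or another M steals it. An M that loses the race has to obtain a P elsewhere.
  • PS: CAS (compare-and-swap) updates a shared variable only if it still holds the expected value, replacing a lock with an atomic compare followed by a swap and so avoiding the cost of acquiring and releasing a lock. See the link and the 1980s paper below (the paper is quite good; I even presented it in my operating systems class):

The Beauty of Go Concurrent Programming: CAS Operations (Tencent Cloud Community)
Paper: 1981-tods-kung-robinson.pdf

  • Also beware the A->B->A (ABA) hazard, where the state changes from A to B and back to A: even if an M successfully CASes its original P back to Prunning, the P may have been used by another M in the meantime.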
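
The race for a P in Psyscall can be sketched with the standard sync/atomic package. This is only an illustration under stated assumptions: the type p holds nothing but a status word, and retake and exitSyscallFast are simplified stand-ins named after the idea, not the runtime's actual functions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Status values in the same order as _Pidle, _Prunning, _Psyscall.
const (
	pIdle uint32 = iota
	pRunning
	pSyscall
)

// p is a stand-in holding only the status word.
type p struct{ status uint32 }

// retake models another M stealing a P whose M is stuck in a syscall.
func retake(pp *p) bool {
	return atomic.CompareAndSwapUint32(&pp.status, pSyscall, pIdle)
}

// exitSyscallFast models the returning M trying to get its old P back.
func exitSyscallFast(pp *p) bool {
	return atomic.CompareAndSwapUint32(&pp.status, pSyscall, pRunning)
}

func main() {
	pp := &p{status: pSyscall}
	stolen := retake(pp)        // succeeds: status was pSyscall
	back := exitSyscallFast(pp) // fails: status is now pIdle
	fmt.Println(stolen, back)   // prints "true false"
}
```

Exactly one of the two CAS calls can win for a given visit to the syscall state, which is what lets the transition work without a lock. A successful CAS still says nothing about what happened to the P in between, which is the ABA hazard mentioned above.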

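Pgcstop means:

  • Stopped: the P is halted for stop-the-world (STW) and is owned by the M that stopped the world, typically on behalf of the garbage collector.
  • The P keeps its run queue, and startTheWorld restarts the scheduler on Ps whose run queues are non-empty.

Pdead means the P is no longer used (GOMAXPROCS shrank).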

const (
    // P status

    // _Pidle means a P is not being used to run user code or the
    // scheduler. Typically, it's on the idle P list and available
    // to the scheduler, but it may just be transitioning between
    // other states.
    //
    // The P is owned by the idle list or by whatever is
    // transitioning its state. Its run queue is empty.
    _Pidle = iota

    // _Prunning means a P is owned by an M and is being used to
    // run user code or the scheduler. Only the M that owns this P
    // is allowed to change the P's status from _Prunning. The M
    // may transition the P to _Pidle (if it has no more work to
    // do), _Psyscall (when entering a syscall), or _Pgcstop (to
    // halt for the GC). The M may also hand ownership of the P
    // off directly to another M (e.g., to schedule a locked G).
    _Prunning

    // _Psyscall means a P is not running user code. It has
    // affinity to an M in a syscall but is not owned by it and
    // may be stolen by another M. This is similar to _Pidle but
    // uses lightweight transitions and maintains M affinity.
    //
    // Leaving _Psyscall must be done with a CAS, either to steal
    // or retake the P. Note that there's an ABA hazard: even if
    // an M successfully CASes its original P back to _Prunning
    // after a syscall, it must understand the P may have been
    // used by another M in the interim.
    _Psyscall

    // _Pgcstop means a P is halted for STW and owned by the M
    // that stopped the world. The M that stopped the world
    // continues to use its P, even in _Pgcstop. Transitioning
    // from _Prunning to _Pgcstop causes an M to release its P and
    // park.
    //
    // The P retains its run queue and startTheWorld will restart
    // the scheduler on Ps with non-empty run queues.
    _Pgcstop

    // _Pdead means a P is no longer used (GOMAXPROCS shrank). We
    // reuse Ps if GOMAXPROCS increases. A dead P is mostly
    // stripped of its resources, though a few things remain
    // (e.g., trace buffers).
    _Pdead
)
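
The _Pdead comment mentions GOMAXPROCS shrinking and growing. From user code the number of Ps can be changed with runtime.GOMAXPROCS, as in the small sketch below; shrinkAndRestore is a helper made up for this example, and the P states themselves are not observable through this API.

```go
package main

import (
	"fmt"
	"runtime"
)

// shrinkAndRestore sets the number of Ps to n, reads the setting back,
// and then restores the previous value.
func shrinkAndRestore(n int) (before, during int) {
	before = runtime.GOMAXPROCS(n) // returns the previous setting
	during = runtime.GOMAXPROCS(0) // 0 only queries; it changes nothing
	runtime.GOMAXPROCS(before)
	return before, during
}

func main() {
	before, during := shrinkAndRestore(1)
	fmt.Println("Ps before:", before, "Ps while shrunk:", during)
}
```

According to the source comment above, Ps dropped by a shrink are kept as dead Ps and reused if GOMAXPROCS increases again.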

(3) M

type m struct {
    g0      *g     // goroutine with scheduling stack
    morebuf gobuf  // gobuf arg to morestack
    divmod  uint32 // div/mod denominator for arm - known to liblink

    // Fields not known to debuggers.
    procid        uint64       // for debuggers, but offset not hard-coded
    gsignal       *g           // signal-handling g
    goSigStack    gsignalStack // Go-allocated signal handling stack
    sigmask       sigset       // storage for saved signal mask
    tls           [6]uintptr   // thread-local storage (for x86 extern register)
    mstartfn      func()
    curg          *g       // current running goroutine
    caughtsig     guintptr // goroutine running during fatal signal
    p             puintptr // attached p for executing go code (nil if not executing go code)
    nextp         puintptr
    oldp          puintptr // the p that was attached before executing a syscall
    id            int64
    mallocing     int32
    throwing      int32
    preemptoff    string // if != "", keep curg running on this m
    locks         int32
    dying         int32
    profilehz     int32
    spinning      bool // m is out of work and is actively looking for work
    blocked       bool // m is blocked on a note
    newSigstack   bool // minit on C thread called sigaltstack
    printlock     int8
    incgo         bool   // m is executing a cgo call
    freeWait      uint32 // if == 0, safe to free g0 and delete m (atomic)
    fastrand      [2]uint32
    needextram    bool
    traceback     uint8
    ncgocall      uint64      // number of cgo calls in total
    ncgo          int32       // number of cgo calls currently in progress
    cgoCallersUse uint32      // if non-zero, cgoCallers in use temporarily
    cgoCallers    *cgoCallers // cgo traceback if crashing in cgo call
    park          note
    alllink       *m // on allm
    schedlink     muintptr
    mcache        *mcache
    lockedg       guintptr
    createstack   [32]uintptr // stack that created this thread.
    lockedExt     uint32      // tracking for external LockOSThread
    lockedInt     uint32      // tracking for internal lockOSThread
    nextwaitm     muintptr    // next m waiting for lock
    waitunlockf   func(*g, unsafe.Pointer) bool
    waitlock      unsafe.Pointer
    waittraceev   byte
    waittraceskip int
    startingtrace bool
    syscalltick   uint32
    thread        uintptr // thread handle
    freelink      *m      // on sched.freem

    // these are here because they are too large to be on the stack
    // of low-level NOSPLIT functions.
    libcall   libcall
    libcallpc uintptr // for cpu profiler
    libcallsp uintptr
    libcallg  guintptr
    syscall   libcall // stores syscall parameters on windows

    vdsoSP uintptr // SP for traceback while in VDSO call (0 if not in call)
    vdsoPC uintptr // PC for traceback while in VDSO call

    dlogPerM

    mOS
}

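That is the structure of M, which corresponds to a real OS thread. It contains several thread-level abstractions, for example procid (the thread id) and mallocing (set while the M is allocating memory). There are many more; we will analyze them as we meet them.

One field worth noting here is spinning.

spinning: the M is spinning like a wheel. It has no G to run and is actively looking for work; we will see this situation later.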


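3. Scheduling: the framework

I borrow a figure from another article (linked below). It describes the details of Go's concurrent scheduling, annotated with the functions involved and how they work.

PS: the figure may be hard to read. That is fine; I will show cropped parts when I go through it piece by piece.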


https://segmentfault.com/a/1190000018876007

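In one sentence, the scheduling mechanism is:

The runtime prepares the Gs, Ps, and Ms; an M binds to a P, takes a G from one of the queues, switches to that G's stack and runs the G's function, then calls goexit to clean up and returns to the M; and the cycle repeats.

In order, scheduler startup goes:

  1. Create m0 and g0 and link them [main, main.main]
  2. Initialize the scheduler [schedinit]
  3. Manage the list of Ps [procresize]
  4. Create and manage Gs [newproc, runqput]
  5. Run and exit Gs [execute, goexit0]
  6. Obtain a G (scheduling) [schedule, findrunnable]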
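
The one-sentence description of the scheduling loop can be turned into a toy model. This is a deliberately simplified sketch, not the runtime's algorithm: the types g, p, and sched are stand-ins, there is no locking or real stack switching, and the lookup order in findRunnable (local queue, then global queue, then another P) is an assumption chosen to keep the example short.

```go
package main

import "fmt"

// g is a toy goroutine: an id plus the function it should run.
type g struct {
	id int
	fn func()
}

// p is a toy scheduling context with only a local run queue.
type p struct{ runq []*g }

// sched holds the global run queue and all Ps (no locking in this sketch).
type sched struct {
	global []*g
	allp   []*p
}

// pop removes and returns the first G of q, or nil if q is empty.
func pop(q *[]*g) *g {
	if len(*q) == 0 {
		return nil
	}
	gp := (*q)[0]
	*q = (*q)[1:]
	return gp
}

// findRunnable looks in the local queue, then the global queue, then
// tries to take a G from another P.
func (s *sched) findRunnable(pp *p) *g {
	if gp := pop(&pp.runq); gp != nil {
		return gp
	}
	if gp := pop(&s.global); gp != nil {
		return gp
	}
	for _, other := range s.allp {
		if other != pp {
			if gp := pop(&other.runq); gp != nil {
				return gp
			}
		}
	}
	return nil
}

// schedule is the loop an M runs once bound to pp: pick a G, run it,
// come back for the next one. It returns the ids in execution order.
func (s *sched) schedule(pp *p) []int {
	var order []int
	for gp := s.findRunnable(pp); gp != nil; gp = s.findRunnable(pp) {
		gp.fn() // stands in for switching to the G's stack and running it
		order = append(order, gp.id)
	}
	return order
}

func main() {
	noop := func() {}
	p0, p1 := &p{}, &p{}
	s := &sched{allp: []*p{p0, p1}}
	p0.runq = []*g{{1, noop}}
	s.global = []*g{{2, noop}}
	p1.runq = []*g{{3, noop}}
	fmt.Println(s.schedule(p0)) // prints "[1 2 3]"
}
```

In this toy the loop ends when no G is left anywhere; a real M would instead park or spin while looking for more work.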

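4. What comes next?

A winter break this long is hard to come by, and it really should not go to waste.

Next, following the framework in section 3, I will walk through the source concentrated in proc.go. We will see exactly which operations it performs, which variables it saves, and what happens when M, P, and G change state.

Mwah~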

END
