This article gives an overview of how Go's scheduler works. As is well known, Go executes multiple tasks by scheduling user-space coroutines (goroutines). On Linux, traditional multithreading traps into the kernel to create threads, which the operating system then schedules. Kernel-level scheduling can take full advantage of OS facilities: when a thread blocks or waits, the OS puts it to sleep and wakes it once the blocking event completes. However, both blocking and scheduling of OS threads require trapping into the kernel, which makes these operations relatively expensive. Go's goroutines are scheduled mostly in user space without entering the kernel, but this also means the scheduler cannot rely on the kernel's block/wakeup or preemptive-scheduling mechanisms. This article explores how Go performs scheduling in user space.
Go follows the CSP model, exchanging data through communication. Although scheduling is implemented in user space, the work is ultimately still carried out by operating-system threads, so Go defines three entities: M, P, and G.
Machine (M) corresponds to a real thread created by the operating system; its creation, scheduling, and execution are controlled by the OS. If the code running on an M performs a blocking operation, the thread still blocks until the operation completes and the OS wakes it to continue.
Processor (P) is a virtual execution context provided to Gs. It holds the resources a G needs to run, including a local run queue of Gs and local memory caches; an M can execute Gs only after it has acquired a P.
Goroutine (G) is Go's user-space coroutine. Its default user-space stack is 2KB, and it carries the execution context of a task so that the environment can be saved and restored across switches. The scheduler's job is to dispatch Gs onto runnable Ps, achieving efficient concurrent execution.
The overall relationship among the three is shown in the figure:
The diagram above shows one possible running state of a Go program. As it illustrates, G scheduling happens entirely in user space. The following sections analyze the main scheduling scenarios.
During Go's startup, the first thread, M0, is initialized and then begins scheduling and executing Gs. As the startup sequence shows, Go's main function itself runs as a G. When a program starts additional goroutines, the runtime decides, based on the goroutines' behavior and the state of the existing kernel threads, whether to start a new kernel thread.
When many Gs are waiting to run, when some kernel threads are blocked in system calls, or when some Gs have been running for a long time, the runtime may start a new kernel thread to execute the runnable Gs, ensuring they are executed promptly.
During startup Go also launches a monitor thread, sysmon. This thread knows nothing about individual Gs; it watches for completion of non-blocking events, monitors how long each running G has been executing, and sets the preemption flag to request preemptive scheduling.
func newm(fn func(), _p_ *p) { // create a new kernel worker thread
	mp := allocm(_p_, fn) // allocate the m and set up its stack
	mp.nextp.set(_p_)
	mp.sigmask = initSigmask
	if gp := getg(); gp != nil && gp.m != nil && (gp.m.lockedExt != 0 || gp.m.incgo) && GOOS != "plan9" {
		// We're on a locked M or a thread that may have been
		// started by C. The kernel state of this thread may
		// be strange (the user may have locked it for that
		// purpose). We don't want to clone that into another
		// thread. Instead, ask a known-good thread to create
		// the thread for us.
		//
		// This is disabled on Plan 9. See golang.org/issue/22227.
		//
		// TODO: This may be unnecessary on Windows, which
		// doesn't model thread creation off fork.
		lock(&newmHandoff.lock)
		if newmHandoff.haveTemplateThread == 0 {
			throw("on a locked thread with no template thread")
		}
		mp.schedlink = newmHandoff.newm
		newmHandoff.newm.set(mp)
		if newmHandoff.waiting {
			newmHandoff.waiting = false
			notewakeup(&newmHandoff.wake)
		}
		unlock(&newmHandoff.lock)
		return
	}
	newm1(mp) // actually create the worker thread
}
func newm1(mp *m) {
	if iscgo {
		var ts cgothreadstart
		if _cgo_thread_start == nil {
			throw("_cgo_thread_start missing")
		}
		ts.g.set(mp.g0)
		ts.tls = (*uint64)(unsafe.Pointer(&mp.tls[0]))
		ts.fn = unsafe.Pointer(funcPC(mstart))
		if msanenabled {
			msanwrite(unsafe.Pointer(&ts), unsafe.Sizeof(ts))
		}
		execLock.rlock() // Prevent process clone.
		asmcgocall(_cgo_thread_start, unsafe.Pointer(&ts))
		execLock.runlock()
		return
	}
	execLock.rlock() // Prevent process clone.
	newosproc(mp) // create the OS thread; on Linux this is the clone system call
	execLock.runlock()
}
func newosproc(mp *m) {
	stk := unsafe.Pointer(mp.g0.stack.hi) // set up the stack
	/*
	 * note: strace gets confused if we use CLONE_PTRACE here.
	 */
	if false {
		print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " clone=", funcPC(clone), " id=", mp.id, " ostk=", &mp, "\n")
	}
	// Disable signals during clone, so that the new thread starts
	// with signals disabled. It will enable them in minit.
	var oset sigset
	sigprocmask(_SIG_SETMASK, &sigset_all, &oset)
	// clone creates the thread, sets up the g0 stack, and starts it at mstart,
	// so the new thread begins running as a fresh worker.
	ret := clone(cloneFlags, stk, unsafe.Pointer(mp), unsafe.Pointer(mp.g0), unsafe.Pointer(funcPC(mstart)))
	sigprocmask(_SIG_SETMASK, &oset, nil)
	if ret < 0 {
		print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", -ret, ")\n")
		if ret == -_EAGAIN {
			println("runtime: may need to increase max user processes (ulimit -u)")
		}
		throw("newosproc")
	}
}
As this flow shows, a worker thread is created through a system call, after which the new thread starts executing at mstart and begins scheduling Gs. Creation of a new worker kernel thread may be triggered by checks made during a system call, or by the monitor thread via the retake function.
// One round of scheduler: find a runnable goroutine and execute it.
// Never returns.
func schedule() {
	_g_ := getg()
	if _g_.m.locks != 0 {
		throw("schedule: holding locks")
	}
	if _g_.m.lockedg != 0 {
		stoplockedm()
		execute(_g_.m.lockedg.ptr(), false) // Never returns.
	}
	// We should not schedule away from a g that is executing a cgo call,
	// since the cgo call is using the m's g0 stack.
	if _g_.m.incgo {
		throw("schedule: in cgo")
	}
top:
	if sched.gcwaiting != 0 {
		gcstopm()
		goto top
	}
	if _g_.m.p.ptr().runSafePointFn != 0 {
		runSafePointFn()
	}
	var gp *g
	var inheritTime bool
	if trace.enabled || trace.shutdown {
		gp = traceReader()
		if gp != nil {
			casgstatus(gp, _Gwaiting, _Grunnable)
			traceGoUnpark(gp, 0)
		}
	}
	if gp == nil && gcBlackenEnabled != 0 {
		gp = gcController.findRunnableGCWorker(_g_.m.p.ptr()) // run a GC worker while marking
	}
	if gp == nil {
		// Check the global runnable queue once in a while to ensure fairness.
		// Otherwise two goroutines can completely occupy the local runqueue
		// by constantly respawning each other.
		if _g_.m.p.ptr().schedtick%61 == 0 && sched.runqsize > 0 { // every 61st tick, check the global queue for fairness
			lock(&sched.lock)
			gp = globrunqget(_g_.m.p.ptr(), 1) // take one G from the global queue
			unlock(&sched.lock)
		}
	}
	if gp == nil { // nothing taken from the global queue
		gp, inheritTime = runqget(_g_.m.p.ptr()) // take a G from the P's local run queue
		if gp != nil && _g_.m.spinning {
			throw("schedule: spinning with local work") // sanity check on the spinning state
		}
	}
	if gp == nil {
		gp, inheritTime = findrunnable() // blocks until work is available: steal from other Ps, poll the network, etc.
	}
	// This thread is going to run a goroutine and is not spinning anymore,
	// so if it was marked as spinning we need to reset it now and potentially
	// start a new spinning M.
	if _g_.m.spinning {
		resetspinning()
	}
	if sched.disable.user && !schedEnabled(gp) {
		// Scheduling of this goroutine is disabled. Put it on
		// the list of pending runnable goroutines for when we
		// re-enable user scheduling and look again.
		lock(&sched.lock)
		if schedEnabled(gp) {
			// Something re-enabled scheduling while we
			// were acquiring the lock.
			unlock(&sched.lock)
		} else {
			sched.disable.runnable.pushBack(gp)
			sched.disable.n++
			unlock(&sched.lock)
			goto top
		}
	}
	if gp.lockedm != 0 {
		// Hands off own p to the locked m,
		// then blocks waiting for a new p.
		startlockedm(gp)
		goto top
	}
	execute(gp, inheritTime) // found a runnable G: run it
}
The main flow of the schedule function is as follows:
If the previous step found a runnable G, execute(gp, inheritTime) is called to run that task.
func execute(gp *g, inheritTime bool) {
	_g_ := getg()
	casgstatus(gp, _Grunnable, _Grunning) // mark the G as running
	gp.waitsince = 0
	gp.preempt = false // clear the preemption flag
	gp.stackguard0 = gp.stack.lo + _StackGuard // set up the stack guard
	if !inheritTime {
		_g_.m.p.ptr().schedtick++
	}
	_g_.m.curg = gp
	gp.m = _g_.m
	// Check whether the profiler needs to be turned on or off.
	hz := sched.profilehz
	if _g_.m.profilehz != hz {
		setThreadCPUProfiler(hz)
	}
	if trace.enabled {
		// GoSysExit has to happen when we have a P, but before GoStart.
		// So we emit it here.
		if gp.syscallsp != 0 && gp.sysblocktraced {
			traceGoSysExit(gp.sysexitticks)
		}
		traceGoStart()
	}
	gogo(&gp.sched) // restore the G's saved context and run it
}
After these checks and flag updates, gogo is called to run the G:
TEXT runtime·gogo(SB), NOSPLIT, $16-8
	MOVQ buf+0(FP), BX // gobuf
	MOVQ gobuf_g(BX), DX
	MOVQ 0(DX), CX // make sure g != nil
	get_tls(CX)
	MOVQ DX, g(CX)
	MOVQ gobuf_sp(BX), SP // restore SP from the saved gobuf context
	MOVQ gobuf_ret(BX), AX
	MOVQ gobuf_ctxt(BX), DX
	MOVQ gobuf_bp(BX), BP
	MOVQ $0, gobuf_sp(BX) // clear to help garbage collector
	MOVQ $0, gobuf_ret(BX)
	MOVQ $0, gobuf_ctxt(BX)
	MOVQ $0, gobuf_bp(BX)
	MOVQ gobuf_pc(BX), BX // load the saved PC into BX
	JMP BX // jump there and resume the G
Recall the G-creation path in newproc1: when a G is created, the address it returns to after its function finishes is set to the goexit function.
newg.sched.pc = funcPC(goexit) + sys.PCQuantum // +PCQuantum so that previous instruction is in same function
Now look at how goexit executes:
// The top-most function running on a goroutine
// returns to goexit+PCQuantum.
TEXT runtime·goexit(SB),NOSPLIT,$0-0
	BYTE $0x90 // NOP
	CALL runtime·goexit1(SB) // does not return; calls goexit1
	// traceback from goexit1 must hit code range of goexit
	BYTE $0x90 // NOP
func goexit1() {
	if raceenabled {
		racegoend()
	}
	if trace.enabled {
		traceGoEnd()
	}
	mcall(goexit0) // switch to g0 to release the finished g
}
TEXT runtime·mcall(SB), NOSPLIT, $0-8
	MOVQ fn+0(FP), DI
	get_tls(CX)
	MOVQ g(CX), AX // save state in g->sched
	MOVQ 0(SP), BX // caller's PC
	MOVQ BX, (g_sched+gobuf_pc)(AX)
	LEAQ fn+0(FP), BX // caller's SP
	MOVQ BX, (g_sched+gobuf_sp)(AX)
	MOVQ AX, (g_sched+gobuf_g)(AX)
	MOVQ BP, (g_sched+gobuf_bp)(AX)
	// switch to m->g0 & its stack, call fn
	MOVQ g(CX), BX
	MOVQ g_m(BX), BX
	MOVQ m_g0(BX), SI
	CMPQ SI, AX // if g == m->g0 call badmcall
	JNE 3(PC)
	MOVQ $runtime·badmcall(SB), AX
	JMP AX
	MOVQ SI, g(CX) // g = m->g0
	MOVQ (g_sched+gobuf_sp)(SI), SP // sp = m->g0->sched.sp: switch to g0's stack
	PUSHQ AX
	MOVQ DI, DX
	MOVQ 0(DI), DI
	CALL DI // call fn (here goexit0) on g0's stack
	POPQ AX
	MOVQ $runtime·badmcall2(SB), AX
	JMP AX
	RET
// goexit continuation on g0.
func goexit0(gp *g) {
	_g_ := getg()
	casgstatus(gp, _Grunning, _Gdead) // mark the G as dead (finished)
	if isSystemGoroutine(gp, false) {
		atomic.Xadd(&sched.ngsys, -1)
	}
	gp.m = nil // detach the G from its m
	locked := gp.lockedm != 0 // record and then clear the locked state
	gp.lockedm = 0
	_g_.m.lockedg = 0
	gp.paniconfault = false
	gp._defer = nil // should be true already but just in case.
	gp._panic = nil // non-nil for Goexit during panic. points at stack-allocated data.
	gp.writebuf = nil
	gp.waitreason = 0
	gp.param = nil
	gp.labels = nil
	gp.timer = nil
	if gcBlackenEnabled != 0 && gp.gcAssistBytes > 0 {
		// Flush assist credit to the global pool. This gives
		// better information to pacing if the application is
		// rapidly creating and exiting goroutines.
		scanCredit := int64(gcController.assistWorkPerByte * float64(gp.gcAssistBytes))
		atomic.Xaddint64(&gcController.bgScanCredit, scanCredit)
		gp.gcAssistBytes = 0
	}
	// Note that gp's stack scan is now "valid" because it has no
	// stack.
	gp.gcscanvalid = true
	dropg() // break the association between the G and the M
	if GOARCH == "wasm" { // no threads yet on wasm
		gfput(_g_.m.p.ptr(), gp)
		schedule() // never returns
	}
	if _g_.m.lockedInt != 0 {
		print("invalid m->lockedInt = ", _g_.m.lockedInt, "\n")
		throw("internal lockOSThread error")
	}
	gfput(_g_.m.p.ptr(), gp) // put the dead G on the free list for reuse
	if locked {
		// The goroutine may have locked this thread because
		// it put it in an unusual kernel state. Kill it
		// rather than returning it to the thread pool.
		// Return to mstart, which will release the P and exit
		// the thread.
		if GOOS != "plan9" { // See golang.org/issue/22227.
			gogo(&_g_.m.g0.sched)
		} else {
			// Clear lockedExt on plan9 since we may end up re-using
			// this thread.
			_g_.m.lockedExt = 0
		}
	}
	schedule() // schedule the next G
}
This completes the execution lifecycle of a normal G. The function call chain is as follows:
This article has only sketched some basic scheduling scenarios in Go and walked through how a normal G is created, scheduled, and executed. Many details were left out, and the concrete scheduling policies and their implementations deserve further study. Corrections and feedback are welcome.