Learning the Go Scheduler: Scheduling Flow (Part 3): wakep and startm

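This article mainly covers wakep and startm.
wakep may be called from newproc (once main is up and running).
startm is called from wakep.
mstart is called from rt0_go and goes on to run main.
System threads (M)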

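Go has three kinds of system threads:

The main thread: a Go program runs on the main thread as soon as it is loaded and started; in the code it is represented by the global m0
The thread that runs sysmon
Ordinary worker threads, which bind to a P and run the tasks held in Gs
The main thread and the sysmon thread are each a single instance with a dedicated thread. Worker threads have many instances: the scheduler creates, parks, and wakes them as needed.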

wakep startm

// Tries to add one more P to execute G's.
// Called when a G is made runnable (newproc, ready).
// In other words: put one more idle P to work running Gs.
func wakep() {
    if atomic.Load(&sched.npidle) == 0 {                        // no idle P, return
        return
    }
    // be conservative about spinning threads
    if atomic.Load(&sched.nmspinning) != 0 || !atomic.Cas(&sched.nmspinning, 0, 1) {    // an M is already spinning, or we lost the CAS race
        return
    }
    startm(nil, true)
}

// Schedules some M to run the p (creates an M if necessary).
// If p==nil, tries to get an idle P, if no idle P's does nothing.
// May run with m.p==nil, so write barriers are not allowed.
// If spinning is set, the caller has incremented nmspinning and startm will
// either decrement nmspinning or set m.spinning in the newly started M.
//
// Callers passing a non-nil P must call from a non-preemptible context. See
// comment on acquirem below.
//
// Must not have write barriers because this may be called without a P.
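// Summary: schedule some M to run a P.
// If p is nil, take one from the idle list; return if there is none.
// If spinning is true, the caller has already incremented nmspinning, and startm
// either undoes that increment or marks the M it starts as spinning.
// A caller that passes a non-nil P must not be preemptible.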

//go:nowritebarrierrec
func startm(_p_ *p, spinning bool) {
    // Disable preemption.
    //
    // Every owned P must have an owner that will eventually stop it in the
    // event of a GC stop request. startm takes transient ownership of a P
    // (either from argument or pidleget below) and transfers ownership to
    // a started M, which will be responsible for performing the stop.
    // Note: startm owns the P only briefly (taken from the argument or from pidleget below)
    // and hands it to the M it starts, which then becomes responsible for stopping it.
    //
    // Preemption must be disabled during this transient ownership,
    // otherwise the P this is running on may enter GC stop while still
    // holding the transient P, leaving that P in limbo and deadlocking the
    // STW.
    // Note: without this, the P we are running on could enter GC stop while still
    // holding the transient P, leaving that P stranded and deadlocking the STW.
    
    // Callers passing a non-nil P must already be in non-preemptible
    // context, otherwise such preemption could occur on function entry to
    // startm. Callers passing a nil P may be preemptible, so we must
    // disable preemption before acquiring a P from pidleget below.
    // Note: with a non-nil P the caller must already be non-preemptible, because preemption
    // could otherwise hit on entry to startm; with a nil P we disable preemption ourselves first.
    mp := acquirem()                                    // disable preemption
    lock(&sched.lock)
    if _p_ == nil {
        _p_ = pidleget()
        if _p_ == nil {                                 // no idle P left, nothing to do
            unlock(&sched.lock)
            if spinning {
                // The caller incremented nmspinning, but there are no idle Ps,
                // so it's okay to just undo the increment and give up.
                if int32(atomic.Xadd(&sched.nmspinning, -1)) < 0 {
                    throw("startm: negative nmspinning")
                }
            }
            releasem(mp)                            // preemption is allowed again
            return
        }
    }
    nmp := mget()                                   // take an M from the global idle list
    if nmp == nil {                                 // no idle M available
        // No M is available, we must drop sched.lock and call newm.
        // However, we already own a P to assign to the M.
        //
        // Once sched.lock is released, another G (e.g., in a syscall),
        // could find no idle P while checkdead finds a runnable G but
        // no running M's because this new M hasn't started yet, thus
        // throwing in an apparent deadlock.
        //
        // Avoid this situation by pre-allocating the ID for the new M,
        // thus marking it as 'running' before we drop sched.lock. This
        // new M will eventually run the scheduler to execute any
        // queued G's.
        id := mReserveID()
        unlock(&sched.lock)

        var fn func()
        if spinning {
            // The caller incremented nmspinning, so set m.spinning in the new M.
            fn = mspinning
        }
        newm(fn, _p_, id)                   // create a new M, then return
        // Ownership transfer of _p_ committed by start in newm.
        // Preemption is now safe.
        releasem(mp)                        // release the current g's M; preemption is allowed again
        return
    }
    unlock(&sched.lock)
    if nmp.spinning {
        throw("startm: m is spinning")
    }
    if nmp.nextp != 0 {
        throw("startm: m has p")
    }
    if spinning && !runqempty(_p_) {
        throw("startm: p has runnable gs")
    }
    // The caller incremented nmspinning, so set m.spinning in the new M.
    nmp.spinning = spinning             // record whether this M is spinning
    nmp.nextp.set(_p_)                  // stash the P for the M to pick up
    notewakeup(&nmp.park)               // wake the M
    // Ownership transfer of _p_ committed by wakeup. Preemption is now
    // safe.
    releasem(mp)
}

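What startm mainly does:

If _p_ is nil, take an idle P (and return if there is none)
If there is no idle M, create a new M and initialize it (including its g0 and gsignal), create a new system thread, and run mstart on it

If there is an idle M, wake it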

newm

// Create a new m. It will start off with a call to fn, or else the scheduler.
// fn needs to be static and not a heap allocated closure.
// May run with m.p==nil, so write barriers are not allowed.
//
// id is optional pre-allocated m ID. Omit by passing -1.
//go:nowritebarrierrec
func newm(fn func(), _p_ *p, id int64) {
    mp := allocm(_p_, fn, id)                   // allocate and initialize a new m, including its g0 and gsignal
    mp.doesPark = (_p_ != nil)                  // whether this m ever parks on m.park: an m started with a P runs the scheduler and may park; one without a P (e.g. sysmon) never does
    mp.nextp.set(_p_)
    mp.sigmask = initSigmask
    if gp := getg(); gp != nil && gp.m != nil && (gp.m.lockedExt != 0 || gp.m.incgo) && GOOS != "plan9" {
        // We're on a locked M or a thread that may have been
        // started by C. The kernel state of this thread may
        // be strange (the user may have locked it for that
        // purpose). We don't want to clone that into another
        // thread. Instead, ask a known-good thread to create
        // the thread for us.
        //
        // This is disabled on Plan 9. See golang.org/issue/22227.
        //
        // TODO: This may be unnecessary on Windows, which
        // doesn't model thread creation off fork.
        lock(&newmHandoff.lock)
        if newmHandoff.haveTemplateThread == 0 {
            throw("on a locked thread with no template thread")
        }
        mp.schedlink = newmHandoff.newm
        newmHandoff.newm.set(mp)
        if newmHandoff.waiting {
            newmHandoff.waiting = false
            notewakeup(&newmHandoff.wake)
        }
        unlock(&newmHandoff.lock)
        return
    }
    // Actually create the OS thread and associate it with mp.
    // The new thread runs on g0's stack and its entry point is mstart.
    newm1(mp)
}

What newm mainly does:

Allocate a new m and initialize it, including its g0 and gsignal
Initialize a few fields

Create a new system thread and run mstart on it

func newm1(mp *m) {
    if iscgo {
        var ts cgothreadstart
        if _cgo_thread_start == nil {
            throw("_cgo_thread_start missing")
        }
        ts.g.set(mp.g0)
        ts.tls = (*uint64)(unsafe.Pointer(&mp.tls[0]))
        ts.fn = unsafe.Pointer(funcPC(mstart))
        if msanenabled {
            msanwrite(unsafe.Pointer(&ts), unsafe.Sizeof(ts))
        }
        execLock.rlock() // Prevent process clone.
        asmcgocall(_cgo_thread_start, unsafe.Pointer(&ts))
        execLock.runlock()
        return
    }
    execLock.rlock() // Prevent process clone.
    newosproc(mp)
    execLock.runlock()
}

Linux newosproc

// May run with m.p==nil, so write barriers are not allowed.
//go:nowritebarrier
func newosproc(mp *m) {
    // Create an OS thread that runs on g0's stack and starts executing mstart.
    // The stack itself was allocated earlier (see malg in allocm); here we only take its top address.
    stk := unsafe.Pointer(mp.g0.stack.hi)
    /*
     * note: strace gets confused if we use CLONE_PTRACE here.
     */
    if false {
        print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " clone=", funcPC(clone), " id=", mp.id, " ostk=", &mp, "\n")
    }

    // Disable signals during clone, so that the new thread starts
    // with signals disabled. It will enable them in minit.
    var oset sigset
    sigprocmask(_SIG_SETMASK, &sigset_all, &oset)
    ret := clone(cloneFlags, stk, unsafe.Pointer(mp), unsafe.Pointer(mp.g0), unsafe.Pointer(funcPC(mstart)))
    sigprocmask(_SIG_SETMASK, &oset, nil)

    if ret < 0 {
        print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", -ret, ")\n")
        if ret == -_EAGAIN {
            println("runtime: may need to increase max user processes (ulimit -u)")
        }
        throw("newosproc")
    }
}

Windows newosproc

// May run with m.p==nil, so write barriers are not allowed. This
// function is called by newosproc0, so it is also required to
// operate without stack guards.
//go:nowritebarrierrec
//go:nosplit
func newosproc(mp *m) {
    // Create an OS thread whose entry point is tstart_stdcall, which leads into mstart.
    // The thread's stack comes from the OS rather than from a stack allocated for g0 beforehand.
    // We pass 0 for the stack size to use the default for this binary.
    thandle := stdcall6(_CreateThread, 0, 0,
        funcPC(tstart_stdcall), uintptr(unsafe.Pointer(mp)),
        0, 0)

    if thandle == 0 {
        if atomic.Load(&exiting) != 0 {
            // CreateThread may fail if called
            // concurrently with ExitProcess. If this
            // happens, just freeze this thread and let
            // the process exit. See issue #18253.
            lock(&deadlock)
            lock(&deadlock)
        }
        print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", getlasterror(), ")\n")
        throw("runtime.newosproc")
    }

    // Close thandle to avoid leaking the thread object if it exits.
    stdcall1(_CloseHandle, thandle)
}

allocm

Allocates an m that is not yet associated with any OS thread.

// Allocate a new m unassociated with any thread.
// Can use p for allocation context if needed.
// fn is recorded as the new m's m.mstartfn.
// id is optional pre-allocated m ID. Omit by passing -1.
//
// This function is allowed to have write barriers even if the caller
// isn't because it borrows _p_.
//
//go:yeswritebarrierrec
func allocm(_p_ *p, fn func(), id int64) *m {
    _g_ := getg()
    acquirem() // disable GC because it can be called from sysmon
    if _g_.m.p == 0 {           // the caller may have no P bound, e.g. when called from sysmon
        // Bind _p_ and _g_.m to each other and move _p_.status from _Pidle to _Prunning.
        acquirep(_p_) // temporarily borrow p for mallocs in this function
    }

    // Release the free M list. We need to do this somewhere and
    // this may free up a stack we can use.
    // Ms are put on freem in mexit, which is also where m.gsignal is freed.
    // The m struct itself was allocated with new, so the GC can reclaim it.
    if sched.freem != nil {
        lock(&sched.lock)
        var newList *m
        for freem := sched.freem; freem != nil; {
            if freem.freeWait != 0 {
                next := freem.freelink
                freem.freelink = newList
                newList = freem
                freem = next
                continue
            }
            // stackfree must be on the system stack, but allocm is
            // reachable off the system stack transitively from
            // startm.
            systemstack(func() {
                stackfree(freem.g0.stack)
            })
            freem = freem.freelink
        }
        sched.freem = newList
        unlock(&sched.lock)
    }

    mp := new(m)
    mp.mstartfn = fn
    mcommoninit(mp, id)  // covered in part (1): mainly creates gsignal and adds m to allm

    // In case of cgo or Solaris or illumos or Darwin, pthread_create will make us a stack.
    // Windows and Plan 9 will layout sched stack on OS stack.
    // Create g0.
    if iscgo || mStackIsSystemAllocated() {
        mp.g0 = malg(-1)
    } else {
        mp.g0 = malg(8192 * sys.StackGuardMultiplier)
    }
    mp.g0.m = mp

    if _p_ == _g_.m.p.ptr() {
        releasep()              // unbind _g_.m and _p_ from each other
    }
    releasem(_g_.m)             // preemption is allowed again

    return mp
}

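The role of _p_: when _g_.m has no P bound, allocm borrows _p_ so that it can allocate heap memory; once done, releasep sets _p_.status back to _Pidle and the P is returned.
What allocm mainly does:

Allocate a new m and initialize it, including its g0 and gsignal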
