In Go servers, each incoming request is handled in its own goroutine. Request handlers often start additional goroutines to access backends such as databases and RPC services. The set of goroutines working on a request typically needs access to request-specific values such as the identity of the end user, authorization tokens, and the request's deadline. When a request is canceled or times out, all the goroutines working on that request should exit quickly so the system can reclaim any resources they are using.
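As the figure above shows, a request that reaches the server is often split into several sub-requests that are forwarded to the relevant service units, especially in a distributed architecture. If one service unit returns a termination message (usually because of an error), the other service units should stop processing that request promptly, so that no resources are wasted on work that no longer matters.
This is why Google developed the context package: it manages a group of goroutines that share the same context, so that goroutines handling a request that no longer matters can be stopped promptly.
Delivering the cancel command to every goroutine then becomes the problem that needs solving. Go encourages us to "share memory by communicating", so the simplest and most direct way to deliver a cancel command is to create a done channel and let each service unit (goroutine) learn about the command from that channel.
So easy: just read from it! If a value arrives, it is time to stop!
Ha, not quite. Channels come in unbuffered and buffered flavors; which one should we choose? What value should be written to the channel? Does any value mean "stop", or should the decision depend on the value?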
package main

import (
	"fmt"
	"runtime"
	"time"
)

type Result struct {
	status bool
	value  int
}

// thirtyAPI simulates a call to a third-party API. It loops until it
// receives a value from done or, for worker 3, fails on its own.
func thirtyAPI(done chan struct{}, num int, dst chan Result) {
	fmt.Printf("Calling third-party API: %d\n", num)
	tc := make(chan int, 1)
	i := 0
	for {
		// Business logic goes here.
		select {
		case <-done:
			fmt.Printf("%d: shutting down, bye ThirtyAPI\n", num)
			dst <- Result{status: false, value: num}
			return
		case tc <- i:
			if num == 3 {
				// Worker 3 fails after one second.
				time.Sleep(time.Second)
				dst <- Result{status: false, value: num}
				return
			}
			i = <-tc
			i++
		}
	}
}

func eg3() {
	dst := make(chan Result, 5)
	done := make(chan struct{}) // unbuffered
	for i := 0; i < 5; i++ {
		go thirtyAPI(done, i, dst)
	}
	for result := range dst {
		if !result.status {
			// On the first failure, issue the cancel command.
			fmt.Printf("%d: I met error\n", result.value)
			done <- struct{}{} // exactly one goroutine receives this value
			break
		}
	}
}

func main() {
	eg3()
	time.Sleep(time.Second * 5)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
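Looking at the output, only one goroutine received the cancel command; the others kept running. The cause is that the code sends a single value on an unbuffered channel, and each value sent on a channel is received by exactly one goroutine.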
package main

import (
	"fmt"
	"runtime"
	"time"
)

type Result struct {
	status bool
	value  int
}

// thirtyAPI simulates a call to a third-party API. Worker 3 fails after one
// second, worker 4 succeeds immediately, and the rest loop until done fires.
func thirtyAPI(done chan struct{}, num int, dst chan Result) {
	fmt.Printf("Calling third-party API: %d\n", num)
	tc := make(chan int, 1)
	i := 0
	for {
		// Business logic goes here.
		select {
		case <-done:
			fmt.Printf("%d: shutting down, bye ThirtyAPI\n", num)
			dst <- Result{status: false, value: num}
			return
		case tc <- i:
			if num == 3 {
				time.Sleep(time.Second)
				dst <- Result{status: false, value: num}
				return
			}
			if num == 4 {
				dst <- Result{status: true, value: num}
				return
			}
			i = <-tc
			i++
		}
	}
}

func eg4() {
	dst := make(chan Result, 5)
	done := make(chan struct{}, 5) // buffered: one slot per goroutine
	for i := 0; i < 5; i++ {
		go thirtyAPI(done, i, dst)
	}
	for result := range dst {
		if !result.status {
			// On the first failure, issue the cancel command,
			// one value for each goroutine that was started.
			fmt.Printf("%d: I met error\n", result.value)
			done <- struct{}{}
			done <- struct{}{}
			done <- struct{}{}
			done <- struct{}{}
			done <- struct{}{}
			break
		} else {
			fmt.Printf("%d: I have success\n", result.value)
		}
	}
}

func main() {
	eg4()
	time.Sleep(time.Second * 5)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
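The good news in the output is that every goroutine exited, but there are two flaws. First, we wrote done <- struct{}{} five times; isn't that ugly? Second, only three goroutines actually exit because of the done channel, so the other two values are wasted.
In fact, the fatal problem is that a buffered channel cannot truly stop every goroutine that should exit. Think about it: what if thirtyAPI goes on to call other APIs? We cannot know in advance how many goroutines are running!
As we saw in 1.2.2, we cannot know in advance how many goroutines need to stop, so we cannot choose a length for the done channel.
The problem seems unsolvable, so let's change our approach: if writing is a dead end, can we avoid writing altogether?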
A receive operation on a closed channel can always proceed immediately, yielding the element type's zero value.
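When the cancel command needs to go out, the sender simply closes the done channel. Every goroutine that should exit can then read the zero value from the done channel, and they all exit!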
package main

import (
	"fmt"
	"runtime"
	"time"
)

type Result struct {
	status bool
	value  int
}

// thirtyAPI simulates a call to a third-party API. Worker 3 fails after one
// second, worker 4 succeeds immediately, and the rest loop until done is closed.
func thirtyAPI(done chan struct{}, num int, dst chan Result) {
	fmt.Printf("Calling third-party API: %d\n", num)
	tc := make(chan int, 1)
	i := 0
	for {
		// Business logic goes here.
		select {
		case <-done:
			fmt.Printf("%d: shutting down, bye ThirtyAPI\n", num)
			dst <- Result{status: false, value: num}
			return
		case tc <- i:
			if num == 3 {
				time.Sleep(time.Second)
				dst <- Result{status: false, value: num}
				return
			}
			if num == 4 {
				dst <- Result{status: true, value: num}
				return
			}
			i = <-tc
			i++
		}
	}
}

func eg4() {
	dst := make(chan Result, 5)
	// No buffer is needed: nothing is ever sent on done, it is only closed.
	done := make(chan struct{})
	// Closing done when eg4 returns delivers the cancel command to everyone.
	defer close(done)
	for i := 0; i < 5; i++ {
		go thirtyAPI(done, i, dst)
	}
	for result := range dst {
		if !result.status {
			// On the first failure, leave the loop; the deferred close cancels the rest.
			fmt.Printf("%d: I met error\n", result.value)
			break
		} else {
			fmt.Printf("%d: I have success\n", result.value)
		}
	}
}

func main() {
	eg4()
	time.Sleep(time.Second * 5)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
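The broadcast behavior of close can be seen in isolation with a smaller program. The following is a minimal sketch, not part of the original example: stopAll is a hypothetical helper, and it uses a sync.WaitGroup instead of time.Sleep to wait for the workers, so the count it returns does not depend on timing.

```go
package main

import (
	"fmt"
	"sync"
)

// stopAll (hypothetical helper) starts n workers that block on done,
// closes done once, and returns how many workers observed the close.
func stopAll(n int) int {
	done := make(chan struct{}) // unbuffered: nothing is ever sent on it
	var wg sync.WaitGroup
	var mu sync.Mutex
	stopped := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // proceeds for every receiver once done is closed
			mu.Lock()
			stopped++
			mu.Unlock()
		}()
	}

	close(done) // a single close reaches all receivers, however many there are
	wg.Wait()   // wait until every worker has returned
	return stopped
}

func main() {
	fmt.Println(stopAll(5))
}
```

The sender never needs to know how many goroutines are listening, which is exactly the information the buffered-channel version was missing.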
In fact, Context is implemented on exactly this property of closed channels.
type Context interface {
	// Done returns a channel that is closed when this Context is canceled
	// or times out.
	Done() <-chan struct{}

	// Err indicates why this context was canceled, after the Done channel
	// is closed.
	Err() error

	// Deadline returns the time when this Context will be canceled, if any.
	Deadline() (deadline time.Time, ok bool)

	// Value returns the value associated with key or nil if none.
	Value(key interface{}) interface{}
}
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

type Result struct {
	status bool
	value  int
}

// thirtyAPI simulates a call to a third-party API. Worker 3 fails after one
// second, worker 4 succeeds immediately, and the rest loop until ctx is canceled.
func thirtyAPI(ctx context.Context, num int, dst chan Result) {
	fmt.Printf("Calling third-party API: %d\n", num)
	tc := make(chan int, 1)
	i := 0
	for {
		// Business logic goes here.
		select {
		case <-ctx.Done():
			fmt.Printf("%d: shutting down, error: %s\n", num, ctx.Err())
			dst <- Result{status: false, value: num}
			return
		case tc <- i:
			if num == 3 {
				time.Sleep(time.Second)
				dst <- Result{status: false, value: num}
				return
			}
			if num == 4 {
				dst <- Result{status: true, value: num}
				return
			}
			i = <-tc
			i++
		}
	}
}

func eg4() {
	dst := make(chan Result, 5)
	ctx, cancel := context.WithCancel(context.Background())
	// Canceling ctx when eg4 returns closes ctx.Done() for every worker.
	defer cancel()
	for i := 0; i < 5; i++ {
		go thirtyAPI(ctx, i, dst)
	}
	for result := range dst {
		if !result.status {
			// On the first failure, leave the loop; the deferred cancel stops the rest.
			fmt.Printf("%d: I met error\n", result.value)
			break
		} else {
			fmt.Printf("%d: I have success\n", result.value)
		}
	}
}

func main() {
	eg4()
	time.Sleep(time.Second * 5)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// gofunc prints the deadline once per second until ctx is done.
func gofunc(ctx context.Context) {
	d, _ := ctx.Deadline()
	for {
		select {
		case <-time.After(1 * time.Second):
			fmt.Printf("Deadline:%v, Now:%v\n", d, time.Now())
		case <-ctx.Done():
			fmt.Println(ctx.Err())
			return
		}
	}
}

func main() {
	d := time.Now().Add(5 * time.Second)
	ctx, cancel := context.WithDeadline(context.Background(), d)
	fmt.Printf("Deadline:%v\n", d)
	defer cancel()
	go gofunc(ctx)
	time.Sleep(time.Second * 10)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
package main

import (
	"context"
	"fmt"
)

func main() {
	// A dedicated key type avoids collisions with keys from other packages.
	type favContextKey string

	f := func(ctx context.Context, k favContextKey) {
		if v := ctx.Value(k); v != nil {
			fmt.Println("found value:", v)
			return
		}
		fmt.Println("key not found:", k)
	}

	k := favContextKey("language")
	ctx := context.WithValue(context.Background(), k, "Go")

	f(ctx, k)
	f(ctx, favContextKey("color"))
}
The context package provides functions to derive new Context values from existing ones. These values form a tree: when a Context is canceled, all Contexts derived from it are also canceled.
WithCancel and WithTimeout return derived Context values that can be canceled sooner than the parent Context.
Canceling a child context affects only that child and its descendants; its ancestors and sibling contexts are unaffected.
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// child blocks until its context is canceled.
func child(ctx context.Context, p, c int) {
	fmt.Printf("Child Goroutine:%d-%d\n", p, c)
	select {
	case <-ctx.Done():
		fmt.Printf("Child %d-%d: exited reason: %s\n", p, c, ctx.Err())
	}
}

// parent derives its own cancelable context and starts three children on it.
func parent(ctx context.Context, p int) {
	fmt.Printf("Parent Goroutine:%d\n", p)
	cctx, cancel := context.WithCancel(ctx)
	defer cancel()
	for i := 0; i < 3; i++ {
		go child(cctx, p, i)
	}
	if p == 3 {
		// Parent 3 returns early; the deferred cancel stops only its own children.
		return
	}
	select {
	case <-ctx.Done():
		fmt.Printf("Parent %d: exited reason: %s\n", p, ctx.Err())
		return
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	for i := 0; i < 5; i++ {
		go parent(ctx, i)
	}
	time.Sleep(time.Second * 3)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
A Context does not have a Cancel method for the same reason the Done channel is receive-only: the function receiving a cancelation signal is usually not the one that sends the signal. In particular, when a parent operation starts goroutines for sub-operations, those sub-operations should not be able to cancel the parent. Instead, the WithCancel function (described below) provides a way to cancel a new Context value.
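A Context has no cancel method of its own, for the same reason that the Done channel is receive-only: in general, the function that receives the cancellation signal should not be the one that sends it. Hence a parent goroutine should not be cancelable by the child goroutines it creates.
But if a child goroutine calls the cancel function, can it cancel the parent goroutine as well?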
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// SubGor receives the root cancel function; child 2 of parent 2 calls it.
func SubGor(ctx context.Context, p, c int, cancel context.CancelFunc) {
	fmt.Printf("Child Goroutine:%d-%d\n", p, c)
	if p == 2 && c == 2 {
		// This cancels the shared root context, not just this child.
		cancel()
	}
	select {
	case <-ctx.Done():
		fmt.Printf("Child %d-%d: exited reason: %s\n", p, c, ctx.Err())
	}
}

// Gor starts three children and hands each of them the cancel function.
func Gor(ctx context.Context, p int, cancel context.CancelFunc) {
	fmt.Printf("Goroutine:%d\n", p)
	for i := 0; i < 3; i++ {
		go SubGor(ctx, p, i, cancel)
	}
	select {
	case <-ctx.Done():
		fmt.Printf("Parent %d: exited reason: %s\n", p, ctx.Err())
		return
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	for i := 0; i < 3; i++ {
		go Gor(ctx, i, cancel)
	}
	time.Sleep(time.Second * 3)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
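As the example shows, calling the cancel function from a child goroutine cancels the parent goroutine just the same. This is not recommended, though, because it makes no logical sense: cancel should be left to whoever owns the right to cancel, so never overstep that role.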
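One way to keep cancel in the hands of its owner is for a child to report its failure over a channel and let the goroutine that created the context decide to cancel. The following is a minimal sketch under stated assumptions: runWorkers is a hypothetical function, and the failing argument is assumed to be a valid worker index, so exactly one worker reports an error.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// runWorkers (hypothetical) starts n workers. The worker whose id equals
// failing reports an error; the others wait for cancellation.
// failing is assumed to be in the range [0, n).
func runWorkers(n, failing int) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	errc := make(chan error, n) // buffered so a reporting worker never blocks
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if id == failing {
				// The child only reports; it is never given cancel.
				errc <- fmt.Errorf("worker %d failed", id)
				return
			}
			<-ctx.Done() // stand-in for real work that watches ctx
		}(i)
	}

	err := <-errc // the owner learns about the failure...
	cancel()      // ...and is the only one who cancels
	wg.Wait()
	return err
}

func main() {
	fmt.Println(runWorkers(3, 1))
}
```

The children receive only ctx, so none of them can stop the parent directly; they can merely ask for it by reporting an error.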
Question: have you ever thought about how context cancellation actually executes?
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// dealDone watches ctx.Done() and exits once the context is canceled.
func dealDone(ctx context.Context, i int) {
	fmt.Printf("%d: deal done chan\n", i)
	select {
	case <-ctx.Done():
		fmt.Printf("%d: exited, reason: %s\n", i, ctx.Err())
		return
	}
}

// notDealDone never looks at ctx, so canceling ctx cannot stop it.
func notDealDone(ctx context.Context, i int) {
	fmt.Printf("%d: not deal done chan\n", i)
	for {
		i++
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	for i := 0; i < 5; i++ {
		if i == 4 {
			go notDealDone(ctx, i)
		} else {
			go dealDone(ctx, i)
		}
	}
	time.Sleep(time.Second * 3)
	fmt.Println("Execute Cancel Func")
	cancel()
	time.Sleep(time.Second * 3)
	fmt.Printf("Current Active Goroutine Num: %d\n", runtime.NumGoroutine())
}
Programs that use Contexts should follow these rules to keep interfaces consistent across packages and enable static analysis tools to check context propagation:
func DoSomething(ctx context.Context, arg Arg) error {
// ... use ctx ...
}