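Contents
The Channel Closing Principle
Ways to close channels (not so gracefully)
Ways to close channels (somewhat more politely)
Ways to close channels gracefully
1. M receivers, one sender
2. One receiver, N senders
3. M receivers, N senders
4. A variant of the "M receivers, one sender" situation
5. A variant of the "N senders" situation
More situations?
Conclusions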
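This article is translated from "How to Gracefully Close Channels".
A few days ago, I wrote an article explaining how channels are used in Go. It was well received on reddit and HN, but it also drew some criticism of the design details of channels.
The criticisms of the design and rules of channels were:
- There is no simple, universal way to check whether a channel is closed without modifying the status of the channel.
- Closing a channel that is already closed causes a panic, so closing a channel is dangerous if the closer doesn't know whether the channel is closed.
- Sending a value to a closed channel causes a panic, so sending to a channel is dangerous if the sender doesn't know whether the channel is closed.
These criticisms look reasonable (in fact they are not). Yes, there is indeed no built-in function to check whether a channel has been closed.
There is, however, a simple way to check whether a channel is closed, provided you can make sure that no values were ever sent (or will ever be sent) to the channel. The method was shown in the previous article. For better coherence, it is listed again in the following example: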
package main

import "fmt"

type T int

func IsClosed(ch <-chan T) bool {
	select {
	case <-ch:
		return true
	default:
	}

	return false
}

func main() {
	c := make(chan T)
	fmt.Println(IsClosed(c)) // false
	close(c)
	fmt.Println(IsClosed(c)) // true
}
As mentioned above, this is not a universal way to check whether a channel is closed.
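To see why the check is not universal, consider a channel that still holds a buffered value. The sketch below reuses the IsClosed function from above; the helper probeOpenBuffered is a made-up name for illustration only. On an open channel the check reports true, and it also consumes the pending value, which is exactly the "modifies the channel status" problem raised by the critics.

```go
package main

import "fmt"

type T int

// IsClosed is the check from the article: it reports true whenever a
// receive succeeds, so it is only meaningful if no value is ever sent.
func IsClosed(ch <-chan T) bool {
	select {
	case <-ch:
		return true
	default:
	}
	return false
}

// probeOpenBuffered (hypothetical helper) runs the check on an open,
// buffered channel holding one value. It returns what the check
// reported and how many values are left in the buffer afterwards.
func probeOpenBuffered() (reported bool, remaining int) {
	c := make(chan T, 1)
	c <- 42 // the channel is open and holds one value
	reported = IsClosed(c)
	return reported, len(c)
}

func main() {
	reported, remaining := probeOpenBuffered()
	fmt.Println(reported, remaining) // true 0
}
```

The channel was never closed, yet the check says it was, and the value 42 is gone.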
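In fact, even if there were a built-in closed function to check whether a channel has been closed, its usefulness would be very limited, just like the built-in len function for checking the current number of values stored in the buffer of a channel. The reason is that the status of the checked channel may change right after a call to such a function returns, so the returned value no longer reflects the latest status of the channel. Although it is fine to stop sending values to a channel ch if the call closed(ch) returns true, it is not safe to close the channel or to continue sending values to it if the call closed(ch) returns false.
One general principle of using Go channels is: don't close a channel from the receiver side, and don't close a channel that has multiple concurrent senders. In other words, we should only close a channel in a sender goroutine if that sender is the only sender of the channel.
(Below, we will call the above principle the channel closing principle.)
Of course, this is not a universal principle for closing channels. The universal principle is: don't close, or send values to, closed channels. If we can guarantee that no goroutine will close or send values to a non-closed, non-nil channel any more, then a goroutine can close the channel safely. In practice, however, making such a guarantee from a receiver or from one of many senders of a channel usually takes a lot of effort and often makes the code complicated. By contrast, the channel closing principle above is much easier to hold.
If you want to close a channel from the receiver side or from one of the multiple senders of the channel anyway, you can use the recover mechanism to keep the possible panic from crashing your program. Here is an example (assume the channel element type is T).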
func SafeClose(ch chan T) (justClosed bool) {
	defer func() {
		if recover() != nil {
			// The return result can be altered
			// in a defer function call.
			justClosed = false
		}
	}()

	// assume ch != nil here.
	close(ch)   // panic if ch is closed
	return true // <=> justClosed = true; return
}
The above solution obviously breaks the channel closing principle.
The same idea can be used for sending values to a potentially closed channel.
func SafeSend(ch chan T, value T) (closed bool) {
	defer func() {
		if recover() != nil {
			closed = true
		}
	}()

	ch <- value  // panic if ch is closed
	return false // <=> closed = false; return
}
The above inelegant solution not only breaks the channel closing principle, but data races may also happen in the process.
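As a small illustration of how the two recover-based helpers behave, the following sketch calls them in sequence on one channel. The demo function and the buffered channel are assumptions made for this example only; SafeClose and SafeSend are the functions shown above.

```go
package main

import "fmt"

type T int

func SafeClose(ch chan T) (justClosed bool) {
	defer func() {
		if recover() != nil {
			justClosed = false
		}
	}()
	close(ch) // panics if ch is already closed
	return true
}

func SafeSend(ch chan T, value T) (closed bool) {
	defer func() {
		if recover() != nil {
			closed = true
		}
	}()
	ch <- value // panics if ch is closed
	return false
}

// demo (hypothetical helper) exercises both functions on one channel
// and returns their results in call order.
func demo() (send1, close1, close2, send2 bool) {
	ch := make(chan T, 1)   // buffered so the first send does not block
	send1 = SafeSend(ch, 1) // channel is open: reports closed == false
	close1 = SafeClose(ch)  // first close succeeds
	close2 = SafeClose(ch)  // second close panics internally, recovered
	send2 = SafeSend(ch, 2) // send on closed channel panics, recovered
	return
}

func main() {
	fmt.Println(demo()) // false true false true
}
```

Note that the sketch is single-threaded on purpose; it shows the return values, not a way to make concurrent close and send operations safe.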
Many people prefer using sync.Once to close channels:
type MyChannel struct {
	C    chan T
	once sync.Once
}

func NewMyChannel() *MyChannel {
	return &MyChannel{C: make(chan T)}
}

func (mc *MyChannel) SafeClose() {
	mc.once.Do(func() {
		close(mc.C)
	})
}
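The sync.Once wrapper above can be exercised from many goroutines at once. The following is a minimal sketch; the closeConcurrently helper is a made-up name for illustration, and it only checks that repeated closes do not panic, not that concurrent sends are safe.

```go
package main

import (
	"fmt"
	"sync"
)

type T int

type MyChannel struct {
	C    chan T
	once sync.Once
}

func NewMyChannel() *MyChannel {
	return &MyChannel{C: make(chan T)}
}

func (mc *MyChannel) SafeClose() {
	mc.once.Do(func() {
		close(mc.C)
	})
}

// closeConcurrently (hypothetical helper) calls SafeClose from n
// goroutines and reports whether the channel is closed afterwards.
func closeConcurrently(n int) bool {
	mc := NewMyChannel()

	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			mc.SafeClose() // only the first call actually closes mc.C
		}()
	}
	wg.Wait()

	// A receive from a closed channel returns at once with ok == false.
	_, ok := <-mc.C
	return !ok
}

func main() {
	fmt.Println(closeConcurrently(100)) // true
}
```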
Of course, we can also use sync.Mutex to avoid closing the same channel more than once:
type MyChannel struct {
	C      chan T
	closed bool
	mutex  sync.Mutex
}

func NewMyChannel() *MyChannel {
	return &MyChannel{C: make(chan T)}
}

func (mc *MyChannel) SafeClose() {
	mc.mutex.Lock()
	defer mc.mutex.Unlock()
	if !mc.closed {
		close(mc.C)
		mc.closed = true
	}
}

func (mc *MyChannel) IsClosed() bool {
	mc.mutex.Lock()
	defer mc.mutex.Unlock()
	return mc.closed
}
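These solutions may be somewhat more polite, but they may not avoid data races. Currently, the Go specification does not guarantee that no data race happens when a channel close and a channel send are executed concurrently. If SafeClose is called concurrently with a send operation on the same channel, a data race may happen (though such data races generally do no harm). Neither wrapper protects senders, either: a send on mc.C after the channel has been closed still panics.
One drawback of the SafeSend function above is that a call to it cannot be used as the send operation of a case branch in a select block. Another drawback of SafeSend and SafeClose is that many people, including me, consider solutions built on panic/recover and the sync package inelegant. The following parts introduce pure-channel solutions, without panic/recover or the sync package, for all kinds of situations.
(In the following examples, sync.WaitGroup is used to make the examples complete. It may not always be necessary in real practice.)
With M receivers and one sender, we have the simplest situation: just let the sender close the data channel when it does not want to send any more data.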
package main

import (
	"log"
	"math/rand"
	"sync"
	"time"
)

func main() {
	rand.Seed(time.Now().UnixNano()) // needed before Go 1.20
	log.SetFlags(0)

	// ...
	const Max = 100000
	const NumReceivers = 100

	wgReceivers := sync.WaitGroup{}
	wgReceivers.Add(NumReceivers)

	// ...
	dataCh := make(chan int)

	// the sender
	go func() {
		for {
			if value := rand.Intn(Max); value == 0 {
				// The only sender can close the
				// channel at any time safely.
				close(dataCh)
				return
			} else {
				dataCh <- value
			}
		}
	}()

	// receivers
	for i := 0; i < NumReceivers; i++ {
		go func() {
			defer wgReceivers.Done()

			// Receive values until dataCh is
			// closed and the value buffer queue
			// of dataCh becomes empty.
			for value := range dataCh {
				log.Println(value)
			}
		}()
	}

	wgReceivers.Wait()
}
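The situation with one receiver and N senders is a little more complicated than the one above. We can't let the receiver close the data channel to stop the data transfer, since doing so would break the channel closing principle. But we can let the receiver close an additional signal channel to notify the senders to stop sending values.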
package main

import (
	"log"
	"math/rand"
	"sync"
	"time"
)

func main() {
	rand.Seed(time.Now().UnixNano()) // needed before Go 1.20
	log.SetFlags(0)

	// ...
	const Max = 100000
	const NumSenders = 1000

	wgReceivers := sync.WaitGroup{}
	wgReceivers.Add(1)

	// ...
	dataCh := make(chan int)
	stopCh := make(chan struct{})
	// stopCh is an additional signal channel.
	// Its sender is the receiver of channel
	// dataCh, and its receivers are the
	// senders of channel dataCh.

	// senders
	for i := 0; i < NumSenders; i++ {
		go func() {
			for {
				// The try-receive operation is to try
				// to exit the goroutine as early as
				// possible. For this specified example,
				// it is not essential.
				select {
				case <-stopCh:
					return
				default:
				}

				// Even if stopCh is closed, the first
				// branch in the second select may be
				// still not selected for some loops if
				// the send to dataCh is also unblocked.
				// But this is acceptable for this
				// example, so the first select block
				// above can be omitted.
				select {
				case <-stopCh:
					return
				case dataCh <- rand.Intn(Max):
				}
			}
		}()
	}

	// the receiver
	go func() {
		defer wgReceivers.Done()

		for value := range dataCh {
			if value == Max-1 {
				// The receiver of channel dataCh is
				// also the sender of stopCh. It is
				// safe to close the stop channel here.
				close(stopCh)
				return
			}

			log.Println(value)
		}
	}()

	// ...
	wgReceivers.Wait()
}
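As mentioned in the code comments, the sender of the additional signal channel is the receiver of the data channel. The additional signal channel is closed by its only sender, which holds the channel closing principle.
In this example, the channel dataCh is never closed. Yes, channels don't have to be closed. A channel will eventually be garbage collected once no goroutine references it any more, whether it is closed or not. So the graceful way to close the data channel here is not to close it at all.
The situation with M receivers and N senders is the most complicated one. We can't let any of the receivers or senders close the data channel. And we can't let any of the receivers close an additional signal channel to notify all senders and receivers to exit the game. Doing either would break the channel closing principle. However, we can introduce a moderator role to close the additional signal channel. One trick in the following example is how the try-send operation is used to notify the moderator to close the additional signal channel.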
package main

import (
	"log"
	"math/rand"
	"strconv"
	"sync"
	"time"
)

func main() {
	rand.Seed(time.Now().UnixNano()) // needed before Go 1.20
	log.SetFlags(0)

	// ...
	const Max = 100000
	const NumReceivers = 10
	const NumSenders = 1000

	wgReceivers := sync.WaitGroup{}
	wgReceivers.Add(NumReceivers)

	// ...
	dataCh := make(chan int)
	stopCh := make(chan struct{})
	// stopCh is an additional signal channel.
	// Its sender is the moderator goroutine shown
	// below, and its receivers are all senders
	// and receivers of dataCh.
	toStop := make(chan string, 1)
	// The channel toStop is used to notify the
	// moderator to close the additional signal
	// channel (stopCh). Its senders are any senders
	// and receivers of dataCh, and its receiver is
	// the moderator goroutine shown below.
	// It must be a buffered channel.

	var stoppedBy string

	// moderator
	go func() {
		stoppedBy = <-toStop
		close(stopCh)
	}()

	// senders
	for i := 0; i < NumSenders; i++ {
		go func(id string) {
			for {
				value := rand.Intn(Max)
				if value == 0 {
					// Here, the try-send operation is
					// to notify the moderator to close
					// the additional signal channel.
					select {
					case toStop <- "sender#" + id:
					default:
					}
					return
				}

				// The try-receive operation here is to
				// try to exit the sender goroutine as
				// early as possible. Try-receive and
				// try-send select blocks are specially
				// optimized by the standard Go
				// compiler, so they are very efficient.
				select {
				case <-stopCh:
					return
				default:
				}

				// Even if stopCh is closed, the first
				// branch in this select block might be
				// still not selected for some loops
				// (and for ever in theory) if the send
				// to dataCh is also non-blocking. If
				// this is unacceptable, then the above
				// try-receive operation is essential.
				select {
				case <-stopCh:
					return
				case dataCh <- value:
				}
			}
		}(strconv.Itoa(i))
	}

	// receivers
	for i := 0; i < NumReceivers; i++ {
		go func(id string) {
			defer wgReceivers.Done()

			for {
				// Same as the sender goroutine, the
				// try-receive operation here is to
				// try to exit the receiver goroutine
				// as early as possible.
				select {
				case <-stopCh:
					return
				default:
				}

				// Even if stopCh is closed, the first
				// branch in this select block might be
				// still not selected for some loops
				// (and forever in theory) if the receive
				// from dataCh is also non-blocking. If
				// this is not acceptable, then the above
				// try-receive operation is essential.
				select {
				case <-stopCh:
					return
				case value := <-dataCh:
					if value == Max-1 {
						// Here, the same trick is
						// used to notify the moderator
						// to close the additional
						// signal channel.
						select {
						case toStop <- "receiver#" + id:
						default:
						}
						return
					}

					log.Println(value)
				}
			}
		}(strconv.Itoa(i))
	}

	// ...
	wgReceivers.Wait()
	log.Println("stopped by", stoppedBy)
}
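In this example, the channel closing principle still holds.
Note that the buffer size (capacity) of the channel toStop is one. This keeps the first notification from being missed when it is sent before the moderator goroutine is ready to receive from toStop.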
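The effect of the buffer can be seen in isolation. The sketch below is an illustration only: tryNotify is a made-up helper that performs the same try-send as the senders and receivers above and reports whether the notification was accepted. No moderator is running, which mimics the moment before the moderator goroutine is ready.

```go
package main

import "fmt"

// tryNotify (hypothetical helper) performs a try-send on toStop and
// reports whether the notification was accepted.
func tryNotify(toStop chan string, who string) bool {
	select {
	case toStop <- who:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan string)
	buffered := make(chan string, 1)

	// Nobody is receiving yet, so the unbuffered try-send is dropped.
	fmt.Println(tryNotify(unbuffered, "sender#0")) // false

	// With capacity one, the first notification is kept in the buffer.
	fmt.Println(tryNotify(buffered, "sender#0")) // true

	// Later notifications are dropped, which is fine: one is enough.
	fmt.Println(tryNotify(buffered, "receiver#1")) // false

	// The moderator would later pick up the first notification.
	fmt.Println(<-buffered) // sender#0
}
```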
We can also set the capacity of toStop to the total number of senders and receivers; then we don't need a try-send select block to notify the moderator.
...
toStop := make(chan string, NumReceivers + NumSenders)
...
			value := rand.Intn(Max)
			if value == 0 {
				toStop <- "sender#" + id
				return
			}
...
				if value == Max-1 {
					toStop <- "receiver#" + id
					return
				}
...
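Sometimes the close signal must be issued by a third-party goroutine. For such cases, a variant of the "M receivers, one sender" situation, we can use an additional signal channel to notify the sender to close the data channel. For example: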
package main

import (
	"log"
	"math/rand"
	"sync"
	"time"
)

func main() {
	rand.Seed(time.Now().UnixNano()) // needed before Go 1.20
	log.SetFlags(0)

	// ...
	const Max = 100000
	const NumReceivers = 100
	const NumThirdParties = 15

	wgReceivers := sync.WaitGroup{}
	wgReceivers.Add(NumReceivers)

	// ...
	dataCh := make(chan int)
	closing := make(chan struct{}) // signal channel
	closed := make(chan struct{})

	// The stop function can be called
	// multiple times safely.
	stop := func() {
		select {
		case closing <- struct{}{}:
			<-closed
		case <-closed:
		}
	}

	// some third-party goroutines
	for i := 0; i < NumThirdParties; i++ {
		go func() {
			r := 1 + rand.Intn(3)
			time.Sleep(time.Duration(r) * time.Second)
			stop()
		}()
	}

	// the sender
	go func() {
		defer func() {
			close(closed)
			close(dataCh)
		}()

		for {
			select {
			case <-closing:
				return
			default:
			}

			select {
			case <-closing:
				return
			case dataCh <- rand.Intn(Max):
			}
		}
	}()

	// receivers
	for i := 0; i < NumReceivers; i++ {
		go func() {
			defer wgReceivers.Done()

			for value := range dataCh {
				log.Println(value)
			}
		}()
	}

	wgReceivers.Wait()
}
The idea used in the stop function comes from a comment made by Roger Peppe.
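To make the "can be called multiple times safely" property concrete, here is a minimal sketch that isolates the stop function. The runStops helper and its counter are assumptions added for illustration; the goroutine that waits on closing stands in for the real sender.

```go
package main

import (
	"fmt"
	"sync"
)

// runStops (hypothetical helper) calls stop from several goroutines
// and returns how many closing signals the stand-in sender received.
func runStops(callers int) int {
	closing := make(chan struct{})
	closed := make(chan struct{})
	signals := 0

	// Same shape as the stop function in the article.
	stop := func() {
		select {
		case closing <- struct{}{}:
			<-closed
		case <-closed:
		}
	}

	// Stand-in for the sender: it accepts exactly one closing
	// signal and then announces that it has shut down.
	go func() {
		<-closing
		signals++
		close(closed)
	}()

	var wg sync.WaitGroup
	wg.Add(callers)
	for i := 0; i < callers; i++ {
		go func() {
			defer wg.Done()
			stop() // every call returns once closed is closed
		}()
	}
	wg.Wait()

	stop() // a late call returns immediately: closed is already closed
	return signals
}

func main() {
	fmt.Println(runStops(15)) // 1
}
```

Only one caller gets its signal through; all the others fall into the second branch once closed is closed, so no call blocks forever and nothing is closed twice.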
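In the above solutions for the N-sender situations, we avoid closing the data channel in order to hold the channel closing principle. Sometimes, however, the data channel must eventually be closed to let the receivers know that data sending is over. For such cases, we can convert an N-sender situation into a one-sender situation by using a middle channel. The N senders send to the middle channel, which is never closed, and a single middle-layer goroutine forwards the values to the data channel. Since that goroutine is the only sender of the data channel, it can close it safely.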
package main

import (
	"log"
	"math/rand"
	"strconv"
	"sync"
	"time"
)

func main() {
	rand.Seed(time.Now().UnixNano()) // needed before Go 1.20
	log.SetFlags(0)

	// ...
	const Max = 1000000
	const NumReceivers = 10
	const NumSenders = 1000
	const NumThirdParties = 15

	wgReceivers := sync.WaitGroup{}
	wgReceivers.Add(NumReceivers)

	// ...
	dataCh := make(chan int)     // will be closed
	middleCh := make(chan int)   // will never be closed
	closing := make(chan string) // signal channel
	closed := make(chan struct{})

	var stoppedBy string

	// The stop function can be called
	// multiple times safely.
	stop := func(by string) {
		select {
		case closing <- by:
			<-closed
		case <-closed:
		}
	}

	// the middle layer
	go func() {
		exit := func(v int, needSend bool) {
			close(closed)
			if needSend {
				dataCh <- v
			}
			close(dataCh)
		}

		for {
			select {
			case stoppedBy = <-closing:
				exit(0, false)
				return
			case v := <-middleCh:
				select {
				case stoppedBy = <-closing:
					exit(v, true)
					return
				case dataCh <- v:
				}
			}
		}
	}()

	// some third-party goroutines
	for i := 0; i < NumThirdParties; i++ {
		go func(id string) {
			r := 1 + rand.Intn(3)
			time.Sleep(time.Duration(r) * time.Second)
			stop("3rd-party#" + id)
		}(strconv.Itoa(i))
	}

	// senders
	for i := 0; i < NumSenders; i++ {
		go func(id string) {
			for {
				value := rand.Intn(Max)
				if value == 0 {
					stop("sender#" + id)
					return
				}

				select {
				case <-closed:
					return
				default:
				}

				select {
				case <-closed:
					return
				case middleCh <- value:
				}
			}
		}(strconv.Itoa(i))
	}

	// receivers
	for range [NumReceivers]struct{}{} {
		go func() {
			defer wgReceivers.Done()

			for value := range dataCh {
				log.Println(value)
			}
		}()
	}

	// ...
	wgReceivers.Wait()
	log.Println("stopped by", stoppedBy)
}
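There are certainly more situation variants, but the ones shown above are the most common and basic. By using channels (and other concurrent programming techniques) cleverly, I believe a solution that holds the channel closing principle can always be found for each variant.
No situation will force you to break the channel closing principle. If you run into one that seems to, please rethink your design and rewrite your code.
Programming with Go channels is like making art.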