Golang sync.Once 源码浅析

本文分析了Golang sync.Once 源码,并由此引申,简单讨论了单例模式的实现、 atomic 包的作用和 Java volatile 的使用。

sync.Once 使用例子

sync.Once 用于保证一个函数只被调用一次。它可以用于实现单例模式。

有如下类型:

type instance struct {
	val int
}

假设我们需要单例模式,且需要将 instance 的初始化延迟到第一次访问它的时候,那么可以用 sync.Once:只需将单例的初始化函数传给 Once.Do,便可确保 initSingleton() 恰好执行一次。

var s *instance
var once sync.Once

func initSingleton() {
	s = new(instance)
	fmt.Println("instance is initializing...")
	time.Sleep(time.Second)
	s.val++
}

func GetInstance() *instance {
	once.Do(initSingleton)
	return s
}

多个 goroutine 并发调用 GetInstance() 仍能保证 initSingleton() 恰好执行一次。

sync.Once 实现原理

sync.Once 内部非常简单,只有一个标识传入的函数是否已经执行的无符号整型,以及一个互斥锁。

type Once struct {
	done uint32
	m    Mutex
}

由上述使用例子,多个 goroutine 调用 Do 仍能保证传入的函数恰好被执行一次。 Do 首先检查其 done 成员是否为零,若为零,说明初始化还未完成,这时加锁,重新检查 done 的值确保还未初始化,并调用初始化函数 f()。调用返回后,将 done 修改为1,指示已经初始化。

func (o *Once) Do(f func()) {
	if atomic.LoadUint32(&o.done) == 0 {
		// Outlined slow-path to allow inlining of the fast-path.
		o.doSlow(f)
	}
}

func (o *Once) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}

多个 goroutine 同时调用 Once.Do 会发生什么?

假设多个 goroutine 发现 done 的值为零,同时进入了 doSlow 方法,因为 doSlow 方法需要加锁,只有一个 goroutine 能够执行 f(),其余 goroutine 将阻塞。当执行 f() 的 goroutine 返回前更新 done 值后解锁,其余 goroutine 能够继续执行 doSlow,再次检查 done,发现已经不为零,说明在等待锁的间隙已经有其它 goroutine 调用 f() 完成了初始化,当前 goroutine 解锁并返回。

为什么加了锁之后不需要用原子读取函数 atomic.LoadUint32

这是因为互斥锁 m 保护了 done 字段不会被并发修改、读取。可以安全地读取 done。不同的是,doSlow 之前对 done 的读取必须是原子读取,否则这里将存在一个 data race。

为什么加锁后仍要用 atomic.StoreUint32,而不是直接赋值 done = 1

因为 done 不是 volatile 的,直接赋值无法保证可见性。也不能确保 done = 1 不被重排序到 f() 之前。关于 atomic load/store,参考如下:

What is the point of sync/atomic.(Load|Store)Int32 ?

However, the atomic load and store provide another property. If one processor executes “a = 1; b = 1” (let’s say that a and b are always 0 before) and another processor executes “if b { c = a }” then if the “b = 1” uses a non-atomic store, or the “if b” uses a non-atomic load, then it is entirely possible that the second processor will read the old value of a and set c to 0. That is, using a non-atomic load or store does not provide any ordering guarantees with regard to other memory that may have been set by the other processor.

You almost never care about only atomicity. There is also ordering (as Ian described) and visibility (loads/stores must be visible to other goroutines in a finite amount of time, this is not true for non-atomic loads/store). And there are also data races, which render behavior of your program undefined. All the same applies to C/C++ as well.

Why supporting atomic.Load and atomic.Store in Go?

Because of ordering guarantees, and memory operation visibility. For instance:
y:=0
x:=0
x=1
y=1
In the above program, another goroutine can see (0,0), (0,1), (1,0), or (1,1) for x and y. This is because of compiler reordering the code, compiler optimization,s or because of memory operation reordering at the hardware level. However:
y:=0
x:=0
x:=1
atomic.StoreInt64(&y,1)
If another goroutine sees atomic.LoadInt64(&y)==1, then the goroutine is guaranteed to see x=1.

为什么不能 atomic.CompareAndSwapUint32(&o.done, 0, 1) 判断为 true 后直接调用 f() 初始化?

如下所示:

func (o *Once) Do(f func()) {
	if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
		f()
	}
}

多个 goroutine 进入 Do 时,能够保证 f() 只被调用一次,但是不能保证 goroutine 返回时初始化已经完成。但是这种方法可以用于 Once 的异步实现。即一个 goroutine 发现该实例还未初始化完成,立刻返回并继续做其他事情。

单例的错误实现

sync.Once 利用 atomic 包实现了「只调用一次」的语义。可以只用一个互斥锁,先判断是否初始化,如果还没初始化,加锁,再判断是否已经初始化,才进行初始化。如下 GetInstanceV2() 所示。

package singleton

import (
	"sync"
)

type instance struct {
	val int
}

var s *instance
var once sync.Once
var mu sync.Mutex

func initSingleton() {
	s = new(instance)
	fmt.Println("instance is initializing...")
	time.Sleep(time.Second)
	s.val++
}

func GetInstance() *instance {
	once.Do(initSingleton)
	return s
}

func GetInstanceV2() *instance {
	// 先不加锁判断
	if s == nil {
		// 未初始化,加锁
		mu.Lock()
		defer mu.Unlock()
		// 加锁后重新判断
		if s == nil {
			// 进行初始化
			initSingleton()
		}
	}
	return s
}

事实上,在 GetInstanceV2 中第一次读取 s 没有加锁,又因为 s 不是 volatile 类型的(Go 也没有 volatile),当能够看到 s != nil 时,也不能保证 s 已经初始化完成,所以 GetInstanceV2 实现是有问题的。如果用 Java 实现,可以将 s 声明为 volatile,那么某线程初始化给 s 赋值后,其它线程能立刻看到 s != null

为了验证上述例子存在并发问题,编写测试用例如下:

func TestGetInstanceV2(t *testing.T) {
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			GetInstanceV2()
			wg.Done()
		}()
	}
	wg.Wait()
	assert.True(t, s.val == 1)
}

上述测试用例创建了 100 个 goroutine 同时调用 GetInstanceV2

测试如下:

go test -v -race -run TestGetInstanceV2

=== RUN   TestGetInstanceV2
==================
WARNING: DATA RACE
Read at 0x0000014380a8 by goroutine 9:
  ...
Previous write at 0x0000014380a8 by goroutine 8:
  ...
Goroutine 9 (running) created at:
  ...
Goroutine 8 (finished) created at:
  ...
==================
    testing.go:1312: race detected during execution of test
--- FAIL: TestGetInstanceV2 (0.01s)
=== CONT  
    testing.go:1312: race detected during execution of test
FAIL
exit status 1

上述报错说明了问题的存在。

Java 单例模式实现

附上 Java 的单例模式,实例必须声明为 volatile:

public class Singleton {  
    private volatile static Singleton singleton;  
    private Singleton (){}  
    public static Singleton getSingleton() {  
    if (singleton == null) {  
        synchronized (Singleton.class) {  
            if (singleton == null) {  
                singleton = new Singleton();  
            }  
        }  
    }  
    return singleton;  
    }  
}

类似错误情形

在 Go 语言中,同一个 Goroutine 线程内部,顺序一致性内存模型是得到保证的。但是不同的 Goroutine 之间,并不满足顺序一致性内存模型,需要通过明确定义的同步事件来作为同步的参考。

情形一

在 The Official Golang Blog 中描述了类似的情形:

Double-checked locking is an attempt to avoid the overhead of synchronization. For example, the twoprint program might be incorrectly written as:

var a string
var done bool

func setup() {
	// 先赋值,后设置 done
	a = "hello, world"
	done = true
}

func doprint() {
	if !done {
		once.Do(setup)
	}
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}

but there is no guarantee that, in doprint, observing the write to done implies observing the write to a. This version can (incorrectly) print an empty string instead of “hello, world”.

意思是说,doprint发现 donetrue 时,并不能确保它能看到 a 的值已经初始化。没有同步保证 a 先初始化,再设置 done

情形二

Another incorrect idiom is busy waiting for a value, as in:

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func main() {
	go setup()
	for !done {
	}
	print(a)
}

As before, there is no guarantee that, in main, observing the write to done implies observing the write to a, so this program could print an empty string too. Worse, there is no guarantee that the write to done will ever be observed by main, since there are no synchronization events between the two threads. The loop in main is not guaranteed to finish.

这是上一个例子的 busy waiting 变种,同样不能保证 a 先初始化再设置 done

情形三

There are subtler variants on this theme, such as this program.

type T struct {
	msg string
}

var g *T

func setup() {
	t := new(T)
	t.msg = "hello, world"	// 1
	g = t					// 2
}

func main() {
	go setup()
	for g == nil {
	}
	print(g.msg)
}

Even if main observes g != nil and exits its loop, there is no guarantee that it will observe the initialized value for g.msg.

上述错误更为隐晦,即使 main 发现 g 已经不为 nil 了,也无法保证 g.msg 已经设置,也就是说,不能确保代码中 语句1 和 语句2 的先后顺序。

你可能感兴趣的:(Go,java,单例模式,开发语言)