Java 并发编程实践 3.1 可见性(Visibility)

3.1. Visibility(可见性)
Visibility is subtle because the things that can go wrong are so counterintuitive. In a single-threaded environment, if you write a value to a variable and later read that variable with no intervening writes, you can expect to get the same value back. This seems only natural. It may be hard to accept at first, but when the reads and writes occur in different threads, this is simply not the case. In general, there is no guarantee that the reading thread will see a value written by another thread on a timely basis, or even at all. In order to ensure visibility of memory writes across threads, you must use synchronization.
可见性可能是非常微妙的,因为经常会有违反直觉的错误发生。在单线程环境中,如果你在某一个时刻写入变量,过了一段时间之后读取变量,如果这之间没有写入的话,你应该可以得到相同的值。这看上去非常自然。尽管看上去好像无法理解,当在多线程环境中进行读写操作的时候,事情就有有一些不同。基本上,无法保证读线程可以准时的获取到写线程写入的值。为了确保写操作对于其他线程的内存可见性,你必须使用同步机制。
NoVisibility in Listing 3.1 illustrates what can go wrong when threads share data without synchronization. Two threads, the main thread and the reader thread, access the shared variables ready and number. The main thread starts the reader thread and then sets number to 42 and ready to true. The reader thread spins until it sees ready is true, and then prints out number. While it may seem obvious that NoVisibility will print 42, it is in fact possible that it will print zero, or never terminate at all! Because it does not use adequate synchronization, there is no guarantee that the values of ready and number written by the main thread will be visible to the reader thread.
Listing3.1中的代码展现了如果不使用同步,线程共享数据的时候可能会出错。主线程和读线程都会访问共享变量ready和number。主线程创建了Reader线程,然后把number设置成42,ready设成true。当Reader线程会陷入死循环中,一直等到发现准备好为止。很明显NoVisibility类会打印42.实际上也有可能会打印0或者永远不会停止。因为没有利用足够的同步机制,这样就无法确保ready和number的值被主线程的修改对读线程是可见的。
Listing 3.1. Sharing Variables without Synchronization. Don't Do this.

public class NoVisibility { 
    private static boolean ready; 
    private static int number; 

    private static class ReaderThread extends Thread { 
        public void run() { 
            while (!ready) 
                Thread.yield(); 
            System.out.println(number); 
        } 
    } 

    public static void main(String[] args) { 
        new ReaderThread().start(); 
        number = 42; 
        ready = true; 
    } 
} 

NoVisibility could loop forever because the value of ready might never become visible to the reader thread. Even more strangely, NoVisibility could print zero because the write to ready might be made visible to the reader thread before the write to number, a phenomenon known as reordering. There is no guarantee that operations in one thread will be performed in the order given by the program, as long as the reordering is not detectable from within that thread, even if the reordering is apparent to other threads.[1] When the main thread writes first to number and then to ready without synchronization, the reader thread could see those writes happen in the opposite order or not at all.
NoVisibility类可能会无限循环下去,因为ready值可能对Reader线程来说一直是不可见的。在更为特殊的情况下,NoVisibility甚至可能会打印出0.因为可能会在写入number之前写入ready值,这是由于著名的“reordering”现象。尽管“recordering”对其他线程可能是可以被察觉的,但是只要在某一个线程内部该现象无法被察觉,那么就无法确保操作以程序中给定的顺序执行。当主线程在没有同步机制的情况下,首先写入数值,然后再去设置ready的值的话,Reader线程可能会察觉到写入是以相反的顺序发生的,甚至可能根本就没有发生。
[1] This may seem like a broken design, but it is meant to allow JVMs to take full advantage of the performance of modern multiprocessor hardware. For example, in the absence of synchronization, the Java Memory Model permits the compiler to reorder operations and cache values in registers, and permits CPUs to reorder operations and cache values in processor-specific caches. For more details, see Chapter 16.
这可能看上去是一个很差的设计,但是却可以使得JVM充分的利用现代多处理器硬件的所有优势。例如,在没有同步机制的情况下,Java的内存模型允许编译器打乱操作顺序并且可以在寄存器中缓存数值。允许cpu打乱指令的操作顺序,并且可以在处理器级别的缓存中缓存数值。第十六章中,有更详细的描述。
In the absence of synchronization, the compiler, processor, and runtime can do some downright weird things to the order in which operations appear to execute. Attempts to reason about the order in which memory actions “must” happen in insufficiently synchronized multithreaded programs will almost certainly be incorrect.
在没有同步机制的情况下,编译器、处理器以及运行时环境可能会彻底将表明的执行指令弄乱。在没有足够同步机制的多线程程序中,要想弄清楚内存动作的顺序可能会是错误的。
NoVisibility is about as simple as a concurrent program can get (two threads and two shared variables), and yet it is still all too easy to come to the wrong conclusions about what it does or even whether it will terminate. Reasoning about insufficiently synchronized concurrent programs is prohibitively difficult.
NoVisibility类非常简单的并发程序,只拥有两个线程和两个共享的变量,即便如此,我们还是很容易就得到了错误的结果,甚至可能会陷入死循环。并发程序同步机制不够的问题是极其难以发现的。
This may all sound a little scary, and it should. Fortunately, there's an easy way to avoid these complex issues: always use the proper synchronization whenever data is shared across threads.
可能听上去有些吓人,但是事实上的确是这样。幸运的是,有一种简单的方法可以避免这种复杂性,只要数据在线程之间被共享,就要永远使用合适的同步机制。
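As a minimal sketch of what "proper synchronization" could look like for NoVisibility, the variant below routes every access to the shared fields through methods synchronized on the same object. The class name SynchronizedVisibility and the helper runOnce are hypothetical and not part of the original listing; instance fields are used instead of static ones so each run starts from a fresh state.

```java
// Hypothetical synchronized variant of NoVisibility (not from the original text).
final class SynchronizedVisibility {
    private boolean ready;   // guarded by "this"
    private int number;      // guarded by "this"

    // Both writes happen while holding the lock, so a reader that sees
    // ready == true under the same lock also sees the new number.
    synchronized void publish(int value) {
        number = value;
        ready = true;
    }

    synchronized boolean isReady() { return ready; }

    synchronized int getNumber() { return number; }

    // Starts a reader thread, publishes 42 from the calling thread,
    // and returns the value the reader observed.
    static int runOnce() {
        SynchronizedVisibility shared = new SynchronizedVisibility();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!shared.isReady())
                Thread.yield();
            seen[0] = shared.getNumber();
        });
        reader.start();
        shared.publish(42);
        try {
            reader.join();   // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

Because the reader can only observe ready as true after publish has released the lock, this version can neither print zero nor spin forever once publish has run.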
3.1.1. Stale Data(过期数据)
NoVisibility demonstrated one of the ways that insufficiently synchronized programs can cause surprising results: stale data. When the reader thread examines ready, it may see an out-of-date value. Unless synchronization is used every time a variable is accessed, it is possible to see a stale value for that variable. Worse, staleness is not all-or-nothing: a thread can see an up-to-date value of one variable but a stale value of another variable that was written first.

NoVisibility类为我们展示了缺少足够的同步机制可能会引起令人诧异的结果:过期数据。当Reader线程检查ready值的时候,可能会看到过期值。除非在每个变量被访问的时候都是用同步机制,否则都有可能会看到某个变量的过期值。更为糟糕的是,过期并不是“all-or-nothing”的,一个线程可能会看到一个变量的最新值和另外一个变量的过期值。
When food is stale, it is usually still edible, just less enjoyable. But stale data can be more dangerous. While an out-of-date hit counter in a web application might not be so bad,[2] stale values can cause serious safety or liveness failures. In NoVisibility, stale values could cause it to print the wrong value or prevent the program from terminating. Things can get even more complicated with stale values of object references, such as the link pointers in a linked list implementation. Stale data can cause serious and confusing failures such as unexpected exceptions, corrupted data structures, inaccurate computations, and infinite loops.
当食物过期时候,食物仍然是可以吃的,只不过是没有那么美味而已。如果数据过期的话,那危险就来了。如果一个web应用的“点击计数”过期的话,情况或许并不算太糟糕。过期数据可能会引起严重的安全和存活性问题。在NoVisibility类中,过期数据可能会引起数据错误或者导致死循环。如果对象的应用变成过期数据的话,情况会更加复杂。
[2] Reading data without synchronization is analogous to using the READ_UNCOMMITTED isolation level in a database, where you are willing to trade accuracy for performance. However, in the case of unsynchronized reads, you are trading away a greater degree of accuracy, since the visible value for a shared variable can be arbitrarily stale.
在没有同步机制的情况下读取数据有点儿像在数据库中使用“未提交”的隔离级别,这样做可以获得比较高的性能。但是对于非同步读来说,你将会失去比较高的准确度,因为某一个共享变量的可见值可能是过期的。
MutableInteger in Listing 3.2 is not thread-safe because the value field is accessed from both get and set without synchronization. Among other hazards, it is susceptible to stale values: if one thread calls set, other threads calling get may or may not see that update.
由于值域被get和set方法在没有同步机制的情况下访问,Listing3.2中的MutableInteger类不是线程安全的。与其他并发威胁相比,过期数据可能是易受影响的。如果一个线程调用set方法,那么其它线程可能会也可能不会看到数据的更改。
We can make MutableInteger thread-safe by synchronizing the getter and setter as shown in SynchronizedInteger in Listing 3.3. Synchronizing only the setter would not be sufficient: threads calling get would still be able to see stale values.
可以通过使用同步的getter和setter方法将MutableInteger设置成线程安全的。只同步setter方法是不够的,调用get方法的线程还是会看到过期数据。
Listing 3.2. Non-thread-safe Mutable Integer Holder.

@NotThreadSafe 
public class MutableInteger { 
    private int value; 

    public int  get() { return value; } 
    public void set(int value) { this.value = value; } 
} 

Listing 3.3. Thread-safe Mutable Integer Holder.
@ThreadSafe 
public class SynchronizedInteger { 
    @GuardedBy("this") private int value; 

    public synchronized int get() { return value; } 
    public synchronized void set(int value) { this.value = value; } 
} 


3.1.2. Nonatomic 64-bit Operations(非原子的64比特操作)
When a thread reads a variable without synchronization, it may see a stale value, but at least it sees a value that was actually placed there by some thread rather than some random value. This safety guarantee is called out-of-thin-air safety.
当线程在没有同步机制的情况下读取变量的时候,线程可能会看到过期数据,但是至少这个数据是曾经被后一个线程放上的,而至于是一个随机数据。这种安全保证被称为“out-of-thin-air”。
Out-of-thin-air safety applies to all variables, with one exception: 64-bit numeric variables (double and long) that are not declared volatile (see Section 3.1.4). The Java Memory Model requires fetch and store operations to be atomic, but for nonvolatile long and double variables, the JVM is permitted to treat a 64-bit read or write as two separate 32-bit operations. If the reads and writes occur in different threads, it is therefore possible to read a nonvolatile long and get back the high 32 bits of one value and the low 32 bits of another.[3] Thus, even if you don't care about stale values, it is not safe to use shared mutable long and double variables in multithreaded programs unless they are declared volatile or guarded by a lock.
[3] When the Java Virtual Machine Specification was written, many widely used processor architectures could not efficiently provide atomic 64-bit arithmetic operations.
“Out-of-thin-air”适用于所有的变量操作,但是还是有例外,这就是那些没有被设置成volatile的64比特的数字操作(double和long)。JVM要求读取和存储数据必须是原子的,但是对于非volatile的long型和double型,JVM允许当做两个单独的32比特进行读写操作。如果读写操作出现在不同的线程中的时候,这就有可能会出现读取到一个nonvolatile的长整型或者得到高位的32比特或者低位的32比特。这样即使你不去关心过期数据的问题,在多线程环境中使用可变的long和double变量也是不安全的。除非他们被声明为volatile的或者被锁守护。当Java虚拟机规范制定的时候,很多被广泛使用的处理器架构无法有效地提供原子的64比特的算术操作。
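The following sketch illustrates the volatile remedy for shared 64-bit values. The class VolatileLongHolder and its helper countTornReads are hypothetical names introduced for illustration. A writer alternates between two bit patterns (all zeros and all ones) while the calling thread reads; because the field is volatile, every read must return one of those two patterns in full, never a mix of the high half of one and the low half of the other.

```java
// Hypothetical holder for a shared 64-bit value (not from the original text).
final class VolatileLongHolder {
    private volatile long value;   // volatile makes 64-bit reads and writes atomic

    long get() { return value; }

    void set(long newValue) { value = newValue; }

    // A writer thread alternates between 0L and -1L (all bits set) while the
    // calling thread reads. Returns how many reads saw any other bit pattern.
    static int countTornReads(int iterations) {
        VolatileLongHolder holder = new VolatileLongHolder();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++)
                holder.set((i & 1) == 0 ? 0L : -1L);
        });
        writer.start();
        int torn = 0;
        for (int i = 0; i < iterations; i++) {
            long v = holder.get();
            if (v != 0L && v != -1L)
                torn++;
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return torn;
    }
}
```

Removing volatile from the field would make torn reads permissible under the memory model, although many current 64-bit JVMs happen not to exhibit them, so an experiment without volatile proves nothing either way.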
3.1.3. Locking and Visibility(锁和可见性)
Intrinsic locking can be used to guarantee that one thread sees the effects of another in a predictable manner, as illustrated by Figure 3.1. When thread A executes a synchronized block, and subsequently thread B enters a synchronized block guarded by the same lock, the values of variables that were visible to A prior to releasing the lock are guaranteed to be visible to B upon acquiring the lock. In other words, everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock. Without synchronization, there is no such guarantee.
内在的锁机制可以用来保证一个线程使用一种可以预期的方式看到另外一个线程的结果,这种方式在图3.1中有所体现。当线程A执行完同步代码块之后,接着线程B进入了被同一把锁所保护的同步代码块,当B获得锁的时候,在释放锁前对线程A可见的数据都被赋予给线程B。也就是说,线程A在同步锁被授予线程B过程中和之前的所有事情当线程B执行该同步代码块的时候都是可见的。如果没有同步机制,就不会有这样的保证。
Figure 3.1. Visibility Guarantees for Synchronization.
We can now give the other reason for the rule requiring all threads to synchronize on the same lock when accessing a shared mutable variable: to guarantee that values written by one thread are made visible to other threads. Otherwise, if a thread reads a variable without holding the appropriate lock, it might see a stale value.
当访问可变的共享变量时,所有线程使用同一把锁进行同步以保证其中一个线程对变量的更改对其他线程课件。现在我们有了另外一个使用这个规则的理由。否则,如果一个线程在没有使用恰当的锁的时候读取变量,他就将看到一个过期变量。
Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock.
锁机制并非不仅仅对于互斥操作有意义,它同样对于内存共享有意义。为了确保所有线程能够看到大多数共享可变变量的正确数值,读写线程都必须使用同一把锁进行同步。
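The guarantee described above can be sketched roughly as follows. The class LockVisibility, its fields x and y, and the method names are hypothetical, chosen for illustration only. Thread A writes y before entering a synchronized block and x inside it; thread B, once it has seen x == 1 while holding the same lock, is guaranteed to see y == 1 as well.

```java
// Hypothetical illustration of lock-based visibility (not from the original text).
final class LockVisibility {
    private final Object lock = new Object();
    private int y;   // written by thread A before its synchronized block
    private int x;   // written by thread A inside its synchronized block

    void threadA() {
        y = 1;                       // plain write, prior to the synchronized block
        synchronized (lock) {
            x = 1;
        }                            // releasing the lock publishes both writes
    }

    // Polls under the same lock until x == 1 is seen, then returns y.
    int threadB() {
        while (true) {
            synchronized (lock) {    // acquiring the lock that A released
                if (x == 1)
                    return y;        // everything A did before unlocking is visible
            }
            Thread.yield();
        }
    }

    // Runs threadA on a new thread and threadB on the calling thread.
    static int observe() {
        LockVisibility v = new LockVisibility();
        Thread a = new Thread(v::threadA);
        a.start();
        int seen = v.threadB();
        try {
            a.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

If threadB read x without acquiring the lock, nothing would guarantee that it ever sees x == 1, and even if it did, it could still see a stale y.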


3.1.4. Volatile Variables(volatile变量)
The Java language also provides an alternative, weaker form of synchronization, volatile variables, to ensure that updates to a variable are propagated predictably to other threads. When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
Java语言提供一种可选的,非标准的同步机制-volatile变量,来保证对某个变量的修改以可以预见的形式被其他线程获得。当一个变量被声明为volatile类型之后,编译器和运行时环境就会住注意到该变量是被共享的,这样这个变量之上的操作就不会与其他内存操作打乱时序。Volatile变量不会再寄存器中缓存也不会对其他处理器隐藏,因此读取一个volatile类型的变量将肯定会返回被线程修改后的最新值。
A good way to think about volatile variables is to imagine that they behave roughly like the SynchronizedInteger class in Listing 3.3, replacing reads and writes of the volatile variable with calls to get and set.[4] Yet accessing a volatile variable performs no locking and so cannot cause the executing thread to block, making volatile variables a lighter-weight synchronization mechanism than synchronized.[5]
[4] This analogy is not exact; the memory visibility effects of SynchronizedInteger are actually slightly stronger than those of volatile variables. See Chapter 16.
[5] Volatile reads are only slightly more expensive than nonvolatile reads on most current processor architectures.
当时volatile变量没有锁机制,因此不能将线程变成阻塞状态,这使得volatile变量成为同步机制的一种轻量级实现。
这种模拟其实并不精确,SynchronizedInteger类的内存可见效果实际上稍稍强于使用volatile变量,详情可以查看第十六章。
在大部分处理器架构下,Volatile方式的读取的时间效率只比非volatile形式的读取稍微第一点儿。
The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable. So from a memory visibility perspective, writing a volatile variable is like exiting a synchronized block and reading a volatile variable is like entering a synchronized block. However, we do not recommend relying too heavily on volatile variables for visibility; code that relies on volatile variables for visibility of arbitrary state is more fragile and harder to understand than code that uses locking.
Volatile变量的可见性效果超过了volatile变量本身。当线程A写入一个volatile类型的变量,而线程B接着去读取同一个变量的时候,在修改volatile变量之前对A所有可见的所有变量值在线程B读取到volatile变量值后都变成可见的。依赖于volatile变量来获取任意状态可见比使用锁机制更加更脆弱,也更加难以理解。
Use volatile variables only when they simplify implementing and verifying your synchronization policy; avoid using volatile variables when verifying correctness would require subtle reasoning about visibility. Good uses of volatile variables include ensuring the visibility of their own state, that of the object they refer to, or indicating that an important lifecycle event (such as initialization or shutdown) has occurred.
当代码逻辑的实现非常简单,或者验证你的同步策略的时候,才会用到volatile变量。如果代码的正确性验证需要考虑到可见性的时候,不要使用volatile变量。对于volatile变量的正确使用方式包括:确保volatile变量自身状态的可见性-也就是他们所在的对象的状态,或者指示一个重要的生命周期事件(例如初始化或者关闭)的发生。

Listing 3.4 illustrates a typical use of volatile variables: checking a status flag to determine when to exit a loop. In this example, our anthropomorphized thread is trying to get to sleep by the time-honored method of counting sheep. For this example to work, the asleep flag must be volatile. Otherwise, the thread might not notice when asleep has been set by another thread.[6] We could instead have used locking to ensure visibility of changes to asleep, but that would have made the code more cumbersome.
Listing3.4中是一个volatile变量的典型应用:通过检查一个flag状态决定什么时候跳出loop。在这个例子中,人格化的线程通过传统的数数的方法来进入睡眠。这个例子能够实现,asleep标识位必须是volatile的。否则,当asleep被别的线程修改的时候,人格化的线程可能并不会注意到这种修改。我们可以使用锁机制来确保asleep标识的可见性,但是这将会使得代码非常显得笨重。
[6] Debugging tip: For server applications, be sure to always specify the -server JVM command line switch when invoking the JVM, even for development and testing. The server JVM performs more optimization than the client JVM, such as hoisting variables out of a loop that are not modified in the loop; code that might appear to work in the development environment (client JVM) can break in the deployment environment (server JVM). For example, had we "forgotten" to declare the variable asleep as volatile in Listing 3.4, the server JVM could hoist the test out of the loop (turning it into an infinite loop), but the client JVM would not. An infinite loop that shows up in development is far less costly than one that only shows up in production.
调试提示:对于服务器应用来说,即使在开发和测试过程中,当需要激活JVM的时候,一定要确保指定-server命令的使用。Server端的JVM比客户端的JVM实现了更多的优化,例如提升在一个循环中没有被修改的变量,在开发环境中可用的代码可能会在部署环境中出错。例如我们可能会忘记向Listing3.4中那样忘记声明volatile变量。Server模式的JVM将可能把检查从循环中提取出来(变成一个无限循环),但是client模式的JVM不会这样做。出现开发中的无限循环所带来的代价远远低于在产品中出现。
Listing 3.4. Counting Sheep.
volatile boolean asleep;
...
    while (!asleep)
        countSomeSheep();
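A self-contained version of the counting-sheep idea might look like the sketch below. The class SheepCounter, the method fallAsleep, and the helper stopsAfterFlagIsSet are hypothetical names; the loop body simply increments a counter as a stand-in for countSomeSheep(), whose real implementation the listing does not show.

```java
// Hypothetical runnable variant of the counting-sheep loop (not from the original text).
final class SheepCounter implements Runnable {
    private volatile boolean asleep;   // status flag set by another thread
    private long sheepCounted;         // touched only by the counting thread while it runs

    @Override
    public void run() {
        while (!asleep)
            sheepCounted++;            // stand-in for countSomeSheep()
    }

    // Called from a different thread to end the loop.
    void fallAsleep() { asleep = true; }

    // Starts the counting thread, sets the flag, and reports whether the
    // thread terminated within five seconds.
    static boolean stopsAfterFlagIsSet() {
        SheepCounter counter = new SheepCounter();
        Thread sleeper = new Thread(counter);
        sleeper.start();
        counter.fallAsleep();
        try {
            sleeper.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !sleeper.isAlive();
    }
}
```

Because asleep is volatile, the loop condition is re-read on every iteration and the write made by fallAsleep becomes visible to the counting thread, so the loop ends.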

Volatile variables are convenient, but they have limitations. The most common use for volatile variables is as a completion, interruption, or status flag, such as the asleep flag in Listing 3.4. Volatile variables can be used for other kinds of state information, but more care is required when attempting this. For example, the semantics of volatile are not strong enough to make the increment operation (count++) atomic, unless you can guarantee that the variable is written only from a single thread. (Atomic variables do provide atomic read-modify-write support and can often be used as "better volatile variables"; see Chapter 15.)
Voltile类型的变量既有其方便之处,也有其使用限制。Volatile变量通常会用来作为竞争、中断、状态标记,比如Listing3.4中的用法。当Volatile变量用于其他类型的状态信息时,就需要格外小心。例如,在不能保证递增操作由单线程执行的情况下,Volatile的语义学定义不足以保证递增操作的原子性。能够提供“read-modify-write”操作的原子性变量可原子变量可以被当做“better volatile variables”使用,见第十五章。
Locking can guarantee both visibility and atomicity; volatile variables can only guarantee visibility.
锁机制可以同时保证可见性和原子性,volatile变量只能够保证可见性。
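To illustrate the atomicity gap, here is a minimal sketch of a counter built on java.util.concurrent.atomic.AtomicInteger, which provides the atomic read-modify-write that a volatile int with count++ lacks. The class HitCounter and the helper countConcurrently are hypothetical names introduced for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical hit counter using an atomic variable (not from the original text).
final class HitCounter {
    private final AtomicInteger hits = new AtomicInteger();

    // incrementAndGet() is a single atomic read-modify-write operation.
    void hit() { hits.incrementAndGet(); }

    int total() { return hits.get(); }

    // Increments one shared counter from several threads and returns the total.
    static int countConcurrently(int threads, int hitsPerThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++)
                    counter.hit();
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers)
                worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.total();
    }
}
```

With a volatile int and count++ in place of the AtomicInteger, two threads could read the same value and both write back value + 1, losing an update; the total would then be at most, but not reliably, threads * hitsPerThread.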

You can use volatile variables only when all the following criteria are met:
• Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
• The variable does not participate in invariants with other state variables; and
• Locking is not required for any other reason while the variable is being accessed.
只有在下列标准都遵循的情况下,才可以使用volatile类型的变量。
• 对变量的修改并不依赖于变量的当前值,或者你能够保证只有一个线程可以修改该变量值。
• 该变量不会与其他状态一起参与不变性的维护。
• 在变量被访问的时候,没有其他对锁机制的需求。
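As a rough illustration of a field that appears to satisfy all three criteria, consider the sketch below. The class LatestReading and its methods are hypothetical. Each write simply overwrites the previous value, the field takes part in no invariant with other state, and no lock is needed for any other reason.

```java
// Hypothetical single-value holder that fits the volatile criteria (not from the original text).
final class LatestReading {
    // Criterion 1: record() overwrites the value without reading the current one.
    // Criterion 2: no other field is tied to this one by an invariant.
    // Criterion 3: no access needs a lock for any other reason.
    private volatile double celsius;

    // Typically called by a sensor thread.
    void record(double newCelsius) { celsius = newCelsius; }

    // May be called by any thread; returns the most recently recorded value.
    double latest() { return celsius; }
}
```

If the class later had to maintain, say, a running average alongside the latest value, the second criterion would no longer hold and locking would be the safer choice.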
