2.2. Atomicity

What happens when we add one element of state to what was a stateless object? Suppose we want to add a "hit counter" that measures the number of requests processed. The obvious approach is to add a long field to the servlet and increment it on each request, as shown in UnsafeCountingFactorizer in Listing 2.2.
Listing 2.2. Servlet that Counts Requests without the Necessary Synchronization. Don't Do this.

@NotThreadSafe
public class UnsafeCountingFactorizer implements Servlet {
    private long count = 0;

    public long getCount() { return count; }

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        ++count;  // not atomic: read the value, add one, write it back
        encodeIntoResponse(resp, factors);
    }

}

Unfortunately, UnsafeCountingFactorizer is not thread-safe, even though it would work just fine in a single-threaded environment. Just like UnsafeSequence on page 6, it is susceptible to lost updates. While the increment operation, ++count, may look like a single action because of its compact syntax, it is not atomic, which means that it does not execute as a single, indivisible operation. Instead, it is a shorthand for a sequence of three discrete operations: fetch the current value, add one to it, and write the new value back. This is an example of a read-modify-write operation, in which the resulting state is derived from the previous state.
Figure 1.1 on page 6 shows what can happen if two threads try to increment a counter simultaneously without synchronization. If the counter is initially 9, with some unlucky timing each thread could read the value, see that it is 9, add one to it, and each set the counter to 10. This is clearly not what is supposed to happen; an increment got lost along the way, and the hit counter is now permanently off by one.
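The unlucky schedule described above can be replayed by hand. The following is only a sketch: LostUpdateReplay and its read, addOne, and write helpers are hypothetical names that do not appear in the original listings, and the two "threads" are simulated on a single thread so that the bad interleaving happens on every run.

```java
// Hypothetical demo class; not part of the original listings.
class LostUpdateReplay {
    private long count = 9;   // shared counter, initially 9 as in the text

    // The three steps hidden inside ++count, split apart so they can be interleaved by hand.
    long read() { return count; }
    static long addOne(long value) { return value + 1; }
    void write(long value) { count = value; }

    // Replays the unlucky schedule: both "threads" read before either one writes.
    static long replayUnluckyInterleaving() {
        LostUpdateReplay counter = new LostUpdateReplay();
        long seenByA = counter.read();      // A reads 9
        long seenByB = counter.read();      // B reads 9
        counter.write(addOne(seenByA));     // A writes 10
        counter.write(addOne(seenByB));     // B writes 10, overwriting A's update
        return counter.read();
    }

    public static void main(String[] args) {
        System.out.println(replayUnluckyInterleaving()); // prints 10, not 11
    }
}
```

Two increments ran, but the counter advanced by only one. With real threads the same thing happens only occasionally, which is what makes the bug hard to reproduce.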
You might think that having a slightly inaccurate count of hits in a web-based service is an acceptable loss of accuracy, and sometimes it is. But if the counter is being used to generate sequences or unique object identifiers, returning the same value from multiple invocations could cause serious data integrity problems.[3] The possibility of incorrect results in the presence of unlucky timing is so important in concurrent programming that it has a name: a race condition.
[3] The approach taken by UnsafeSequence and UnsafeCountingFactorizer has other serious problems, including the possibility of stale data (Section 3.1.1).
2.2.1. Race Conditions
UnsafeCountingFactorizer has several race conditions that make its results unreliable. A race condition occurs when the correctness of a computation depends on the relative timing or interleaving of multiple threads by the runtime; in other words, when getting the right answer relies on lucky timing. [4] The most common type of race condition is check-then-act, where a potentially stale observation is used to make a decision on what to do next.
[4] The term race condition is often confused with the related term data race, which arises when synchronization is not used to coordinate all access to a shared non-final field. You risk a data race whenever a thread writes a variable that might next be read by another thread or reads a variable that might have last been written by another thread if both threads do not use synchronization; code with data races has no useful defined semantics under the Java Memory Model. Not all race conditions are data races, and not all data races are race conditions, but they both can cause concurrent programs to fail in unpredictable ways. UnsafeCountingFactorizer has both race conditions and data races. See Chapter 16 for more on data races.
We often encounter race conditions in real life. Let's say you planned to meet a friend at noon at the Starbucks on University Avenue. But when you get there, you realize there are two Starbucks on University Avenue, and you're not sure which one you agreed to meet at. At 12:10, you don't see your friend at Starbucks A, so you walk over to Starbucks B to see if he's there, but he isn't there either. There are a few possibilities: your friend is late and not at either Starbucks; your friend arrived at Starbucks A after you left; or your friend was at Starbucks B, but went to look for you, and is now en route to Starbucks A. Let's assume the worst and say it was the last possibility. Now it's 12:15, you've both been to both Starbucks, and you're both wondering if you've been stood up. What do you do now? Go back to the other Starbucks? How many times are you going to go back and forth? Unless you have agreed on a protocol, you could both spend the day walking up and down University Avenue, frustrated and undercaffeinated.
The problem with the "I'll just nip up the street and see if he's at the other one" approach is that while you're walking up the street, your friend might have moved. You look around Starbucks A, observe "he's not here", and go looking for him. And you can do the same for Starbucks B, but not at the same time. It takes a few minutes to walk up the street, and during those few minutes, the state of the system may have changed.
The Starbucks example illustrates a race condition because reaching the desired outcome (meeting your friend) depends on the relative timing of events (when each of you arrives at one Starbucks or the other, how long you wait there before switching, etc.). The observation that he is not at Starbucks A becomes potentially invalid as soon as you walk out the front door; he could have come in through the back door and you wouldn't know. It is this invalidation of observations that characterizes most race conditions: using a potentially stale observation to make a decision or perform a computation. This type of race condition is called check-then-act: you observe something to be true (file X doesn't exist) and then take action based on that observation (create X); but in fact the observation could have become invalid between the time you observed it and the time you acted on it (someone else created X in the meantime), causing a problem (unexpected exception, overwritten data, file corruption).
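The "file X" case can be written down directly. The sketch below is an illustration and not code from the original text: the class CreateIfAbsent and its method names are hypothetical. It relies on java.nio.file.Files.createFile, whose documentation states that the existence check and the creation are performed as a single atomic operation, failing with FileAlreadyExistsException if the file is already there.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original listings.
class CreateIfAbsent {
    // Racy check-then-act: someone else can create the file between the check and the act,
    // in which case createFile throws an exception this method did not plan for.
    static boolean racyCreate(Path file) throws IOException {
        if (!Files.exists(file)) {       // check (may be stale by the next line)
            Files.createFile(file);      // act
            return true;
        }
        return false;
    }

    // Check and act collapsed into one call: createFile itself reports "already exists".
    static boolean atomicCreate(Path file) throws IOException {
        try {
            Files.createFile(file);
            return true;
        } catch (FileAlreadyExistsException alreadyThere) {
            return false;
        }
    }

    // Creates X twice in a fresh temporary directory and reports what each call returned.
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("check-then-act");
            Path x = dir.resolve("X");
            boolean first = atomicCreate(x);
            boolean second = atomicCreate(x);
            Files.delete(x);
            Files.delete(dir);
            return new boolean[] { first, second };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In demo, the first call creates X and the second call reports that X already exists instead of failing unexpectedly. No caller ever acts on an observation that might have gone stale.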
2.2.2. Example: Race Conditions in Lazy Initialization
A common idiom that uses check-then-act is lazy initialization. The goal of lazy initialization is to defer initializing an object until it is actually needed while at the same time ensuring that it is initialized only once. LazyInitRace in Listing 2.3 illustrates the lazy initialization idiom. The getInstance method first checks whether the ExpensiveObject has already been initialized, in which case it returns the existing instance; otherwise it creates a new instance and returns it after retaining a reference to it so that future invocations can avoid the more expensive code path.
Listing 2.3. Race Condition in Lazy Initialization. Don't Do this.

@NotThreadSafe
public class LazyInitRace {
    private ExpensiveObject instance = null;

    public ExpensiveObject getInstance() {
        if (instance == null)                  // check: may already be stale
            instance = new ExpensiveObject();  // act: two threads can both reach this line
        return instance;
    }
}

LazyInitRace has race conditions that can undermine its correctness. Say that threads A and B execute getInstance at the same time. A sees that instance is null, and instantiates a new ExpensiveObject. B also checks if instance is null. Whether instance is null at this point depends unpredictably on timing, including the vagaries of scheduling and how long A takes to instantiate the ExpensiveObject and set the instance field. If instance is null when B examines it, the two callers to getInstance may receive two different results, even though getInstance is always supposed to return the same instance.
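The unlucky timing can be forced rather than waited for. The following sketch is an assumption-laden illustration, not code from the original text: LazyInitRaceDemo, the construction counter, and the CyclicBarrier inside the constructor are all additions made here. The barrier keeps each thread inside the constructor until the other has arrived, so on a typical JVM both threads pass the null check before either one assigns the field.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the getInstance body mirrors Listing 2.3.
class LazyInitRaceDemo {
    static final AtomicInteger constructions = new AtomicInteger();
    static final CyclicBarrier bothConstructing = new CyclicBarrier(2);

    static class ExpensiveObject {
        ExpensiveObject() {
            constructions.incrementAndGet();
            try {
                // Wait until the other thread is also inside the constructor.
                bothConstructing.await(10, TimeUnit.SECONDS);
            } catch (Exception e) {
                throw new IllegalStateException("forced interleaving did not happen", e);
            }
        }
    }

    private ExpensiveObject instance = null;

    ExpensiveObject getInstance() {
        if (instance == null)
            instance = new ExpensiveObject();
        return instance;
    }

    // Runs getInstance on two threads and returns how many objects were constructed.
    static int constructionsUnderForcedRace() {
        constructions.set(0);
        LazyInitRaceDemo lazy = new LazyInitRaceDemo();
        Thread a = new Thread(lazy::getInstance);
        Thread b = new Thread(lazy::getInstance);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return constructions.get();
    }
}
```

The object that was supposed to be initialized only once is constructed twice. Which of the two instances a given caller ends up holding still depends on timing.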
The hit-counting operation in UnsafeCountingFactorizer has another sort of race condition. Read-modify-write operations, like incrementing a counter, define a transformation of an object's state in terms of its previous state. To increment a counter, you have to know its previous value and make sure no one else changes or uses that value while you are in mid-update.
Like most concurrency errors, race conditions don't always result in failure: some unlucky timing is also required. But race conditions can cause serious problems. If LazyInitRace is used to instantiate an application-wide registry, having it return different instances from multiple invocations could cause registrations to be lost or multiple activities to have inconsistent views of the set of registered objects. If UnsafeSequence is used to generate entity identifiers in a persistence framework, two distinct objects could end up with the same ID, violating identity integrity constraints.
2.2.3. Compound Actions
Both LazyInitRace and UnsafeCountingFactorizer contained a sequence of operations that needed to be atomic, or indivisible, relative to other operations on the same state. To avoid race conditions, there must be a way to prevent other threads from using a variable while we're in the middle of modifying it, so we can ensure that other threads can observe or modify the state only before we start or after we finish, but not in the middle.
Operations A and B are atomic with respect to each other if, from the perspective of a thread executing A, when another thread executes B, either all of B has executed or none of it has. An atomic operation is one that is atomic with respect to all operations, including itself, that operate on the same state.

If the increment operation in UnsafeSequence were atomic, the race condition illustrated in Figure 1.1 on page 6 could not occur, and each execution of the increment operation would have the desired effect of incrementing the counter by exactly one. To ensure thread safety, check-then-act operations (like lazy initialization) and read-modify-write operations (like increment) must always be atomic. We refer collectively to check-then-act and read-modify-write sequences as compound actions: sequences of operations that must be executed atomically in order to remain thread-safe. In the next section, we'll consider locking, Java's built-in mechanism for ensuring atomicity. For now, we're going to fix the problem another way, by using an existing thread-safe class, as shown in CountingFactorizer in Listing 2.4.
Listing 2.4. Servlet that Counts Requests Using AtomicLong.
@ThreadSafe
public class CountingFactorizer implements Servlet {
    private final AtomicLong count = new AtomicLong(0);

    public long getCount() { return count.get(); }

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = factor(i);
        count.incrementAndGet();  // atomic read-modify-write; the returned value is ignored here
        encodeIntoResponse(resp, factors);
    }
}

The java.util.concurrent.atomic package contains atomic variable classes for effecting atomic state transitions on numbers and object references. By replacing the long counter with an AtomicLong, we ensure that all actions that access the counter state are atomic. [5] Because the state of the servlet is the state of the counter and the counter is thread-safe, our servlet is once again thread-safe.
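The difference between the two counters can be observed with a small stress run. This is a sketch under stated assumptions: CounterStress, its fields, and the thread and iteration counts are hypothetical and chosen only for illustration. The plain long may lose updates, and how many it loses varies from run to run, so no particular value is promised for it. The AtomicLong should always end at exactly the number of increments performed.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; not part of the original listings.
class CounterStress {
    static long unsafeCount = 0;                              // incremented with ++, no synchronization
    static final AtomicLong safeCount = new AtomicLong(0);    // incremented atomically

    // Returns { final value of the plain long, final value of the AtomicLong }.
    static long[] run(int threads, int incrementsPerThread) {
        unsafeCount = 0;
        safeCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    unsafeCount++;                 // racy read-modify-write
                    safeCount.incrementAndGet();   // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { unsafeCount, safeCount.get() };
    }

    public static void main(String[] args) {
        long[] result = run(4, 100_000);
        System.out.println("plain long: " + result[0] + ", AtomicLong: " + result[1]);
    }
}
```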
[5] CountingFactorizer calls incrementAndGet to increment the counter, which also returns the incremented value; in this case the return value is ignored.
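The return value that CountingFactorizer ignores is exactly what a sequence or identifier generator needs. The sketch below is illustrative only: IdSequence and countDistinctIds are hypothetical names, and this is not the UnsafeSequence fix given elsewhere in the book. Because incrementAndGet increments and reads in one atomic step, no two callers can receive the same value.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; not part of the original listings.
class IdSequence {
    private final AtomicLong last = new AtomicLong(0);

    // Increment and read happen as one atomic step, so every caller gets a distinct value.
    long next() {
        return last.incrementAndGet();
    }

    // Draws ids from several threads at once and returns how many distinct ids were seen.
    static int countDistinctIds(int threads, int idsPerThread) {
        IdSequence sequence = new IdSequence();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    seen.add(sequence.next());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

If any identifier had been handed out twice, the set would end up smaller than the number of calls.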
We were able to add a counter to our factoring servlet and maintain thread safety by using an existing thread-safe class to manage the counter state, AtomicLong. When a single element of state is added to a stateless class, the resulting class will be thread-safe if the state is entirely managed by a thread-safe object. But, as we'll see in the next section, going from one state variable to more than one is not necessarily as simple as going from zero to one.
Where practical, use existing thread-safe objects, like AtomicLong, to manage your class's state. It is simpler to reason about the possible states and state transitions for existing thread-safe objects than it is for arbitrary state variables, and this makes it easier to maintain and verify thread safety.

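The same advice can be applied to the lazy initialization example by letting an AtomicReference manage the single field. This is a sketch of one possible approach, not the fix the text itself presents: CasLazyInit and distinctInstancesSeen are hypothetical names. It also carries a caveat: compareAndSet guarantees that every caller receives the same instance, but a losing thread may still construct a candidate that is then discarded, so it does not guarantee that the constructor runs only once.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; not part of the original listings.
class CasLazyInit {
    static class ExpensiveObject { }

    private final AtomicReference<ExpensiveObject> instance = new AtomicReference<>();

    ExpensiveObject getInstance() {
        ExpensiveObject current = instance.get();
        if (current == null) {
            ExpensiveObject candidate = new ExpensiveObject();
            // Atomic check-then-act: install the candidate only if the field is still null.
            if (instance.compareAndSet(null, candidate)) {
                current = candidate;
            } else {
                current = instance.get();   // another thread won; the candidate is discarded
            }
        }
        return current;
    }

    // Calls getInstance from several threads and returns how many distinct objects came back.
    static int distinctInstancesSeen(int threads) {
        CasLazyInit lazy = new CasLazyInit();
        Set<ExpensiveObject> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> seen.add(lazy.getInstance()));
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

Whether a discarded candidate is acceptable depends on how expensive or side-effecting construction is. When construction must happen exactly once, a mechanism such as locking is needed.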
