Chapter 4. Composing Objects 组合对象
So far, we've covered the low-level basics of thread safety and synchronization. But we don't want to have to analyze each memory access to ensure that our program is thread-safe; we want to be able to take thread-safe components and safely compose them into larger components or programs. This chapter covers patterns for structuring classes that can make it easier to make them thread-safe and to maintain them without accidentally undermining their safety guarantees.
简单而言,我们希望通过使用线程安全组件,并且安全的组合他们成为更大的组件或者程序,而不需要我们去通过分析每次内存访问来确定我们的程序是线程安全的。
4.1. Designing a Thread-safe Class 设计一个线程安全的类
While it is possible to write a thread-safe program that stores all its state in public static fields, it is a lot harder to verify its thread safety or to modify it so that it remains thread-safe than one that uses encapsulation appropriately. Encapsulation makes it possible to determine that a class is thread-safe without having to examine the entire program.
虽然写一个将自己所有状态保存在公开静态变量的线程安全的程序是可能的,比使用经过适当封装的程序更难得是验证其线程安全性或者修改它并保证其线程安全
The design process for a thread-safe class should include these three basic elements:
Identify the variables that form the object's state;
Identify the invariants that constrain the state variables;
Establish a policy for managing concurrent access to the object's state.
An object's state starts with its fields. If they are all of primitive type, the fields comprise the entire state. Counter in Listing 4.1 has only one field, so the value field comprises its entire state. The state of an object with n primitive fields is just the n-tuple of its field values; the state of a 2D Point is its (x, y) value. If the object has fields that are references to other objects, its state will encompass fields from the referenced objects as well. For example, the state of a LinkedList includes the state of all the link node objects belonging to the list.
The synchronization policy defines how an object coordinates access to its state without violating its invariants or postconditions. It specifies what combination of immutability, thread confinement, and locking is used to maintain thread safety, and which variables are guarded by which locks. To ensure that the class can be analyzed and maintained, document the synchronization policy.
Listing 4.1. Simple Thread-safe Counter Using the Java Monitor Pattern.
@ThreadSafe
public final class Counter {
@GuardedBy("this") private long value = 0;
public synchronized long getValue() {
return value;
}
public synchronized long increment() {
if (value == Long.MAX_VALUE)
throw new IllegalStateException("counter overflow");
return ++value;
}
}
4.1.1. Gathering Synchronization Requirements 收集同步需求
Making a class thread-safe means ensuring that its invariants hold under concurrent access; this requires reasoning about its state. Objects and variables have a state space: the range of possible states they can take on. The smaller this state space, the easier it is to reason about. By using final fields wherever practical, you make it simpler to analyze the possible states an object can be in. (In the extreme case, immutable objects can only be in a single state.)
使一个类的线程安全意味着确保其不变量的并发访问下举行,这需要有关其状态的推理。对象和变量状态空间:他们可以承担可能的状态的范围。这种状态空间越小,越容易的 推理是什么。通过使用final字段地方实际,你能更简单的分析对象可能的状态(在极端的情况下,一成不变的对象只能是在一个单一的状态。)
Many classes have invariants that identify certain states as valid or invalid. The value field in Counter is a long. The state space of a long ranges from Long.MIN_VALUE to Long.MAX_VALUE, but Counter places constraints on value; negative values are not allowed.
Similarly, operations may have postconditions that identify certain state transitions as invalid. If the current state of a Counter is 17, the only valid next state is 18. When the next state is derived from the current state, the operation is necessarily a compound action. Not all operations impose state transition constraints; when updating a variable that holds the current temperature, its previous state does not affect the computation.
相似的,操作可以有验证某个状态变化无效的后置条件。
Constraints placed on states or state transitions by invariants and postconditions create additional synchronization or encapsulation requirements. If certain states are invalid, then the underlying state variables must be encapsulated, otherwise client code could put the object into an invalid state. If an operation has invalid state transitions, it must be made atomic. On the other hand, if the class does not impose any such constraints, we may be able to relax encapsulation or serialization requirements to obtain greater flexibility or better performance.
状态或状态转换不变和后置条件显示产生了额外同步或封装需求。如果某些状态无效,那么基本状态变脸必须被封装,否则客户端代码可能导致对象编程一个无效状态。如果一个操作存在无效状态转换,那么它必须被作为原子操作。另一方面,如果某个类没有强加任何这些限制,我们可以放松封装或序列化要求来实现更高的灵活性和更好的性能。
A class can also have invariants that constrain multiple state variables. A number range class, like NumberRange in Listing 4.10, typically maintains state variables for the lower and upper bounds of the range. These variables must obey the constraint that the lower bound be less than or equal to the upper bound. Multivariable invariants like this one create atomicity requirements: related variables must be fetched or updated in a single atomic operation. You cannot update one, release and reacquire the lock, and then update the others, since this could involve leaving the object in an invalid state when the lock was released. When multiple variables participate in an invariant, the lock that guards them must be held for the duration of any operation that accesses the related variables.
You cannot ensure thread safety without understanding an object's invariants and postconditions. Constraints on the valid values or state transitions for state variables can create atomicity and encapsulation requirements.
4.1.2. State-dependent Operations
Class invariants and method postconditions constrain the valid states and state transitions for an object. Some objects also have methods with state-based preconditions. For example, you cannot remove an item from an empty queue; a queue must be in the "nonempty" state before you can remove an element. Operations with state-based preconditions are called state-dependent [CPJ 3].
类变量和方法后置条件为对象的有效状态和状态转换做限制。有些对象也有基于状态的先决条件的方法。例如,您不能删除一个空队列的项目;一个队列必须在“非空”状态之前,您可以删除一个元素。基于状态的操作被称为状态依赖。
In a single-threaded program, if a precondition does not hold, the operation has no choice but to fail. But in a concurrent program, the precondition may become true later due to the action of another thread. Concurrent programs add the possibility of waiting until the precondition becomes true, and then proceeding with the operation.
在单线程程序,如果一个前提条件不成立,该操作没有任何选择只能失败。但在并发程序,由于另一个线程的行动前提可能成立。并发程序添加的前提等待,直到变成真正的可能性,然后操作继续进行。
The built-in mechanisms for efficiently waiting for a condition to become truewait and notifyare tightly bound to intrinsic locking, and can be difficult to use correctly. To create operations that wait for a precondition to become true before proceeding, it is often easier to use existing library classes, such as blocking queues or semaphores, to provide the desired state-dependent behavior. Blocking library classes such as BlockingQueue, Semaphore, and other synchronizers are covered in Chapter 5; creating state-dependent classes using the low-level mechanisms provided by the platform and class library is covered in Chapter 14.
4.1.3. State Ownership
We implied in Section 4.1 that an object's state could be a subset of the fields in the object graph rooted at that object. Why might it be a subset? Under what conditions are fields reachable from a given object not part of that object's state?
When defining which variables form an object's state, we want to consider only the data that object owns. Ownership is not embodied explicitly in the language, but is instead an element of class design. If you allocate and populate a HashMap, you are creating multiple objects: the HashMap object, a number of Map.Entry objects used by the implementation of HashMap, and perhaps other internal objects as well. The logical state of a HashMap includes the state of all its Map.Entry and internal objects, even though they are implemented as separate objects.
For better or worse, garbage collection lets us avoid thinking carefully about ownership. When passing an object to a method in C++, you have to think fairly carefully about whether you are transferring ownership, engaging in a short-term loan, or envisioning long-term joint ownership. In Java, all these same ownership models are possible, but the garbage collector reduces the cost of many of the common errors in reference sharing, enabling less-than-precise thinking about ownership.
In many cases, ownership and encapsulation go togetherthe object encapsulates the state it owns and owns the state it encapsulates. It is the owner of a given state variable that gets to decide on the locking protocol used to maintain the integrity of that variable's state. Ownership implies control, but once you publish a reference to a mutable object, you no longer have exclusive control; at best, you might have "shared ownership". A class usually does not own the objects passed to its methods or constructors, unless the method is designed to explicitly transfer ownership of objects passed in (such as the synchronized collection wrapper factory methods).
Collection classes often exhibit a form of "split ownership", in which the collection owns the state of the collection infrastructure, but client code owns the objects stored in the collection. An example is ServletContext from the servlet framework. ServletContext provides a Map-like object container service to servlets where they can register and retrieve application objects by name with setAttribute and getAttribute. The ServletContext object implemented by the servlet container must be thread-safe, because it will necessarily be accessed by multiple threads. Servlets need not use synchronization when calling set-Attribute and getAttribute, but they may have to use synchronization when using the objects stored in the ServletContext. These objects are owned by the application; they are being stored for safekeeping by the servlet container on the application's behalf. Like all shared objects, they must be shared safely; in order to prevent interference from multiple threads accessing the same object concurrently, they should either be thread-safe, effectively immutable, or explicitly guarded by a lock.[1]
[1] Interestingly, the HttpSession object, which performs a similar function in the servlet framework, may have stricter requirements. Because the servlet container may access the objects in the HttpSession so they can be serialized for replication or passivation, they must be thread-safe because the container will be accessing them as well as the web application. (We say "may have" since replication and passivation is outside of the servlet specification but is a common feature of servlet containers.)
4.2. Instance Confinement实例约束
If an object is not thread-safe, several techniques can still let it be used safely in a multithreaded program. You can ensure that it is only accessed from a single thread (thread confinement), or that all access to it is properly guarded by a lock.
多线程程序中,可以通过保证同时只有一个线程访问某个非线程安全对象或者使用对象锁的机制来保护非线程安全对象。
Encapsulation simplifies making classes thread-safe by promoting instance confinement, often just called confinement [CPJ 2.3.3]. When an object is encapsulated within another object, all code paths that have access to the encapsulated object are known and can be therefore be analyzed more easily than if that object were accessible to the entire program. Combining confinement with an appropriate locking discipline can ensure that otherwise non-thread-safe objects are used in a thread-safe manner.
Encapsulating data within an object confines access to the data to the object's methods, making it easier to ensure that the data is always accessed with the appropriate lock held.
将数据封装到一个对象约束对数据进行访问要通过这个对象,使其更容易地确保数据始终以适当持有的锁访问。
Confined objects must not escape their intended scope. An object may be confined to a class instance (such as a private class member), a lexical scope (such as a local variable), or a thread (such as an object that is passed from method to method within a thread, but not supposed to be shared across threads). Objects don't escape on their own, of coursethey need help from the developer, who assists by publishing the object beyond its intended scope.
简言之,约束对象不要从他们的预期作用域逃逸,需要开发者将他们在预期的作用域公开。即这需要开发者有较高的水平。
PersonSet in Listing 4.2 illustrates how confinement and locking can work together to make a class thread-safe even when its component state variables are not. The state of PersonSet is managed by a HashSet, which is not thread-safe. But because mySet is private and not allowed to escape, the HashSet is confined to the PersonSet. The only code paths that can access mySet are addPerson and containsPerson, and each of these acquires the lock on the PersonSet. All its state is guarded by its intrinsic lock, making PersonSet thread-safe.
Listing 4.2. Using Confinement to Ensure Thread Safety.
@ThreadSafe
public class PersonSet {
@GuardedBy("this")
private final Set<Person> mySet = new HashSet<Person>();
public synchronized void addPerson(Person p) {
mySet.add(p);
}
public synchronized boolean containsPerson(Person p) {
return mySet.contains(p);
}
}
This example makes no assumptions about the thread-safety of Person, but if it is mutable, additional synchronization will be needed when accessing a Person retrieved from a PersonSet. The most reliable way to do this would be to make Person tHRead-safe; less reliable would be to guard the Person objects with a lock and ensure that all clients follow the protocol of acquiring the appropriate lock before accessing the Person.
Instance confinement is one of the easiest ways to build thread-safe classes. It also allows flexibility in the choice of locking strategy; PersonSet happened to use its own intrinsic lock to guard its state, but any lock, consistently used, would do just as well. Instance confinement also allows different state variables to be guarded by different locks. (For an example of a class that uses multiple lock objects to guard its state, see ServerStatus on 236.)
There are many examples of confinement in the platform class libraries, including some classes that exist solely to turn non-thread-safe classes into threadsafe ones.
The basic collection classes such as ArrayList and HashMap are not thread-safe, but the class library provides wrapper factory methods (Collections.synchronizedList and friends) so they can be used safely in multithreaded environments. These factories use the Decorator pattern (Gamma et al., 1995) to wrap the collection with a synchronized wrapper object; the wrapper implements each method of the appropriate interface as a synchronized method that forwards the request to the underlying collection object. So long as the wrapper object holds the only reachable reference to the underlying collection (i.e., the underlying collection is confined to the wrapper), the wrapper object is then thread-safe. The Javadoc for these methods warns that all access to the underlying collection must be made through the wrapper.
Of course, it is still possible to violate confinement by publishing a supposedly confined object; if an object is intended to be confined to a specific scope, then letting it escape from that scope is a bug. Confined objects can also escape by publishing other objects such as iterators or inner class instances that may indirectly publish the confined objects.
当然,通过公开被约束了的对象来扰乱对象约束。如果准备将一个对象约束在一个特定范围,那么尝试使其从那个范围域逃逸是一个程序BUG。通不过公开其他对象可以使被约束的对象逃逸,比如迭代器或者内部类对象实例可能间接地公开被约束隔离的对象,
Confinement makes it easier to build thread-safe classes because a class that confines its state can be analyzed for thread safety without having to examine the whole program.
4.2.1. The Java Monitor Pattern监听者模式
Following the principle of instance confinement to its logical conclusion leads you to the Java monitor pattern.[2] An object following the Java monitor pattern encapsulates all its mutable state and guards it with the object's own intrinsic lock.
[2] The Java monitor pattern is inspired by Hoare's work on monitors (Hoare, 1974), though there are significant differences between this pattern and a true monitor. The bytecode instructions for entering and exiting a synchronized block are even called monitorenter and monitorexit, and Java's built-in (intrinsic) locks are sometimes called monitor locks or monitors.
Counter in Listing 4.1 shows a typical example of this pattern. It encapsulates one state variable, value, and all access to that state variable is through the methods of Counter, which are all synchronized.
The Java monitor pattern is used by many library classes, such as Vector and Hashtable. Sometimes a more sophisticated synchronization policy is needed; Chapter 11 shows how to improve scalability through finer-grained locking strategies. The primary advantage of the Java monitor pattern is its simplicity.
The Java monitor pattern is merely a convention; any lock object could be used to guard an object's state so long as it is used consistently. Listing 4.3 illustrates a class that uses a private lock to guard its state.
Listing 4.3. Guarding State with a Private Lock.
public class PrivateLock {
private final Object myLock = new Object();
@GuardedBy("myLock") Widget widget;
void someMethod() {
synchronized(myLock) {
// Access or modify the state of widget
}
}
}
There are advantages to using a private lock object instead of an object's intrinsic lock (or any other publicly accessible lock). Making the lock object private encapsulates the lock so that client code cannot acquire it, whereas a publicly accessible lock allows client code to participate in its synchronization policycorrectly or incorrectly. Clients that improperly acquire another object's lock could cause liveness problems, and verifying that a publicly accessible lock is properly used requires examining the entire program rather than a single class.
4.2.2. Example: Tracking Fleet Vehicles
Counter in Listing 4.1 is a concise, but trivial, example of the Java monitor pattern. Let's build a slightly less trivial example: a "vehicle tracker" for dispatching fleet vehicles such as taxicabs, police cars, or delivery trucks. We'll build it first using the monitor pattern, and then see how to relax some of the encapsulation requirements while retaining thread safety.
Each vehicle is identified by a String and has a location represented by (x, y) coordinates. The VehicleTracker classes encapsulate the identity and locations of the known vehicles, making them well-suited as a data model in a modelview-controller GUI application where it might be shared by a view thread and multiple updater threads. The view thread would fetch the names and locations of the vehicles and render them on a display:
Map<String, Point> locations = vehicles.getLocations();
for (String key : locations.keySet())
renderVehicle(key, locations.get(key));
Similarly, the updater threads would modify vehicle locations with data received from GPS devices or entered manually by a dispatcher through a GUI interface:
void vehicleMoved(VehicleMovedEvent evt) {
Point loc = evt.getNewLocation();
vehicles.setLocation(evt.getVehicleId(), loc.x, loc.y);
}
Since the view thread and the updater threads will access the data model concurrently, it must be thread-safe. Listing 4.4 shows an implementation of the vehicle tracker using the Java monitor pattern that uses MutablePoint in Listing 4.5 for representing the vehicle locations.
Even though MutablePoint is not thread-safe, the tracker class is. Neither the map nor any of the mutable points it contains is ever published. When we need to a return vehicle locations to callers, the appropriate values are copied using either the MutablePoint copy constructor or deepCopy, which creates a new Map whose values are copies of the keys and values from the old Map.[3]
[3] Note that deepCopy can't just wrap the Map with an unmodifiableMap, because that protects only the collection from modification; it does not prevent callers from modifying the mutable objects stored in it. For the same reason, populating the HashMap in deepCopy via a copy constructor wouldn't work either, because only the references to the points would be copied, not the point objects themselves.
This implementation maintains thread safety in part by copying mutable data before returning it to the client. This is usually not a performance issue, but could become one if the set of vehicles is very large.[4] Another consequence of copying the data on each call to getLocation is that the contents of the returned collection do not change even if the underlying locations change. Whether this is good or bad depends on your requirements. It could be a benefit if there are internal consistency requirements on the location set, in which case returning a consistent snapshot is critical, or a drawback if callers require up-to-date information for each vehicle and therefore need to refresh their snapshot more often.
[4] BecausedeepCopy is called from a synchronized method, the tracker's intrinsic lock is held for the duration of what might be a long-running copy operation, and this could degrade the responsiveness of the user interface when many vehicles are being tracked.