The "Double-Checked Locking is Broken" Declaration

Signed by: David Bacon (IBM Research), Joshua Bloch (Javasoft), Jeff Bogda, Cliff Click (Hotspot JVM project), Paul Haahr, Doug Lea, Tom May, Jan-Willem Maessen, Jeremy Manson, John D. Mitchell (jGuru), Kelvin Nilsen, Bill Pugh, Emin Gun Sirer

Double-Checked Locking is widely cited and used as an efficient method for implementing lazy initialization in a multithreaded environment.

Unfortunately, it will not work reliably in a platform-independent way when implemented in Java without additional synchronization. When implemented in other languages, such as C++, it depends on the memory model of the processor, the reorderings performed by the compiler, and the interaction between the compiler and the synchronization library. Since none of these are specified in a language such as C++, little can be said about the situations in which it will work. Explicit memory barriers can be used to make it work in C++, but these barriers are not available in Java.

To first explain the desired behavior, consider the following code:

// Single threaded version
class Foo { 
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null) 
        helper = new Helper();
    return helper;
    }
  // other functions and members...
  }

If this code were used in a multithreaded context, many things could go wrong. Most obviously, two or more Helper objects could be allocated. (We'll bring up other problems later.) The fix to this is simply to synchronize the getHelper() method:

// Correct multithreaded version
class Foo { 
  private Helper helper = null;
  public synchronized Helper getHelper() {
    if (helper == null) 
        helper = new Helper();
    return helper;
    }
  // other functions and members...
  }

The code above performs synchronization every time getHelper() is called. The double-checked locking idiom tries to avoid synchronization after the helper is allocated:

// Broken multithreaded version
// "Double-Checked Locking" idiom
class Foo { 
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null) 
      synchronized(this) {
        if (helper == null) 
          helper = new Helper();
      }    
    return helper;
    }
  // other functions and members...
  }

Unfortunately, that code just does not work in the presence of either optimizing compilers or shared memory multiprocessors.

It doesn't work

There are lots of reasons it doesn't work. The first couple of reasons we'll describe are more obvious. After understanding those, you may be tempted to try to devise a way to "fix" the double-checked locking idiom. Your fixes will not work: there are more subtle reasons why your fix won't work. Understand those reasons, come up with a better fix, and it still won't work, because there are even more subtle reasons.

Lots of very smart people have spent lots of time looking at this. There is no way to make it work without requiring each thread that accesses the helper object to perform synchronization.

The first reason it doesn't work

The most obvious reason it doesn't work is that the writes that initialize the Helper object and the write to the helper field can be done or perceived out of order. Thus, a thread which invokes getHelper() could see a non-null reference to a helper object, but see the default values for fields of the helper object, rather than the values set in the constructor.

If the compiler inlines the call to the constructor, then the writes that initialize the object and the write to the helper field can be freely reordered if the compiler can prove that the constructor cannot throw an exception or perform synchronization.

Even if the compiler does not reorder those writes, on a multiprocessor the processor or the memory system may reorder those writes, as perceived by a thread running on another processor.
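The effect of that reordering can be imitated in a single thread by writing the steps out by hand in the reordered order. The sketch below is illustrative only: the names PointHelper, ReorderingSketch, and observeBetweenWrites, the field x, and the value 42 are all hypothetical, and the read in the middle merely stands in for what a second thread might observe between the two writes.

```java
// Hypothetical helper whose field is assigned after allocation,
// the way an inlined constructor body would assign it.
class PointHelper {
    int x; // holds the default value 0 until the "constructor" write happens
}

class ReorderingSketch {
    static PointHelper shared; // plays the role of the helper field

    // Performs the writes in the reordered order and returns what an
    // observer would read between the publication and the field write.
    static int observeBetweenWrites() {
        PointHelper h = new PointHelper(); // 1. allocate; fields hold defaults
        shared = h;                        // 2. publish the reference (moved early)
        int seen = shared.x;               // an observer reading here sees the default
        h.x = 42;                          // 3. the "constructor" write lands late
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("observed x = " + observeBetweenWrites());
        System.out.println("final x    = " + shared.x);
    }
}
```

In real double-checked locking the observer is another thread and the window is a matter of timing, which is why the failure is intermittent and hard to reproduce.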

Doug Lea has written a more detailed description of compiler-based reorderings.

A test case showing that it doesn't work

Paul Jakubik found an example of a use of double-checked locking that did not work correctly. A slightly cleaned up version of that code is available here.

When run on a system using the Symantec JIT, it doesn't work. In particular, the Symantec JIT compiles

singletons[i].reference = new Singleton();

to the following (note that the Symantec JIT uses a handle-based object allocation system).

0206106A   mov         eax,0F97E78h
0206106F   call        01F6B210                  ; allocate space for
                                                 ; Singleton, return result in eax
02061074   mov         dword ptr [ebp],eax       ; EBP is &singletons[i].reference 
                                                ; store the unconstructed object here.
02061077   mov         ecx,dword ptr [eax]       ; dereference the handle to
                                                 ; get the raw pointer
02061079   mov         dword ptr [ecx],100h      ; Next 4 lines are
0206107F   mov         dword ptr [ecx+4],200h    ; Singleton's inlined constructor
02061086   mov         dword ptr [ecx+8],400h
0206108D   mov         dword ptr [ecx+0Ch],0F84030h

As you can see, the assignment to singletons[i].reference is performed before the constructor for Singleton is called. This is completely legal under the existing Java memory model, and also legal in C and C++ (since neither of them has a memory model).

A fix that doesn't work

Given the explanation above, a number of people have suggested the following code:

// (Still) Broken multithreaded version
// "Double-Checked Locking" idiom
class Foo { 
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      Helper h;
      synchronized(this) {
        h = helper;
        if (h == null) 
            synchronized (this) {
              h = new Helper();
            } // release inner synchronization lock
        helper = h;
        } 
      }    
    return helper;
    }
  // other functions and members...
  }

This code puts construction of the Helper object inside an inner synchronized block. The intuitive idea here is that there should be a memory barrier at the point where synchronization is released, and that should prevent the reordering of the initialization of the Helper object and the assignment to the field helper.

Unfortunately, that intuition is absolutely wrong. The rules for synchronization don't work that way. The rule for a monitorexit (i.e., releasing synchronization) is that actions before the monitorexit must be performed before the monitor is released. However, there is no rule which says that actions after the monitorexit may not be done before the monitor is released. It is perfectly reasonable and legal for the compiler to move the assignment helper = h; inside the synchronized block, in which case we are back where we were previously. Many processors offer instructions that perform this kind of one-way memory barrier. Changing the semantics to require releasing a lock to be a full memory barrier would have performance penalties.
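To make the transformation concrete, here is a hand-written sketch of what the "fixed" code may legally become once the assignment is moved inside the inner synchronized block. The class name MovedAssignmentFoo and the empty nested Helper are hypothetical stand-ins, and a real compiler performs this motion on its intermediate code, not on source. The result is still broken for the same reason as the original idiom.

```java
// (Still) broken: the "fix" after the compiler has moved
// helper = h; into the inner synchronized block.
class MovedAssignmentFoo {
    static class Helper { } // hypothetical stand-in for the article's Helper

    private Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            Helper h;
            synchronized (this) {
                h = helper;
                if (h == null) {
                    synchronized (this) {
                        h = new Helper();
                        // Moved up from after the inner block. The write to
                        // helper and the writes inside the constructor now sit
                        // in the same block and may again be reordered.
                        helper = h;
                    } // release inner synchronization lock
                }
            }
        }
        return helper;
    }
}
```

Single-threaded behavior is unchanged, which is exactly why the move is permitted: a one-way barrier only stops earlier actions from drifting past the release, not later actions from drifting in.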

More fixes that don't work

There is something you can do to force the writer to perform a full bidirectional memory barrier. This is gross, inefficient, and is almost guaranteed not to work once the Java Memory Model is revised. Do not use this. In the interests of science, I've put a description of this technique on a separate page. Do not use it.

However, even with a full memory barrier being performed by the thread that initializes the helper object, it still doesn't work.

The problem is that on some systems, the thread which sees a non-null value for the helper field also needs to perform memory barriers.

Why? Because processors have their own locally cached copies of memory. On some processors, unless the processor performs a cache coherence instruction (e.g., a memory barrier), reads can be performed out of stale locally cached copies, even if other processors used memory barriers to force their writes into global memory.

I've created a separate web page with a discussion of how this can actually happen on an Alpha processor.

Is it worth the trouble?

For most applications, the cost of simply making the getHelper() method synchronized is not high. You should only consider this kind of detailed optimization if you know that the synchronization is causing substantial overhead for an application.

Very often, more high-level cleverness, such as using the builtin mergesort rather than handling exchange sort (see the SPECJVM DB benchmark), will have much more impact.

Making it work for static singletons

If the singleton you are creating is static (i.e., there will only be one Helper created), as opposed to a property of another object (e.g., there will be one Helper for each Foo object), there is a simple and elegant solution.

Just define the singleton as a static field in a separate class. The semantics of Java guarantee that the field will not be initialized until the field is referenced, and that any thread which accesses the field will see all of the writes resulting from initializing that field.

class HelperSingleton {
  static Helper singleton = new Helper();
  }
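A slightly fuller sketch of the same idea may help show the laziness. It assumes hypothetical names (LazyHelper, LazyHelperAccess, Holder, getHelper) and adds a construction counter purely for illustration; none of these appear in the original code.

```java
// Hypothetical helper that counts how many times it has been constructed.
class LazyHelper {
    static int constructed = 0; // illustration only

    LazyHelper() {
        constructed++; // runs once, during Holder's class initialization
    }
}

class LazyHelperAccess {
    // The separate class: it is not initialized until getHelper() first
    // touches Holder.INSTANCE, and class initialization is thread-safe.
    private static class Holder {
        static final LazyHelper INSTANCE = new LazyHelper();
    }

    static LazyHelper getHelper() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println("before first access: " + LazyHelper.constructed);
        LazyHelper a = getHelper();
        LazyHelper b = getHelper();
        System.out.println("same instance: " + (a == b));
        System.out.println("after two accesses: " + LazyHelper.constructed);
    }
}
```

Nesting the holder as a private static class keeps the field out of reach of other code, so the only way to trigger initialization is through the accessor.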

It will work for 32-bit primitive values

Although the double-checked locking idiom cannot be used for references to objects, it can work for 32-bit primitive values (e.g., ints or floats). Note that it does not work for longs or doubles, since unsynchronized reads/writes of 64-bit primitives are not guaranteed to be atomic.

// Correct Double-Checked Locking for 32-bit primitives
class Foo { 
  private int cachedHashCode = 0;
  public int hashCode() {
    int h = cachedHashCode;
    if (h == 0) 
    synchronized(this) {
      if (cachedHashCode != 0) return cachedHashCode;
      h = computeHashCode();
      cachedHashCode = h;
      }
    return h;
    }
  // other functions and members...
  }

In fact, assuming that the computeHashCode function always returned the same result and had no side effects (i.e., idempotent), you could even get rid of all of the synchronization.

// Lazy initialization 32-bit primitives
// Thread-safe if computeHashCode is idempotent
class Foo { 
  private int cachedHashCode = 0;
  public int hashCode() {
    int h = cachedHashCode;
    if (h == 0) {
      h = computeHashCode();
      cachedHashCode = h;
      }
    return h;
    }
  // other functions and members...
  }

Making it work with explicit memory barriers

It is possible to make the double-checked locking pattern work if you have explicit memory barrier instructions. For example, if you are programming in C++, you can use the code from Doug Schmidt et al.'s book:

// C++ implementation with explicit memory barriers
// Should work on any platform, including DEC Alphas
// From "Patterns for Concurrent and Distributed Objects",
// by Doug Schmidt
template <class TYPE, class LOCK> TYPE *
Singleton<TYPE, LOCK>::instance (void) {
    // First check
    TYPE* tmp = instance_;
    // Insert the CPU-specific memory barrier instruction
    // to synchronize the cache lines on multi-processor.
    asm ("memoryBarrier");
    if (tmp == 0) {
        // Ensure serialization (guard
        // constructor acquires lock_).
        Guard<LOCK> guard (lock_);
        // Double check.
        tmp = instance_;
        if (tmp == 0) {
                tmp = new TYPE;
                // Insert the CPU-specific memory barrier instruction
                // to synchronize the cache lines on multi-processor.
                asm ("memoryBarrier");
                instance_ = tmp;
        }
    }
    return tmp;
}

Fixing Double-Checked Locking using Thread Local Storage

Alexander Terekhov came up with a clever suggestion for implementing double-checked locking using thread local storage. Each thread keeps a thread local flag to determine whether that thread has done the required synchronization.

class Foo {
  /** If perThreadInstance.get() returns a non-null value, this thread
      has done synchronization needed to see initialization
      of helper */
  private final ThreadLocal perThreadInstance = new ThreadLocal();
  private Helper helper = null;
  public Helper getHelper() {
    if (perThreadInstance.get() == null) createHelper();
    return helper;
    }
  private final void createHelper() {
    synchronized(this) {
      if (helper == null)
        helper = new Helper();
      }
    // Any non-null value would do as the argument here
    perThreadInstance.set(perThreadInstance);
    }
  }

The performance of this technique depends quite a bit on which JDK implementation you have. In Sun's 1.2 implementation, ThreadLocals were very slow. They are significantly faster in 1.3, and are expected to be faster still in 1.4. Doug Lea analyzed the performance of some techniques for implementing lazy initialization.

Under the new Java Memory Model

As of JDK5, there is a new Java Memory Model and Thread specification.

Fixing Double-Checked Locking using Volatile

JDK5 and later extend the semantics for volatile so that the system will not allow a write of a volatile to be reordered with respect to any previous read or write, and a read of a volatile cannot be reordered with respect to any following read or write. See this entry in Jeremy Manson's blog for more details.

With this change, the Double-Checked Locking idiom can be made to work by declaring the helper field to be volatile. This does not work under JDK4 and earlier.

// Works with the JDK5 acquire/release semantics for volatile
// Broken under the pre-JDK5 semantics for volatile
  class Foo {
        private volatile Helper helper = null;
        public Helper getHelper() {
            if (helper == null) {
                synchronized(this) {
                    if (helper == null)
                        helper = new Helper();
                }
            }
            return helper;
        }
    }
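The volatile version can be exercised from several threads with a small harness. This is only a sketch under stated assumptions: it needs Java 8 or later for the lambda, and the names VolatileDclDemo and callConcurrently, the AtomicInteger construction counter, and the thread count of 8 are all hypothetical additions. A passing run shows the idiom behaving as intended on one machine; it cannot prove that a broken variant is safe, since those failures depend on timing and hardware.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class VolatileDclDemo {
    // Counts Helper constructions (illustration only).
    static final AtomicInteger constructed = new AtomicInteger();

    static class Helper {
        Helper() { constructed.incrementAndGet(); }
    }

    static class Foo {
        private volatile Helper helper = null;

        public Helper getHelper() {
            if (helper == null) {
                synchronized (this) {
                    if (helper == null)
                        helper = new Helper();
                }
            }
            return helper;
        }
    }

    // Releases all threads at once, has each call getHelper(), and
    // returns what each thread saw.
    static Helper[] callConcurrently(Foo foo, int threads) {
        Helper[] results = new Helper[threads];
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                results[slot] = foo.getHelper();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join(); // join also makes each worker's result visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Helper[] results = callConcurrently(new Foo(), 8);
        boolean allSame = true;
        for (Helper h : results) allSame &= (h != null && h == results[0]);
        System.out.println("all threads saw the same helper: " + allSame);
        System.out.println("helpers constructed: " + constructed.get());
    }
}
```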

Double-Checked Locking Immutable Objects

If Helper is an immutable object, such that all of the fields of Helper are final, then double-checked locking will work without having to use volatile fields. The idea is that a reference to an immutable object (such as a String or an Integer) should behave in much the same way as an int or float; reading and writing references to immutable objects are atomic.
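A minimal sketch of this variant follows, assuming the JDK5 final field semantics. The class name ImmutableDclFoo, the fields id and name, and the constructor arguments are hypothetical. One design choice here goes beyond the text above and is an assumption of this sketch: the field is read once into a local variable and the local is returned, so the method never performs a second unsynchronized read of the non-volatile field.

```java
class ImmutableDclFoo {
    // Hypothetical immutable helper: every field is final.
    static final class Helper {
        final int id;
        final String name;

        Helper(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    private Helper helper = null; // deliberately not volatile

    public Helper getHelper() {
        Helper h = helper; // single unsynchronized read into a local
        if (h == null) {
            synchronized (this) {
                h = helper; // re-check under the lock
                if (h == null) {
                    h = new Helper(1, "default");
                    helper = h;
                }
            }
        }
        return h; // return the local, not a second read of the field
    }
}
```

The guarantee covers only the final fields and whatever is reachable through them; if Helper gained a non-final field, the idiom would need volatile again.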

Descriptions of double-check idiom

  • Reality Check, Douglas C. Schmidt, C++ Report, SIGS, Vol. 8, No. 3, March 1996.
  • Double-Checked Locking: An Optimization Pattern for Efficiently Initializing and Accessing Thread-safe Objects, Douglas Schmidt and Tim Harrison, 3rd annual Pattern Languages of Program Design conference, 1996.
  • Lazy instantiation, Philip Bishop and Nigel Warren, JavaWorld Magazine.
  • Programming Java threads in the real world, Part 7, Allen Holub, JavaWorld Magazine, April 1999.
  • Java 2 Performance and Idiom Guide, Craig Larman and Rhett Guthrie, p. 100.
  • Java in Practice: Design Styles and Idioms for Effective Java, Nigel Warren and Philip Bishop, p. 142.
  • Rule 99, The Elements of Java Style, Allan Vermeulen, Scott Ambler, Greg Bumgardner, Eldon Metz, Trevor Misfeldt, Jim Shur, Patrick Thompson, SIGS Reference Library.
  • Global Variables in Java with the Singleton Pattern, Wiebe de Jong, Gamelan.
