Java ThreadLocal

References

  • A Painless Introduction to Java's ThreadLocal Storage
  • Understanding ThreadLocal in Java (理解Java中的ThreadLocal)
  • Understanding ThreadLocal Correctly (正确理解ThreadLocal)

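Purpose

Misconceptions

Many resources get this wrong: they claim that ThreadLocal exists to solve shared-object access or thread-safety problems. It does not. Nor is any object copied: the per-thread values are simply stored in the ThreadLocalMap threadLocals field of the Thread object itself.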
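
To make the "no copy" point concrete, here is a minimal sketch. The class name SharedReferenceDemo and its methods are hypothetical and do not come from the referenced articles. Two threads store the same list in one ThreadLocal; both end up holding a reference to the same object, so the list still needs its own synchronization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative.
class SharedReferenceDemo {

    // One ThreadLocal instance, visible to every thread.
    private static final ThreadLocal<List<String>> HOLDER = new ThreadLocal<>();

    // Stores the SAME list in the ThreadLocal from two threads
    // and returns the final size of that list.
    static int appendFromTwoThreads() {
        List<String> shared = new ArrayList<>();
        Runnable task = () -> {
            HOLDER.set(shared);          // stores a reference, not a copy
            synchronized (shared) {      // the shared list still needs a lock
                HOLDER.get().add(Thread.currentThread().getName());
            }
            HOLDER.remove();
        };
        Thread a = new Thread(task, "a");
        Thread b = new Thread(task, "b");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println("Elements in the shared list: " + appendFromTwoThreads());
    }
}
```

Both threads append to one list, so the final size is 2. ThreadLocal isolates the slot each thread writes to, not the object placed in that slot.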

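The correct understanding

I first came across the thread-local concept in Python's Flask framework, which provides an object named g (documentation: flask.g).
It can be imported directly with from flask import g. When some data is needed in several places, you assign it as an attribute (for example g.user = user) and read it back later with g.user or g.get('user'). What looks like magic is that in a multithreaded environment every piece of code refers to g in exactly this way, yet data belonging to different threads never overwrites each other. Under the hood, g relies on a thread-local style (context-local) mechanism.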

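So, as I understand it, ThreadLocal is mainly a convenient way to provide data that many threads need to access while giving each thread its own exclusive instance. With an ordinary global variable every thread can reach the data, but the threads overwrite each other's values. With a thread local, you get one instance per thread without passing extra parameters around. In other words, the variable is visible to all threads (global), but each thread owns its own private value (local).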

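Without ThreadLocal, similar behavior can be achieved with a global static Map keyed by thread. ThreadLocal exists to simplify exactly this, and it is also more efficient because each thread's values live in the Thread object itself rather than in one shared map, so simply use ThreadLocal.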
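
For comparison, a hand-rolled version of the Map approach might look like the sketch below. MapBasedLocal and MapBasedLocalDemo are hypothetical names invented for this illustration, and the use of ConcurrentHashMap is an assumption; the original text only says "a global static Map".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo; the class with main comes first so the file can be run directly.
class MapBasedLocalDemo {
    static final MapBasedLocal<String> LOCAL = new MapBasedLocal<>();

    // Returns { worker value before set, worker value after set, main value }.
    static String[] run() {
        LOCAL.set("main");
        String[] seenByWorker = new String[2];
        Thread worker = new Thread(() -> {
            seenByWorker[0] = LOCAL.get();   // nothing stored for this thread yet
            LOCAL.set("worker");
            seenByWorker[1] = LOCAL.get();
            LOCAL.remove();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        String mainValue = LOCAL.get();      // untouched by the worker
        LOCAL.remove();
        return new String[] { seenByWorker[0], seenByWorker[1], mainValue };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " / " + r[1] + " / " + r[2]);
    }
}

// A thread-keyed map that imitates ThreadLocal's set/get/remove.
class MapBasedLocal<T> {
    // Keyed by thread. Entries stay here until remove() is called,
    // even after the thread has finished.
    private final Map<Thread, T> values = new ConcurrentHashMap<>();

    void set(T value) { values.put(Thread.currentThread(), value); } // null values are rejected
    T get() { return values.get(Thread.currentThread()); }
    void remove() { values.remove(Thread.currentThread()); }
}
```

This works, but every access goes through one shared map, and an entry that is never removed keeps its Thread object reachable. ThreadLocal sidesteps both issues by storing the values inside each Thread.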

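An example scenario (similar to the flask.g object):

  • Each request is handled by one thread
  • Several places in the request handling need the same piece of data (for example before_request, request_handling, and post_request)

One approach that looks workable is to set a global variable in the request-handling code, but then different threads read and modify the same global variable. ThreadLocal avoids this problem cleanly, and we do not have to maintain our own thread-keyed Map to look up each thread's data.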
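
The scenario above can be sketched roughly as follows. RequestContextDemo, its CURRENT_USER field, and the three hook methods are hypothetical names chosen to mirror the before_request / request_handling / post_request steps; they are not part of any framework.

```java
// Hypothetical sketch of a per-request context, one thread per request.
class RequestContextDemo {

    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // before_request: store per-request data for the current thread.
    static void beforeRequest(String user) {
        CURRENT_USER.set(user);
    }

    // request_handling: read the data without receiving it as a parameter.
    static String handleRequest() {
        return "Hello, " + CURRENT_USER.get();
    }

    // post_request: clean up so nothing lingers on this thread.
    static void afterRequest() {
        CURRENT_USER.remove();
    }

    // Runs the three steps on the calling thread.
    static String serve(String user) {
        beforeRequest(user);
        try {
            return handleRequest();
        } finally {
            afterRequest();
        }
    }

    // Serves two requests on two concurrently running threads.
    static String[] serveConcurrently(String userA, String userB) {
        String[] responses = new String[2];
        Thread a = new Thread(() -> { responses[0] = serve(userA); });
        Thread b = new Thread(() -> { responses[1] = serve(userB); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return responses;
    }

    public static void main(String[] args) {
        String[] r = serveConcurrently("alice", "bob");
        System.out.println(r[0]);
        System.out.println(r[1]);
    }
}
```

Each thread sees only the user it stored itself, even though both threads go through the same static field.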

ThreadLocal example

The TransactionManager class:
Note that threadLocal is a static ThreadLocal variable, which means all threads access the same ThreadLocal object.

package multithreading.threadlocal;

/**
 * Created by xiaofu on 17-11-15.
 * https://dzone.com/articles/painless-introduction-javas-threadlocal-storage
 */
public class TransactionManager {

    private static final ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void newTransaction(){
        // Generate a new transaction id
        String id = "" + System.currentTimeMillis();
        threadLocal.set(id);
    }

    public static void endTransaction(){
        // Avoid memory leaks
        threadLocal.remove();
    }

    public static String getTransactionID(){
        return threadLocal.get();
    }


}

The ThreadLocalTest class:

package multithreading.threadlocal;

/**
 * Created by xiaofu on 17-11-15.
 * https://dzone.com/articles/painless-introduction-javas-threadlocal-storage
 */
public class ThreadLocalTest {

    public static class Task implements Runnable{

        private String name;

        public Task(String name){this.name = name;}

        @Override
        public void run() {
            TransactionManager.newTransaction();
            System.out.printf("Task %s transaction id: %s\n", name, TransactionManager.getTransactionID());
            TransactionManager.endTransaction();
        }
    }

    public static void main(String[] args) throws InterruptedException {

        // Use TransactionManager on the main thread first
        TransactionManager.newTransaction();
        System.out.println("Main transaction id: " + TransactionManager.getTransactionID());

        String taskName = "[Begin a new transaction]";
        Thread thread = new Thread(new Task(taskName));
        thread.start();
        thread.join();

        System.out.println(String.format("Task %s is done", taskName));
        System.out.println("Main transaction id: " + TransactionManager.getTransactionID());
        TransactionManager.endTransaction();

    }

}

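Test output:
The key point is that the value returned by getTransactionID() on the main thread does not change, even though another thread set the ThreadLocal variable in TransactionManager in the meantime.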

Main transaction id: 1510730858223
Task [Begin a new transaction] transaction id: 1510730858224
Task [Begin a new transaction] is done
Main transaction id: 1510730858223

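As the output shows, operations by different threads on the same ThreadLocal variable do not affect each other. The ThreadLocal variable itself is global to all threads, but the data it stores is tied to the calling thread.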

How it works

The set method of the ThreadLocal class:

    /**
     * Sets the current thread's copy of this thread-local variable
     * to the specified value.  Most subclasses will have no need to
     * override this method, relying solely on the {@link #initialValue}
     * method to set the values of thread-locals.
     *
     * @param value the value to be stored in the current thread's copy of
     *        this thread-local.
     */
    public void set(T value) {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
    }

The getMap() method:

    /**
     * Get the map associated with a ThreadLocal. Overridden in
     * InheritableThreadLocal.
     *
     * @param  t the current thread
     * @return the map
     */
    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }

The code above shows that the Thread class has a field named threadLocals.

public class Thread implements Runnable {
    // ... omitted ...
    /* ThreadLocal values pertaining to this thread. This map is maintained
     * by the ThreadLocal class. */
    ThreadLocal.ThreadLocalMap threadLocals = null;
    // ... omitted ...
}

ThreadLocalMap is a static nested class of ThreadLocal:
Only the code needed to understand how ThreadLocal works is excerpted below:

/**
     * ThreadLocalMap is a customized hash map suitable only for
     * maintaining thread local values. No operations are exported
     * outside of the ThreadLocal class. The class is package private to
     * allow declaration of fields in class Thread.  To help deal with
     * very large and long-lived usages, the hash table entries use
     * WeakReferences for keys. However, since reference queues are not
     * used, stale entries are guaranteed to be removed only when
     * the table starts running out of space.
     */
    static class ThreadLocalMap{
    
        /**
         * The entries in this hash map extend WeakReference, using
         * its main ref field as the key (which is always a
         * ThreadLocal object).  Note that null keys (i.e. entry.get()
         * == null) mean that the key is no longer referenced, so the
         * entry can be expunged from table.  Such entries are referred to
         * as "stale entries" in the code that follows.
         */
        static class Entry extends WeakReference<ThreadLocal<?>> {
            /** The value associated with this ThreadLocal. */
            Object value;

            Entry(ThreadLocal<?> k, Object v) {
                super(k);
                value = v;
            }
        }
        
        /**
         * The table, resized as necessary.
         * table.length MUST always be a power of two.
         */
        private Entry[] table;
        

    
        /**
         * Set the value associated with key.
         *
         * @param key the thread local object
         * @param value the value to be set
         */
        private void set(ThreadLocal<?> key, Object value) {

            // We don't use a fast path as with get() because it is at
            // least as common to use set() to create new entries as
            // it is to replace existing ones, in which case, a fast
            // path would fail more often than not.

            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);

            for (Entry e = tab[i];
                 e != null;
                 e = tab[i = nextIndex(i, len)]) {
                ThreadLocal<?> k = e.get();

                if (k == key) {
                    e.value = value;
                    return;
                }

                if (k == null) {
                    replaceStaleEntry(key, value, i);
                    return;
                }
            }

            tab[i] = new Entry(key, value);
            int sz = ++size;
            if (!cleanSomeSlots(i, sz) && sz >= threshold)
                rehash();
        }
    }

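So ThreadLocal's set method actually stores the value in the threadLocals field of the thread that calls it, and that field has type ThreadLocalMap. Note the set method of ThreadLocalMap: its key is a ThreadLocal object of any type parameter, so a ThreadLocalMap holds ThreadLocal -> value pairs. A single thread may use several ThreadLocal objects, which is why a ThreadLocalMap is used to manage the values. This also explains the statement map.set(this, value); in ThreadLocal's set method.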
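
The ownership structure described here can be reduced to a toy model. The sketch below is not the JDK implementation: ToyThread, ToyLocal, and ToyModelDemo are hypothetical classes, the thread is passed explicitly instead of being obtained from Thread.currentThread(), and a plain HashMap stands in for the weak-keyed, open-addressing ThreadLocalMap.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical classes, NOT the JDK implementation.
class ToyModelDemo {
    static String run() {
        ToyThread t1 = new ToyThread();
        ToyThread t2 = new ToyThread();
        ToyLocal<String> user = new ToyLocal<>();
        ToyLocal<Integer> retries = new ToyLocal<>();

        user.set(t1, "alice");   // t1's map: {user -> "alice"}
        retries.set(t1, 3);      // t1's map: {user -> "alice", retries -> 3}
        user.set(t2, "bob");     // t2's map: {user -> "bob"}

        return user.get(t1) + "," + retries.get(t1) + ","
                + user.get(t2) + "," + retries.get(t2);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints alice,3,bob,null
    }
}

class ToyThread {
    // Plays the role of Thread.threadLocals: one map per thread.
    final Map<ToyLocal<?>, Object> locals = new HashMap<>();
}

class ToyLocal<T> {
    // Mirrors map.set(this, value): the ToyLocal itself is the key.
    void set(ToyThread thread, T value) {
        thread.locals.put(this, value);
    }

    // Mirrors map.getEntry(this): look this key up in the given thread's map.
    @SuppressWarnings("unchecked")
    T get(ToyThread thread) {
        return (T) thread.locals.get(this);
    }
}
```

The map belongs to the thread and the ThreadLocal is only a key, which is why one thread can hold values for many ThreadLocal objects and why a value set on one thread is invisible to another.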

Now look at the get method of the ThreadLocal class:

/**
     * Returns the value in the current thread's copy of this
     * thread-local variable.  If the variable has no value for the
     * current thread, it is first initialized to the value returned
     * by an invocation of the {@link #initialValue} method.
     *
     * @return the current thread's value of this thread-local
     */
    public T get() {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue();
    }

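It simply obtains the current thread, takes that thread's threadLocals field (of type ThreadLocalMap), and passes this to look up the entry in threadLocals whose key is the current ThreadLocal object. If the thread has no map or no entry yet, get() falls back to setInitialValue(), which stores and returns the result of initialValue() (null by default).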
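
The initial-value fallback in get() is easiest to see with ThreadLocal.withInitial, available since Java 8. The sketch below is illustrative; InitialValueDemo and its fields are hypothetical names.

```java
// Hypothetical demo of the initial-value path taken by get().
class InitialValueDemo {

    // No initial value configured: the first get() on a thread returns null.
    private static final ThreadLocal<String> PLAIN = new ThreadLocal<>();

    // Each thread lazily receives its own StringBuilder on its first get().
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    // Returns { plain default, first append, second append, append on a worker thread }.
    static String[] run() {
        BUFFER.remove();                                      // start from a clean slot
        PLAIN.remove();
        String plainDefault = PLAIN.get();                    // initialValue() returns null
        String first = BUFFER.get().append("a").toString();   // fresh builder for this thread
        String second = BUFFER.get().append("b").toString();  // same builder again

        String[] fromWorker = new String[1];
        Thread worker = new Thread(() -> {
            // A different thread triggers the supplier again and gets a new builder.
            fromWorker[0] = BUFFER.get().append("c").toString();
            BUFFER.remove();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        BUFFER.remove();
        PLAIN.remove();
        return new String[] { plainDefault, first, second, fromWorker[0] };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
    }
}
```

The calling thread reuses one builder across both appends, while the worker thread starts with an empty one of its own.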

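About memory leaks

  • An In-Depth Analysis of the ThreadLocal Memory Leak Problem (深入分析 ThreadLocal 内存泄漏问题)

The cause is explained clearly in the article above.
Next, a practical note on ThreadLocal.remove(). In some situations skipping remove() does not cause a memory leak, so the call may seem optional. Skipping it can still cause other problems, for example when threads are reused by a thread pool. If a thread does not call remove() after using a ThreadLocal, the next task that runs on that thread (possibly an unrelated one) may read a value set by an earlier task.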
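
The thread-pool pitfall can be reproduced with a single-thread executor, which guarantees that every task runs on the same worker thread. This is a minimal sketch; PoolReuseDemo, CURRENT_USER, and the task names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo of a stale ThreadLocal value on a reused pool thread.
class PoolReuseDemo {

    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Returns { value seen after a task without remove(), value seen after a task with remove() }.
    static String[] run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Sets a value and forgets to clean up.
            Runnable leakyTask = () -> CURRENT_USER.set("alice");

            // Sets a value and always removes it in a finally block.
            Runnable tidyTask = () -> {
                CURRENT_USER.set("bob");
                try {
                    // ... task work that reads CURRENT_USER would go here ...
                } finally {
                    CURRENT_USER.remove();
                }
            };

            // An unrelated later task that only reads the value.
            Callable<String> reader = () -> CURRENT_USER.get();

            pool.submit(leakyTask).get();
            String afterLeaky = pool.submit(reader).get();  // stale value from leakyTask
            pool.submit(tidyTask).get();
            String afterTidy = pool.submit(reader).get();   // nothing left behind
            return new String[] { afterLeaky, afterTidy };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println("After task without remove(): " + r[0]);
        System.out.println("After task with remove():    " + r[1]);
    }
}
```

The reader task sees "alice" after the leaky task and null after the tidy one. Wrapping the work in try/finally with remove() in the finally block is a common way to make sure the cleanup runs even when the task throws.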
