5.6 Building an Efficient, Scalable Result Cache
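Caching appears in the CPU itself: the translation lookaside buffer (TLB) speeds up virtual-to-physical address translation by comparing the page number against every TLB entry in parallel, by content, to obtain the physical page number. Computers also have multi-level caches (L1, L2, L3 and so on) to speed up computation.
Caching relies on the principle of locality: a result computed earlier is likely to be needed again in the near future.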
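Nearly every server application uses some form of caching. Think of Google or Baidu: how long would a search take without a cache? Reusing earlier results lowers latency and raises throughput, but it trades space for time and needs more memory.
The examples below build an efficient, scalable result cache step by step.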
Example 1
Implementing the cache with HashMap and synchronization
package com.imeiren.cache;

public interface Computable<A, V> {
    V compute(A arg) throws InterruptedException;
}

package com.imeiren.cache;

import java.math.BigInteger;

public class ExpensiveFunction implements Computable<String, BigInteger> {
    @Override
    public BigInteger compute(String arg) throws InterruptedException {
        // Simulate a long-running computation
        Thread.sleep(2000);
        return new BigInteger(arg);
    }
}

package com.imeiren.cache;

import java.util.HashMap;
import java.util.Map;

public class Memorizer1<A, V> implements Computable<A, V> {
    private final Map<A, V> cache = new HashMap<A, V>();
    private final Computable<A, V> c;

    public Memorizer1(Computable<A, V> c) {
        this.c = c;
    }

    // The whole method is synchronized, so the lock is held during the computation too
    @Override
    public synchronized V compute(A arg) throws InterruptedException {
        V result = cache.get(arg);
        if (result == null) {
            result = c.compute(arg);
            cache.put(arg, result);
        }
        return result;
    }
}
package com.imeiren.cache;

import java.math.BigInteger;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class Worker implements Runnable {
    Memorizer1<String, BigInteger> memorizer;

    public Worker(Memorizer1<String, BigInteger> memorizer) {
        this.memorizer = memorizer;
    }

    @Override
    public void run() {
        long start = System.nanoTime();
        String arg = "0958723945872304570198354";
        try {
            BigInteger result = memorizer.compute(arg);
            long end = System.nanoTime();
            System.out.println(Thread.currentThread() + ":" + (end - start) + "ns");
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

public class Memorizer1Tester {
    public static void main(String[] args) {
        Computable<String, BigInteger> c = new ExpensiveFunction();
        final Memorizer1<String, BigInteger> memorizer = new Memorizer1<String, BigInteger>(c);
        ExecutorService executorService = Executors.newCachedThreadPool();
        for (int i = 0; i < 5; i++) {
            executorService.execute(new Worker(memorizer));
            try {
                // Wait longer than one computation so the tasks never overlap
                Thread.sleep(2500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        executorService.shutdown();
    }
}

This example uses a HashMap to cache the result of a simulated expensive computation. To measure what the cache buys us, the main thread submits one task to the thread pool every 2.5 s. The simulated computation takes 2 s, so the submitted tasks effectively run one after another. One run produced:
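Thread[pool-1-thread-1,5,main]:1999862757ns
Thread[pool-1-thread-1,5,main]:3871ns
Thread[pool-1-thread-1,5,main]:3519ns
Thread[pool-1-thread-1,5,main]:3871ns
Thread[pool-1-thread-1,5,main]:3520ns

The thread names show that the pool used a single thread for every submitted task: each task had finished before the next one arrived, so the thread was reused. This confirms that the tasks ran serially. The timings show the benefit of the cache: the first task took about 2 s, while each of the following four finished in under 4,000 ns.
Our real concern is concurrency, though, so let us see how Example 1 behaves under concurrent load. We remove the sleep from the main thread so that all tasks are submitted to the pool almost simultaneously.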
ExecutorService executorService = Executors.newCachedThreadPool();
for (int i = 0; i < 5; i++) {
    executorService.execute(new Worker(memorizer));
}
executorService.shutdown();

In this case, one run produced:
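Thread[pool-1-thread-1,5,main]:1999793432ns
Thread[pool-1-thread-4,5,main]:1999486217ns
Thread[pool-1-thread-3,5,main]:1999660411ns
Thread[pool-1-thread-2,5,main]:1999791319ns
Thread[pool-1-thread-5,5,main]:1999348975ns

The thread names show that the pool started five threads for the five concurrent tasks, and each task took about 2 s from start to finish. Because the compute method of Memorizer1 is synchronized, only one thread at a time can enter it. The first thread holds the lock for the entire 2 s computation while the other four block; once the lock is released, each of them gets a cache hit. Had the five tasks used five different arguments, the computations would have run one after another and the last caller would have waited about 10 s. The class is thread-safe, but it has an obvious scalability problem. Can the cache serve several threads at the same time?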
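The cost of holding the lock during the computation is easiest to see with distinct keys. The following is a minimal sketch, not part of the original examples: the class `SynchronizedCacheDemo`, its nested `LockedCache`, and the helper `elapsedMillis` are hypothetical names, and the 200 ms delay is an arbitrary stand-in for the expensive function. It times several lookups of different keys against a cache with the same locking shape as Memorizer1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.function.Function;

public class SynchronizedCacheDemo {
    /** Same locking shape as Memorizer1: one lock guards the map and the computation. */
    static class LockedCache {
        private final Map<String, Integer> cache = new HashMap<>();
        private final Function<String, Integer> slow;

        LockedCache(Function<String, Integer> slow) {
            this.slow = slow;
        }

        synchronized Integer compute(String arg) {
            Integer result = cache.get(arg);
            if (result == null) {
                result = slow.apply(arg); // runs while the lock is held
                cache.put(arg, result);
            }
            return result;
        }
    }

    /** Looks up `threads` distinct keys concurrently and returns the elapsed milliseconds. */
    static long elapsedMillis(int threads, long delayMillis) throws InterruptedException {
        LockedCache cache = new LockedCache(arg -> {
            try {
                Thread.sleep(delayMillis); // hypothetical stand-in for an expensive computation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return arg.length();
        });
        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            String key = "key-" + i;
            new Thread(() -> {
                cache.compute(key);
                done.countDown();
            }).start();
        }
        done.await();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("elapsed ms: " + elapsedMillis(3, 200));
    }
}
```

With three threads and a 200 ms delay, the elapsed time should come out near the sum of the delays rather than near a single delay, because unrelated computations queue behind one lock.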
Example 2
Replacing HashMap with ConcurrentHashMap
package com.imeiren.cache;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Memorizer2<A, V> implements Computable<A, V> {
    private final Map<A, V> cache = new ConcurrentHashMap<A, V>();
    private final Computable<A, V> c;

    public Memorizer2(Computable<A, V> c) {
        this.c = c;
    }

    // No longer synchronized: ConcurrentHashMap makes each map operation thread-safe
    @Override
    public V compute(A arg) throws InterruptedException {
        V result = cache.get(arg);
        if (result == null) {
            // "未命中" means "cache miss"
            System.out.println(Thread.currentThread() + " 未命中");
            result = c.compute(arg);
            cache.put(arg, result);
        }
        return result;
    }
}

One run, with the tester switched to Memorizer2, produced:
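Thread[pool-1-thread-1,5,main] 未命中
Thread[pool-1-thread-2,5,main] 未命中
Thread[pool-1-thread-3,5,main] 未命中
Thread[pool-1-thread-4,5,main] 未命中
Thread[pool-1-thread-5,5,main] 未命中
Thread[pool-1-thread-1,5,main]:1999726569ns
Thread[pool-1-thread-2,5,main]:2000209736ns
Thread[pool-1-thread-3,5,main]:2000175953ns
Thread[pool-1-thread-4,5,main]:2000066158ns
Thread[pool-1-thread-5,5,main]:1999923284ns

Memorizer2 has better concurrent behavior than Memorizer1. Even so, every caller still waits 2 s for its result, and all five threads reported a miss (未命中) and computed the same value, which defeats the purpose of a cache. The problem is that when one thread starts a computation, threads arriving later have no way to know it is in progress, so they repeat it. Can we tell later arrivals which computation is already under way?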
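The duplicated work can be counted directly. The sketch below is an illustration under assumptions, not the original test code: `DuplicateComputationDemo` and `countComputations` are hypothetical names, the value 42 and the delay are arbitrary, and a `CountDownLatch` is used only to release the threads together. It applies the same get-then-put pattern as Memorizer2 to one shared key and counts how often the slow path runs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class DuplicateComputationDemo {
    /** Returns how many times the slow path ran for a single shared key. */
    static int countComputations(int threads, long delayMillis) throws InterruptedException {
        Map<String, Integer> cache = new ConcurrentHashMap<>();
        AtomicInteger computations = new AtomicInteger();
        CountDownLatch startGate = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);

        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                try {
                    startGate.await();                  // release all threads together
                    Integer result = cache.get("key");  // check ...
                    if (result == null) {
                        computations.incrementAndGet();
                        Thread.sleep(delayMillis);      // stand-in for the slow computation
                        cache.put("key", 42);           // ... then act; not atomic with the check
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        startGate.countDown();
        done.await();
        return computations.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("computations: " + countComputations(5, 200));
    }
}
```

The exact count depends on thread scheduling, but with a delay much longer than thread start-up it will usually be greater than one: each individual map operation is thread-safe, yet the check and the put together are not atomic.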
Example 3
A Memorizer based on FutureTask
package com.imeiren.cache;

import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class Memorizer3<A, V> implements Computable<A, V> {
    // The cache now stores the Future of a computation, not its finished value
    private final Map<A, Future<V>> cache = new ConcurrentHashMap<A, Future<V>>();
    private final Computable<A, V> c;

    public Memorizer3(Computable<A, V> c) {
        this.c = c;
    }

    @Override
    public V compute(final A arg) throws InterruptedException {
        Future<V> f = cache.get(arg);
        if (f == null) {
            // "未命中" means "cache miss"
            System.out.println(Thread.currentThread() + " 未命中");
            Callable<V> eval = new Callable<V>() {
                @Override
                public V call() throws Exception {
                    return c.compute(arg);
                }
            };
            FutureTask<V> ft = new FutureTask<V>(eval);
            f = ft;
            cache.put(arg, f);
            ft.run(); // the computation happens here, in the calling thread
        }
        try {
            return f.get(); // blocks until the result is available
        } catch (ExecutionException e) {
            // Propagate the failure instead of printing it and returning null
            throw new IllegalStateException("Computation failed", e.getCause());
        }
    }
}

One run produced:
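Thread[pool-1-thread-1,5,main] 未命中
Thread[pool-1-thread-3,5,main] 未命中
Thread[pool-1-thread-2,5,main] 未命中
Thread[pool-1-thread-4,5,main] 未命中
Thread[pool-1-thread-5,5,main] 未命中
Thread[pool-1-thread-5,5,main]:2000420879ns
Thread[pool-1-thread-3,5,main]:2000829794ns
Thread[pool-1-thread-4,5,main]:2000550029ns
Thread[pool-1-thread-2,5,main]:2000832258ns
Thread[pool-1-thread-1,5,main]:2000916363ns

All five threads still performed the computation! The get followed by put is a non-atomic check-then-act sequence, so threads that arrive at the same instant all see an empty cache. To see how this version differs from the previous one, the main thread needs to sleep for less than 2 s between submissions, for example: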
ExecutorService executorService = Executors.newCachedThreadPool();
for (int i = 0; i < 5; i++) {
    executorService.execute(new Worker(memorizer));
    try {
        Thread.sleep(25);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
executorService.shutdown();

Here the sleep is 25 ms. One run produced:
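Thread[pool-1-thread-1,5,main] 未命中
Thread[pool-1-thread-1,5,main]:2000471202ns
Thread[pool-1-thread-3,5,main]:1951198750ns
Thread[pool-1-thread-2,5,main]:1976159785ns
Thread[pool-1-thread-5,5,main]:1901277032ns
Thread[pool-1-thread-4,5,main]:1926302817ns

This time only one thread performed the computation. The other four found its Future in the cache and waited for the result, which is why each later task finished about 25 ms sooner than the one submitted before it. That is a modest improvement over the previous version, but two threads that arrive at almost the same moment can still compute the same value twice. Is there a better way?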
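The sharing behavior that Memorizer3 depends on can be shown with `FutureTask` alone. This is a small sketch with hypothetical names (`SharedFutureDemo`, `run`) and an arbitrary result of 42: one thread runs the task, another thread only calls `get()`, and a counter records how many times the underlying callable executed.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedFutureDemo {
    /** Returns {value seen by the waiter, value seen by the runner, callable executions}. */
    static int[] run() throws InterruptedException, ExecutionException {
        AtomicInteger calls = new AtomicInteger();
        FutureTask<Integer> task = new FutureTask<>(() -> {
            calls.incrementAndGet();
            Thread.sleep(100); // stand-in for the slow computation
            return 42;
        });

        int[] seen = new int[1];
        Thread waiter = new Thread(() -> {
            try {
                seen[0] = task.get(); // blocks until the task has completed
            } catch (InterruptedException | ExecutionException e) {
                seen[0] = -1;
            }
        });
        waiter.start();

        task.run(); // the computation happens in the calling thread
        task.run(); // no effect: a completed FutureTask does not run again
        waiter.join();
        return new int[] { seen[0], task.get(), calls.get() };
    }

    public static void main(String[] args) throws Exception {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "42 42 1"
    }
}
```

Both threads observe the same result while the callable executes once, which is the property that lets a cached `Future` stand in for a computation that is still in progress.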
Example 4
The final implementation of Memorizer
package com.imeiren.cache;

import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class Memorizer<A, V> implements Computable<A, V> {
    private final ConcurrentHashMap<A, Future<V>> cache = new ConcurrentHashMap<A, Future<V>>();
    private final Computable<A, V> c;

    public Memorizer(Computable<A, V> c) {
        this.c = c;
    }

    @Override
    public V compute(final A arg) throws InterruptedException {
        while (true) {
            Future<V> f = cache.get(arg);
            if (f == null) {
                // "未命中" means "cache miss"
                System.out.println(Thread.currentThread() + " 未命中");
                Callable<V> eval = new Callable<V>() {
                    @Override
                    public V call() throws Exception {
                        return c.compute(arg);
                    }
                };
                FutureTask<V> ft = new FutureTask<V>(eval);
                // putIfAbsent returns the previous value associated with the key,
                // or null if there was no mapping for the key
                f = cache.putIfAbsent(arg, ft);
                if (f == null) {
                    // This thread won the race, so it runs the computation
                    System.out.println(Thread.currentThread() + " putIfAbsent return null");
                    f = ft;
                    ft.run();
                }
            }
            try {
                return f.get();
            } catch (CancellationException e) {
                // Remove the cancelled Future, then loop and try again
                cache.remove(arg, f);
            } catch (ExecutionException e) {
                // Rethrow rather than loop: the failed Future is still cached,
                // so retrying here would fail the same way forever
                throw new IllegalStateException("Computation failed", e.getCause());
            }
        }
    }
}

One run produced:
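Thread[pool-1-thread-1,5,main] 未命中
Thread[pool-1-thread-2,5,main] 未命中
Thread[pool-1-thread-3,5,main] 未命中
Thread[pool-1-thread-4,5,main] 未命中
Thread[pool-1-thread-5,5,main] 未命中
Thread[pool-1-thread-2,5,main] putIfAbsent return null
Thread[pool-1-thread-2,5,main]:2001221818ns
Thread[pool-1-thread-4,5,main]:2000969501ns
Thread[pool-1-thread-3,5,main]:2001063459ns
Thread[pool-1-thread-1,5,main]:2001429090ns
Thread[pool-1-thread-5,5,main]:2000874486ns

The main thread did not sleep between submissions. The pool started five threads, and all of them initially missed the cache (未命中), but the computation for a given argument was handed to exactly one thread; in this run it was thread 2, the only one for which putIfAbsent returned null. This avoids the repeated computation of the same value seen in the earlier examples. In addition, when a computation is cancelled, its Future is promptly removed from the cache, which solves the cache pollution problem.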
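The two map operations the final version relies on can be tried in isolation. The following single-threaded sketch uses hypothetical names (`PutIfAbsentDemo`, `demo`) and trivial callables; it only illustrates how `putIfAbsent` picks one winner and how the two-argument `remove` evicts a cancelled `Future`, and it is not a replacement for the class above.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class PutIfAbsentDemo {
    static String demo() throws InterruptedException, ExecutionException {
        ConcurrentHashMap<String, Future<Integer>> cache = new ConcurrentHashMap<>();
        FutureTask<Integer> first = new FutureTask<>(() -> 1);
        FutureTask<Integer> second = new FutureTask<>(() -> 2);

        // The first insertion finds no mapping, so putIfAbsent returns null
        boolean firstWon = cache.putIfAbsent("key", first) == null;
        // The second insertion loses and gets the existing Future back
        boolean secondLost = cache.putIfAbsent("key", second) == first;

        first.cancel(false); // cancel the cached task before it ever runs
        boolean cancelledSeen = false;
        try {
            cache.get("key").get();
        } catch (CancellationException e) {
            cancelledSeen = true;
            // Conditional remove: only evicts if the key still maps to this Future
            cache.remove("key", first);
        }
        boolean evicted = !cache.containsKey("key");
        return firstWon + " " + secondLost + " " + cancelledSeen + " " + evicted;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "true true true true"
    }
}
```

Using the two-argument `remove` matters in the concurrent case: if another thread has already replaced the cancelled entry with a fresh `Future`, the conditional form leaves that new entry alone.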