Ring Buffers and Striped Ring Buffers, Striped-RingBuffer (the most complete guide ever)

High-performance BoundedBuffer: the striped ring buffer

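The Caffeine source code relies on several high-performance data structures that are worth studying:

  • the striped ring buffer (a very fast, lock-free queue)
  • the MPSC queue (a very fast, lock-free queue)
  • the hierarchical timing wheel

This post covers the ring buffer and the striped ring buffer, Striped-RingBuffer.

The other two structures will be covered later in dedicated posts.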

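Advantages and core problem of CAS


With a blocking lock, acquiring and releasing the lock involves switching the process between user mode and kernel mode, that is, between user space and kernel space, which is very expensive.

Spinning on CAS to acquire a lock, by contrast, runs entirely in user mode with no user/kernel transition, which is why JVM lightweight locks are cheap. This is the advantage of CAS.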
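
As a minimal sketch of what "spinning on CAS" looks like, the class below builds a tiny spin lock on `AtomicBoolean.compareAndSet`. The names `SpinLockSketch`, `lock`, `unlock` and `countWithThreads` are made up for illustration, and this is not how the JVM implements lightweight locks internally; it only shows that a failed attempt simply retries in user code instead of parking the thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class SpinLockSketch {
    private final AtomicBoolean held = new AtomicBoolean(false);
    private int counter; // only touched while the spin lock is held

    void lock() {
        // Retry the CAS (false -> true) until it succeeds; a failed attempt
        // just loops again instead of blocking the thread.
        while (!held.compareAndSet(false, true)) {
            // busy-wait
        }
    }

    void unlock() {
        held.set(false); // volatile write publishes the protected update
    }

    /** Runs `threads` workers that each increment the counter `perThread` times. */
    static int countWithThreads(int threads, int perThread) {
        SpinLockSketch sketch = new SpinLockSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    sketch.lock();
                    try {
                        sketch.counter++;
                    } finally {
                        sketch.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        return sketch.counter;
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000));
    }
}
```

The busy-wait loop is also where the downside shows up: every thread that loses the race keeps burning CPU on retries.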

But everything has two sides.

What is the core problem of CAS?




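Besides wasted CAS spinning, on SMP CPU platforms a large volume of CAS operations can also cause a "bus storm"; see chapter 5 of 《Java高并发核心编程 卷2 加强版》 for details.

How can CAS performance be improved under high concurrency, or in other words, how can pathological CAS spinning be avoided?


  • spread out the hot spot
  • use queues to smooth out peaks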

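For example, for a counter that is only incremented, LongAdder can replace AtomicInteger.

This spreads out the hot spot by trading space for time, which is also the divide-and-conquer idea.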
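
The following sketch puts the two approaches side by side: every thread updates one shared `AtomicLong`, and the same threads update a `LongAdder`. The class name `HotSpotSketch` and the method `countBoth` are illustrative only; the point is that both end up with the same total, while `LongAdder` is free to spread contended updates over several internal cells.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

final class HotSpotSketch {
    /** Returns { AtomicLong total, LongAdder total } after concurrent increments. */
    static long[] countBoth(int threads, int perThread) {
        AtomicLong single = new AtomicLong();
        LongAdder striped = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    single.incrementAndGet(); // every thread hits the same memory word
                    striped.increment();      // contended threads may land on different cells
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        // sum() adds the base value and all cells together
        return new long[] { single.get(), striped.sum() };
    }

    public static void main(String[] args) {
        long[] totals = countBoth(4, 100_000);
        System.out.println("AtomicLong = " + totals[0] + ", LongAdder = " + totals[1]);
    }
}
```

The price of the striped variant is memory (one padded cell per contended slot) and a `sum()` that is not an atomic snapshot while writers are still running.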

Trading space for time: LongAdder and Striped64

Java 8 introduced a new class, LongAdder, which improves CAS performance under high concurrency by trading space for time.



Figure 3-10: LongAdder's operand "evolves" from a single value into an array

LongAdder extends Striped64, and the core source code lives in Striped64.

/**
 * A package-local class holding common representation and mechanics
 * for classes supporting dynamic striping on 64bit values. The class
 * extends Number so that concrete subclasses must publicly do so.
 */
abstract class Striped64 extends Number {

    /**
     * Padded variant of AtomicLong supporting only raw accesses plus CAS.
     * JVM intrinsics note: It would be possible to use a release-only
     * form of CAS here, if it were provided.
     */
    @sun.misc.Contended static final class Cell {
        volatile long value;
        Cell(long x) { value = x; }
        final boolean cas(long cmp, long val) {
            return UNSAFE.compareAndSwapLong(this, valueOffset, cmp, val);
        }

        // Unsafe mechanics
        private static final sun.misc.Unsafe UNSAFE;
        private static final long valueOffset;
        static {
            try {
                UNSAFE = sun.misc.Unsafe.getUnsafe();
                Class<?> ak = Cell.class;
                valueOffset = UNSAFE.objectFieldOffset
                    (ak.getDeclaredField("value"));
            } catch (Exception e) {
                throw new Error(e);
            }
        }
    }

    /** Number of CPUS, to place bound on table size */
    static final int NCPU = Runtime.getRuntime().availableProcessors();

    /**
     * Table of cells. When non-null, size is a power of 2.
     */
    transient volatile Cell[] cells;

    /**
     * Base value, used mainly when there is no contention, but also as
     * a fallback during table initialization races. Updated via CAS.
     */
    transient volatile long base;

    /**
     * Spinlock (locked via CAS) used when resizing and/or creating Cells.
     */
    transient volatile int cellsBusy;

    /**
     * Package-private default constructor
     */
    Striped64() {
    }

    // ... remaining fields and methods omitted
}

The source above is quite intricate; for a detailed walkthrough see 《Java高并发核心编程 卷2 加强版》.

Core source code of BoundedBuffer

/**
 * A striped, non-blocking, bounded buffer.
 *
 * @author [email protected] (Ben Manes)
 * @param <E> the type of elements maintained by this buffer
 */
final class BoundedBuffer<E> extends StripedBuffer<E>

It is a striped, non-blocking, bounded buffer that extends the StripedBuffer class.


/**
 * A base class providing the mechanics for supporting dynamic striping of bounded buffers. This
 * implementation is an adaption of the numeric 64-bit {@link java.util.concurrent.atomic.Striped64}
 * class, which is used by atomic counters. The approach was modified to lazily grow an array of
 * buffers in order to minimize memory usage for caches that are not heavily contended on.
 *
 * @author [email protected] (Doug Lea)
 * @author [email protected] (Ben Manes)
 */
abstract class StripedBuffer<E> implements Buffer<E>

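Architecture of StripedBuffer


  • spread out the hot spot
  • use queues to smooth out peaks



Each thread uses a per-thread value as the seed for its hash (the code below reads Thread's threadLocalRandomProbe field rather than the thread id), and the hash selects a slot in the table. In effect each thread gets its "own" RingBuffer, although two threads can still land on the same slot.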
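
To make the slot selection concrete, here is a small sketch of hashing a thread onto one of several stripes. It is not Caffeine's code: the `ThreadLocal` named `PROBE` is a hypothetical stand-in for the probe value that Caffeine reads through Unsafe, and `StripeSelectionSketch` with its methods exists only for this example. What it shares with the real thing is the shape of the index computation, `probe & (length - 1)` on a power-of-two table.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

final class StripeSelectionSketch {
    // Hypothetical stand-in for the per-thread probe; assigned once per thread.
    private static final ThreadLocal<Integer> PROBE =
        ThreadLocal.withInitial(() -> ThreadLocalRandom.current().nextInt());

    private final AtomicLongArray hits; // one counter per stripe

    StripeSelectionSketch(int stripes) {
        if (stripes <= 0 || Integer.bitCount(stripes) != 1) {
            throw new IllegalArgumentException("stripes must be a power of two");
        }
        hits = new AtomicLongArray(stripes);
    }

    /** Same shape as buffers[getProbe() & mask]; always in [0, stripes). */
    int stripeForCurrentThread() {
        int mask = hits.length() - 1;
        return PROBE.get() & mask;
    }

    /** Records one event on the calling thread's stripe. */
    void record() {
        hits.incrementAndGet(stripeForCurrentThread());
    }

    long total() {
        long sum = 0;
        for (int i = 0; i < hits.length(); i++) {
            sum += hits.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        StripeSelectionSketch sketch = new StripeSelectionSketch(8);
        sketch.record();
        System.out.println("stripe = " + sketch.stripeForCurrentThread()
            + ", total = " + sketch.total());
    }
}
```

Because the mask keeps only the low bits, a negative probe still maps to a valid index, and the same thread keeps hitting the same stripe until its probe changes.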




static final long TABLE_BUSY = UnsafeAccess.objectFieldOffset(StripedBuffer.class, "tableBusy");
static final long PROBE = UnsafeAccess.objectFieldOffset(Thread.class, "threadLocalRandomProbe");

/** Number of CPUS. */
static final int NCPU = Runtime.getRuntime().availableProcessors();

/** The bound on the table size. */
static final int MAXIMUM_TABLE_SIZE = 4 * ceilingNextPowerOfTwo(NCPU);

/** The maximum number of attempts when trying to expand the table. */
static final int ATTEMPTS = 3;

/** Table of buffers. When non-null, size is a power of 2. */
transient volatile Buffer<E> @Nullable[] table;

/** Spinlock (locked via CAS) used when resizing and/or creating Buffers. */
transient volatile int tableBusy;

/** CASes the tableBusy field from 0 to 1 to acquire lock. */
final boolean casTableBusy() {
  return UnsafeAccess.UNSAFE.compareAndSwapInt(this, TABLE_BUSY, 0, 1);
}

/**
 * Returns the probe value for the current thread. Duplicated from ThreadLocalRandom because of
 * packaging restrictions.
 */
static final int getProbe() {
  return UnsafeAccess.UNSAFE.getInt(Thread.currentThread(), PROBE);
}

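In the offer method, when the table has not been initialized yet or contention is detected, the table is created or doubled in size. Its size is capped at MAXIMUM_TABLE_SIZE, which is 4 times the smallest power of two that is not less than the number of CPU cores.

    /** The bound on the table size. */
    static final int MAXIMUM_TABLE_SIZE = 4 * ceilingPowerOfTwo(NCPU);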
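
As a worked example of that bound, the sketch below computes it for a few core counts. The helper `ceilingPowerOfTwo` here is written from a well-known bit trick and is only assumed to behave like the Caffeine helper of the same name, which this post does not show; `TableBoundSketch` and `maximumTableSize` are illustrative names.

```java
final class TableBoundSketch {
    /** Smallest power of two that is >= x, for 1 <= x <= 2^30. */
    static int ceilingPowerOfTwo(int x) {
        // The shift distance is taken modulo 32, so the negated count works as 32 - nlz.
        return 1 << -Integer.numberOfLeadingZeros(x - 1);
    }

    /** Mirrors the formula 4 * ceilingPowerOfTwo(NCPU) for a given core count. */
    static int maximumTableSize(int ncpu) {
        return 4 * ceilingPowerOfTwo(ncpu);
    }

    public static void main(String[] args) {
        for (int ncpu : new int[] { 1, 6, 8, 9 }) {
            System.out.println(ncpu + " cores -> at most "
                + maximumTableSize(ncpu) + " buffers");
        }
    }
}
```

With 6 cores the ceiling is 8, so the table stops growing at 32 buffers; since the table starts at length 1 and doubles, it reaches that bound after five expansions.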

The actual write is delegated to the RingBuffer's offer method, which appends the element at the tail of the RingBuffer.

public int offer(E e) {
  int mask;
  int result = 0;
  Buffer<E> buffer;
  boolean uncontended = true;
  Buffer<E>[] buffers = table;
  if ((buffers == null)
      || (mask = buffers.length - 1) < 0
      || (buffer = buffers[getProbe() & mask]) == null
      || !(uncontended = ((result = buffer.offer(e)) != Buffer.FAILED))) {
    expandOrRetry(e, uncontended);
  }
  return result;
}

/**
 * Handles cases of updates involving initialization, resizing, creating new Buffers, and/or
 * contention. See above for explanation. This method suffers the usual non-modularity problems of
 * optimistic retry code, relying on rechecked sets of reads.
 *
 * @param e the element to add
 * @param wasUncontended false if CAS failed before call
 */
final void expandOrRetry(E e, boolean wasUncontended) {
  int h;
  if ((h = getProbe()) == 0) {
    ThreadLocalRandom.current(); // force initialization
    h = getProbe();
    wasUncontended = true;
  }
  boolean collide = false; // True if last slot nonempty
  for (int attempt = 0; attempt < ATTEMPTS; attempt++) {
    Buffer<E>[] buffers;
    Buffer<E> buffer;
    int n;
    if (((buffers = table) != null) && ((n = buffers.length) > 0)) {
      if ((buffer = buffers[(n - 1) & h]) == null) {
        if ((tableBusy == 0) && casTableBusy()) { // Try to attach new Buffer
          boolean created = false;
          try { // Recheck under lock
            Buffer<E>[] rs;
            int mask, j;
            if (((rs = table) != null) && ((mask = rs.length) > 0)
                && (rs[j = (mask - 1) & h] == null)) {
              rs[j] = create(e);
              created = true;
            }
          } finally {
            tableBusy = 0;
          }
          if (created) {
            break;
          }
          continue; // Slot is now non-empty
        }
        collide = false;
      } else if (!wasUncontended) { // CAS already known to fail
        wasUncontended = true;      // Continue after rehash
      } else if (buffer.offer(e) != Buffer.FAILED) {
        break;
      } else if (n >= MAXIMUM_TABLE_SIZE || table != buffers) {
        collide = false; // At max size or stale
      } else if (!collide) {
        collide = true;
      } else if (tableBusy == 0 && casTableBusy()) {
        try {
          if (table == buffers) { // Expand table unless stale
            table = Arrays.copyOf(buffers, n << 1);
          }
        } finally {
          tableBusy = 0;
        }
        collide = false;
        continue; // Retry with expanded table
      }
      h = advanceProbe(h);
    } else if ((tableBusy == 0) && (table == buffers) && casTableBusy()) {
      boolean init = false;
      try { // Initialize table
        if (table == buffers) {
          @SuppressWarnings({"unchecked", "rawtypes"})
          Buffer<E>[] rs = new Buffer[1];
          rs[0] = create(e);
          table = rs;
          init = true;
        }
      } finally {
        tableBusy = 0;
      }
      if (init) {
        break;
      }
    }
  }
}







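Going clockwise, a ring buffer has a head and a tail.


The producer inserts elements at the tail, moving clockwise: head stays where it is while tail advances.


The consumer reads from the head: head advances while tail stays where it is. If the queue is full, nothing more can be written.


The positions of head and tail are not fixed; they keep moving around the ring, so the space gets reused.




The ring buffer below borrows its naming from the source code of Caffeine, the "king of caches".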

package com.crazymakercircle.queue;

public class SimpleRingBufferDemo {
    public static void main(String[] args) {

        SimpleRingBuffer queue = new SimpleRingBuffer(4);
        // Enqueue three sample values first; a capacity of 4 holds at most 3 elements
        queue.offer(1);
        queue.offer(2);
        queue.offer(3);
        System.out.println("queue = " + queue);
        int temp = queue.poll();
        System.out.println("temp = " + temp);
        System.out.println("queue = " + queue);
        temp = queue.poll();
        System.out.println("temp = " + temp);
        System.out.println("queue = " + queue);
        temp = queue.poll();
        System.out.println("temp = " + temp);
        System.out.println("queue = " + queue);
    }
}

class SimpleRingBuffer {
    private int maxSize;  // maximum capacity of the array
    private int head;     // named after the Caffeine source
    private int tail;     // named after the Caffeine source
    private int[] buffer; // backing array that stores the data

    public SimpleRingBuffer(int arrMaxSize) {
        maxSize = arrMaxSize;
        buffer = new int[maxSize];
    }

    public boolean isFull() {
        return (tail + 1) % maxSize == head;
    }

    public boolean isEmpty() {
        return tail == head;
    }

    public void offer(int n) {
        if (isFull()) {
            // Reject the write when the queue is full
            throw new RuntimeException("Queue is full; cannot add data");
        }
        buffer[tail] = n;
        tail = (tail + 1) % maxSize;
    }

    public int poll() {
        if (isEmpty()) {
            throw new RuntimeException("Queue is empty; cannot take data");
        }
        int value = buffer[head];
        head = (head + 1) % maxSize;
        return value;
    }

    public int size() {
        return (tail + maxSize - head) % maxSize;
    }

    @Override
    public String toString() {
        return String.format("head=%d , tail =%d\n", head, tail);
    }
}




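  1. By convention, head points at the first element of the queue.


  2. By convention, tail points at the slot just after the last element of the queue.


  3. The queue is full when:

    (tail + 1) % maxSize == head

  4. The queue is empty when:

    tail == head

  5. The number of elements in the queue is:

    (tail + maxSize - head) % maxSize

  6. Only maxSize - 1 slots can hold valid data.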



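When writing, the position after the current one is (tail + 1) % maxSize.


The queue is full when head points exactly at the position after tail, and that position is (tail + 1) % maxSize.

So when (tail + 1) % maxSize == head, the queue is full.


As for the empty case, when head and tail point at the same position, that is, head == tail, the queue is empty.

Because one slot must stay unused to tell "full" apart from "empty", the ring buffer holds at most maxSize - 1 valid elements.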
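
The three formulas can be checked on concrete positions. The sketch below restates them as static helpers (the class `RingIndexMath` and its method names are illustrative only) and evaluates them for maxSize = 4, including a case where tail has already wrapped around past the end of the array.

```java
final class RingIndexMath {
    static boolean isFull(int head, int tail, int maxSize) {
        return (tail + 1) % maxSize == head;
    }

    static boolean isEmpty(int head, int tail) {
        return tail == head;
    }

    static int size(int head, int tail, int maxSize) {
        // Adding maxSize keeps the value non-negative when tail has wrapped
        return (tail + maxSize - head) % maxSize;
    }

    public static void main(String[] args) {
        int maxSize = 4;
        // head=0, tail=3: slots 0..2 are used, slot 3 is the spare one
        System.out.println(isFull(0, 3, maxSize) + " " + size(0, 3, maxSize));
        // head=2, tail=1: tail wrapped around; slots 2, 3, 0 are used
        System.out.println(isFull(2, 1, maxSize) + " " + size(2, 1, maxSize));
        // head=2, tail=2: nothing stored
        System.out.println(isEmpty(2, 2) + " " + size(2, 2, maxSize));
    }
}
```

In both "full" cases the size comes out as 3, one less than maxSize, which is the spare slot at work.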

RingBuffer source code

In the Caffeine source, note that RingBuffer is a nested class of BoundedBuffer.

/** The maximum number of elements per buffer. */
static final int BUFFER_SIZE = 16;

// Assume 4-byte references and 64-byte cache line (16 elements per line)
static final int SPACED_SIZE = BUFFER_SIZE << 4;
static final int SPACED_MASK = SPACED_SIZE - 1;
static final int OFFSET = 16;
final AtomicReferenceArray<E> buffer;

public int offer(E e) {
  long head = readCounter;
  long tail = relaxedWriteCounter();
  long size = (tail - head);
  if (size >= SPACED_SIZE) {
    return Buffer.FULL;
  }
  if (casWriteCounter(tail, tail + OFFSET)) {
    int index = (int) (tail & SPACED_MASK);
    buffer.lazySet(index, e);
    return Buffer.SUCCESS;
  }
  return Buffer.FAILED;
}

public void drainTo(Consumer<E> consumer) {
  long head = readCounter;
  long tail = relaxedWriteCounter();
  long size = (tail - head);
  if (size == 0) {
    return;
  }
  do {
    int index = (int) (head & SPACED_MASK);
    E e = buffer.get(index);
    if (e == null) {
      // not published yet
      break;
    }
    buffer.lazySet(index, null);
    consumer.accept(e);
    head += OFFSET;
  } while (head != tail);
  lazySetReadCounter(head);
}

Note that the size of the ring buffer (fixed at 16 elements) never changes; only head and tail move.
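
To experiment with this behavior outside of Caffeine, here is a simplified, self-contained sketch of the same idea. It is not the library's class: `LossyRingBufferSketch` is a made-up name, the counters are plain `AtomicLong` fields instead of Caffeine's padded fields, and the numeric values of FULL, SUCCESS and FAILED are assumptions chosen for this example. Like the excerpt above, it assumes many producers but a single draining thread.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Consumer;

final class LossyRingBufferSketch<E> {
    // Result codes; the numeric values are assumptions made for this sketch
    static final int FULL = 1, SUCCESS = 0, FAILED = -1;

    static final int BUFFER_SIZE = 16;               // elements per buffer
    static final int OFFSET = 16;                    // slots skipped between elements
    static final int SPACED_SIZE = BUFFER_SIZE << 4; // 256 slots in total
    static final int SPACED_MASK = SPACED_SIZE - 1;

    private final AtomicReferenceArray<E> buffer = new AtomicReferenceArray<>(SPACED_SIZE);
    private final AtomicLong writeCounter = new AtomicLong(); // claimed by producers via CAS
    private final AtomicLong readCounter = new AtomicLong();  // advanced by the one consumer

    /** Tries once; the caller may simply drop the element on FULL or FAILED. */
    int offer(E e) {
        long head = readCounter.get();
        long tail = writeCounter.get();
        if (tail - head >= SPACED_SIZE) {
            return FULL;
        }
        if (writeCounter.compareAndSet(tail, tail + OFFSET)) {
            buffer.lazySet((int) (tail & SPACED_MASK), e);
            return SUCCESS;
        }
        return FAILED; // another producer claimed the slot first
    }

    /** Hands every published element to the consumer, oldest first. */
    void drainTo(Consumer<E> consumer) {
        long head = readCounter.get();
        long tail = writeCounter.get();
        while (head != tail) {
            int index = (int) (head & SPACED_MASK);
            E e = buffer.get(index);
            if (e == null) {
                break; // slot claimed but not published yet
            }
            buffer.lazySet(index, null);
            consumer.accept(e);
            head += OFFSET;
        }
        readCounter.lazySet(head);
    }

    public static void main(String[] args) {
        LossyRingBufferSketch<String> ring = new LossyRingBufferSketch<>();
        ring.offer("a");
        ring.offer("b");
        ring.drainTo(System.out::println);
    }
}
```

The counters only ever grow; the mask turns them into array positions 0, 16, 32 and so on up to 240, so the 16 elements are spread over 256 slots to keep neighbors apart in memory.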

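Characteristics of Striped-RingBuffer

In summary, Striped-RingBuffer has the following characteristics:

  • it stripes the buffer across several RingBuffers to speed up reads and writes
  • it uses a per-thread hash to avoid contention on a hot spot
  • it allows writes to be lost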


Guava Cache home page: https://github.com/google/guava/wiki/CachesExplained





Caffeine: https://github.com/ben-manes/caffeine

Related post: https://albenw.github.io/posts/df42dc84/

Benchmarks: https://github.com/ben-manes/caffeine/wiki/Benchmarks
