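Recently I needed to implement rate limiting at the gateway layer. Because the limit has to be managed uniformly across a whole cluster inside one data center, Redis will probably be needed, and Spring Cloud's own Hystrix is not enough (Hystrix only works on a single machine). That means implementing the rate-limiting logic ourselves, so I looked into the token bucket algorithm, one of the more mainstream approaches today, and one of its implementations, Guava RateLimiter.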
Token Bucket
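The token bucket algorithm is commonly used for traffic shaping and rate limiting in networks. There is also the leaky bucket algorithm, which I won't go into here. In general, a token bucket caps the number of requests per unit of time while still allowing bursts to a certain extent.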
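The simplest way to hold QPS at a stable rate is to store the timestamp of the last granted request and make sure no other request gets in during the following 1/QPS seconds. For example, with QPS=5, if we guarantee that no request is granted within 200ms of the previous one, we have implemented the limit. If a request arrives 100ms after the last granted request, all we need to do is make it wait another 100ms. By the same logic, 15 concurrent requests take about 3 seconds in total.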
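The timestamp approach above can be sketched as follows. This is a minimal illustration and not Guava code: the class name `FixedIntervalLimiter` and its `reserve` method are hypothetical, and the current time is passed in explicitly (in microseconds) so the behaviour is easy to follow without a real clock.

```java
/** Hypothetical sketch: enforces a gap of 1/QPS between granted requests. */
final class FixedIntervalLimiter {
    private final long intervalMicros;      // 1/QPS expressed in microseconds
    private long lastGrantedMicros = 0L;    // when the last request was (or will be) granted
    private boolean grantedBefore = false;  // false until the first request has been seen

    FixedIntervalLimiter(double qps) {
        this.intervalMicros = (long) (1_000_000L / qps);
    }

    /**
     * Records a grant for a request arriving at nowMicros and returns how many
     * microseconds the caller has to wait before proceeding.
     */
    synchronized long reserve(long nowMicros) {
        long grantAt = grantedBefore
                ? Math.max(nowMicros, lastGrantedMicros + intervalMicros)
                : nowMicros; // the very first request is granted immediately
        grantedBefore = true;
        lastGrantedMicros = grantAt;
        return grantAt - nowMicros;
    }
}
```

With QPS=5, a request arriving 100ms after the previous grant gets a wait of 100,000 microseconds. In this sketch the first request is granted immediately, so the last of 15 simultaneous requests waits 14 × 200ms = 2.8 seconds, which matches the "about 3 seconds" estimate above.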
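The permits granted to a request can come from two sources:
- stored permits (permits saved up from the past; they become available whenever the limiter has been underutilized)
- fresh permits (whatever part of a request the stored permits cannot cover has to be served by newly produced, fresh permits)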
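Suppose we have a rate limiter that produces one permit per second, i.e. QPS=1. For every second in which no request arrives, storedPermits grows by 1. If no request has arrived in the past 10 seconds, storedPermits grows to 10 (assuming maxStoredPermits >= 10). Now a request arrives asking for 3 permits: we can serve it directly from storedPermits, which drops to 7. Right after that, another request arrives asking for 10 permits: we take the remaining 7 permits from storedPermits, and for the other 3 we have to wait for the limiter to produce 3 fresh permits before the request is fully covered.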
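Since QPS=1, we know it takes 3 seconds to get the 3 fresh permits. But how long should it take to get the 7 stored permits? There is no single right answer. If we want to make up quickly for past underutilization, we want stored permits to be handed out faster than fresh permits, because underutilization means there are idle resources available. If our main concern is overflow, stored permits should be handed out more slowly than fresh permits. So we need a function that maps stored permits to a throttling wait time, and that function is storedPermitsToWaitTime. (The details get fairly involved, so I won't go deeper here. For SmoothBursty, the variant most of us use, this function always returns 0, i.e. stored permits are granted immediately.)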
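The arithmetic in this example can be written out as a small sketch. The class and method names below (`PermitSplitExample`, `storedPermitsToWaitMicros`, `costMicros`) are hypothetical and only mirror the idea described above; the stored-permit cost is fixed at 0 on the assumption that we are modelling the SmoothBursty behaviour.

```java
/** Hypothetical sketch: splits a request into stored and fresh permits and prices it. */
final class PermitSplitExample {
    /** Time charged for stored permits; always 0 here, as described for SmoothBursty. */
    static long storedPermitsToWaitMicros(double storedPermits, double permitsToTake) {
        return 0L;
    }

    /**
     * Returns the time, in microseconds, that a request for requiredPermits costs
     * when storedPermits are available and one fresh permit takes stableIntervalMicros.
     */
    static long costMicros(double storedPermits, int requiredPermits, double stableIntervalMicros) {
        double storedToSpend = Math.min(requiredPermits, storedPermits);
        double freshPermits = requiredPermits - storedToSpend;
        return storedPermitsToWaitMicros(storedPermits, storedToSpend)
                + (long) (freshPermits * stableIntervalMicros);
    }
}
```

With QPS=1 (a stable interval of 1,000,000 microseconds), `costMicros(10, 3, 1_000_000)` is 0 because the request is covered entirely by stored permits, while `costMicros(7, 10, 1_000_000)` is 3,000,000: the 7 stored permits are free and the 3 fresh ones cost one second each. A different policy would only need a different `storedPermitsToWaitMicros`.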
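This design has one very important consequence: the limiter does not record the time of the most recent request, it records the expected time at which the next request can be granted. That lets us decide whether a request can obtain its permits within a given timeout. It also gives us a simple way to measure how long the limiter has been unused: once the expected time lies before the current time, the difference between the two is the idle duration, and that duration can be converted into stored permits (as described earlier, storedPermits grows with idle time).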
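Both conclusions can be expressed as two small pure functions. This is a sketch under my own naming (`NextFreeTicketExample`, `permitsAfterIdle`, `grantedWithinTimeout` are hypothetical), with all times given in microseconds.

```java
/** Hypothetical sketch of what can be derived from the "next free ticket" timestamp. */
final class NextFreeTicketExample {
    /** Stored permits after crediting the idle time between nextFreeTicketMicros and nowMicros. */
    static double permitsAfterIdle(double storedPermits, double maxPermits,
                                   long nextFreeTicketMicros, long nowMicros,
                                   double stableIntervalMicros) {
        if (nowMicros <= nextFreeTicketMicros) {
            return storedPermits; // the limiter is still busy: no idle time to credit
        }
        double earned = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
        return Math.min(maxPermits, storedPermits + earned);
    }

    /** True if a request arriving at nowMicros would be granted within timeoutMicros. */
    static boolean grantedWithinTimeout(long nextFreeTicketMicros, long nowMicros, long timeoutMicros) {
        return nextFreeTicketMicros - timeoutMicros <= nowMicros;
    }
}
```

For instance, at QPS=1 a limiter whose expected time lies 5 seconds in the past has earned 5 permits (capped at the bucket size), and a limiter whose expected time lies 3 seconds in the future rejects a request with a 2-second timeout but accepts one with a 3-second timeout.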
/** The currently stored permits. */
double storedPermits;

/** The maximum number of stored permits, i.e. the bucket size. */
double maxPermits;

/**
 * The interval between two unit requests, at our stable rate. E.g., a stable rate of 5 permits
 * per second has a stable interval of 200ms.
 */
double stableIntervalMicros;

/**
 * The time when the next request (no matter its size) will be granted. After granting a
 * request, this is pushed further in the future. Large requests push this further than small
 * requests.
 */
private long nextFreeTicketMicros = 0L; // could be either in the past or future

/**
 * The underlying timer; used both to measure elapsed time and sleep as necessary. A separate
 * object to facilitate testing.
 */
private final SleepingStopwatch stopwatch;

// Can't be initialized in the constructor because mocks don't call the constructor.
// This mutex must not be used directly; always go through mutex().
private volatile Object mutexDoNotUseDirectly;

private Object mutex() {
  Object mutex = mutexDoNotUseDirectly;
  if (mutex == null) {
    synchronized (this) {
      mutex = mutexDoNotUseDirectly;
      if (mutex == null) {
        mutexDoNotUseDirectly = mutex = new Object();
      }
    }
  }
  return mutex;
}
/**
 * Acquires a single permit from this {@code RateLimiter}, blocking until the
 * request can be granted. Tells the amount of time slept, if any.
 * This method is equivalent to {@code acquire(1)}.
 * @return time spent sleeping to enforce rate, in seconds; 0.0 if not rate-limited
 * @since 16.0 (present in 13.0 with {@code void} return type})
 */
public double acquire() {
  return acquire(1);
}

/**
 * Acquires the given number of permits from this {@code RateLimiter}, blocking until the
 * request can be granted. Tells the amount of time slept, if any.
 * @param permits the number of permits to acquire
 * @return time spent sleeping to enforce rate, in seconds; 0.0 if not rate-limited
 * @throws IllegalArgumentException if the requested number of permits is negative or zero
 * @since 16.0 (present in 13.0 with {@code void} return type})
 */
public double acquire(int permits) {
  long microsToWait = reserve(permits);
  // Block until the reserved permits become available.
  stopwatch.sleepMicrosUninterruptibly(microsToWait);
  return 1.0 * microsToWait / SECONDS.toMicros(1L);
}

/**
 * Reserves the given number of permits from this {@code RateLimiter} for future use, returning
 * the number of microseconds until the reservation can be consumed.
 * @return time in microseconds to wait until the resource can be acquired, never negative
 */
final long reserve(int permits) {
  checkPermits(permits); // rejects zero or negative permit counts
  synchronized (mutex()) {
    return reserveAndGetWaitLength(permits, stopwatch.readMicros());
  }
}

/**
 * Reserves next ticket and returns the wait time that the caller must wait for.
 * @return the required wait time, never negative
 */
final long reserveAndGetWaitLength(int permits, long nowMicros) {
  long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
  return max(momentAvailable - nowMicros, 0);
}
/**
 * Updates {@code storedPermits} and {@code nextFreeTicketMicros} based on the current time.
 * Called on every request to bring the limiter's permit count up to date.
 */
void resync(long nowMicros) {
  // If nextFreeTicket is in the past, resync to now.
  // The next grant time (in microseconds) lying before the current time means the limiter
  // has been idle for a while, so the permits produced during that idle period must be
  // credited to storedPermits. Otherwise requests have kept arriving and nothing changes.
  if (nowMicros > nextFreeTicketMicros) {
    // storedPermits is capped at maxPermits; the amount credited depends on the
    // length of the idle period (nowMicros - nextFreeTicketMicros).
    storedPermits = min(maxPermits,
        storedPermits
            + (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros());
    nextFreeTicketMicros = nowMicros;
  }
}

/** Returns the interval at which permits are added: 1 / QPS seconds, i.e. 1,000,000 / QPS microseconds. */
double coolDownIntervalMicros() {
  return stableIntervalMicros;
}

/** In this scenario stored permits are granted immediately, so there is no wait time: returns 0. */
long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
  return 0L;
}

final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
  // Bring the token bucket up to date first.
  resync(nowMicros);
  long returnValue = nextFreeTicketMicros;
  double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
  // Fresh permits that still have to be waited for.
  double freshPermits = requiredPermits - storedPermitsToSpend;
  // Total wait = wait for the stored permits + wait for the fresh permits,
  // where the latter is (permit interval * number of fresh permits).
  long waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend)
      + (long) (freshPermits * stableIntervalMicros);
  // Push nextFreeTicketMicros forward; this is what makes reservations possible.
  try {
    this.nextFreeTicketMicros = LongMath.checkedAdd(nextFreeTicketMicros, waitMicros);
  } catch (ArithmeticException e) {
    this.nextFreeTicketMicros = Long.MAX_VALUE;
  }
  // Deduct the stored permits that were spent.
  this.storedPermits -= storedPermitsToSpend;
  return returnValue;
}
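Most of my explanation of this code is already in the comments. One thing worth pointing out: going by the textbook description of the token bucket, I kept expecting a process that inserts tokens on a timer, but there is no extra thread or synchronization machinery for that. Guava instead refills the bucket lazily, on demand. Each time a request arrives, it inserts the permits that have accrued and updates the other state such as nextFreeTicketMicros. This uses fewer threads, saves resources, and keeps the logic simple. The work is done in resync.
Also note that Guava's implementation supports reserving permits: when the limiter is idle and a request for a large number of permits arrives, the limiter can pre-grant enough permits for it to run immediately and push back the wait time of subsequent requests (as described earlier). This is why nowMicros < nextFreeTicketMicros can occur. It means the limiter is still paying off the pre-grant made to an earlier request, so storedPermits must not be updated. Otherwise nowMicros >= nextFreeTicketMicros, and any idle time is credited as stored permits.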
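To see lazy refill and reservation working together, here is a stripped-down sketch modelled on the code above. It is not the Guava class: `LazyTokenBucket` is a hypothetical name, the stored-permit wait is assumed to be 0 (the SmoothBursty behaviour), overflow handling is omitted, and the current time is passed in as a parameter instead of being read from a stopwatch.

```java
/** Hypothetical sketch of a lazily refilled token bucket with reservation. */
final class LazyTokenBucket {
    private final double maxPermits;            // bucket size
    private final double stableIntervalMicros;  // time to produce one permit
    private double storedPermits = 0.0;
    private long nextFreeTicketMicros = 0L;     // when the next request will be granted

    LazyTokenBucket(double permitsPerSecond, double maxBurstSeconds) {
        this.stableIntervalMicros = 1_000_000.0 / permitsPerSecond;
        this.maxPermits = maxBurstSeconds * permitsPerSecond;
    }

    /** Credits permits for idle time; does nothing while an earlier reservation is being paid off. */
    private void resync(long nowMicros) {
        if (nowMicros > nextFreeTicketMicros) {
            double earned = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
            storedPermits = Math.min(maxPermits, storedPermits + earned);
            nextFreeTicketMicros = nowMicros;
        }
    }

    /** Reserves permits and returns how long the caller must wait, in microseconds. */
    synchronized long reserve(int permits, long nowMicros) {
        resync(nowMicros);
        long grantedAt = nextFreeTicketMicros;  // this request is granted at the old value
        double storedToSpend = Math.min(permits, storedPermits);
        double freshPermits = permits - storedToSpend;
        // The cost of the fresh permits is charged to whoever comes next.
        nextFreeTicketMicros += (long) (freshPermits * stableIntervalMicros);
        storedPermits -= storedToSpend;
        return Math.max(grantedAt - nowMicros, 0L);
    }
}
```

Replaying the earlier QPS=1 example at t=10s: `reserve(3, ...)` returns 0 and leaves 7 stored permits; `reserve(10, ...)` also returns 0, because the request is pre-granted and pushes nextFreeTicketMicros to t=13s; a third `reserve(1, ...)` at the same instant then has to wait 3 seconds. No background thread is involved at any point.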
/**
 * Acquires a permit from this {@code RateLimiter} if it can be obtained
 * without exceeding the specified {@code timeout}, or returns {@code false}
 * immediately (without waiting) if the permit would not have been granted
 * before the timeout expired.
 * This method is equivalent to {@code tryAcquire(1, timeout, unit)}.
 * @param timeout the maximum time to wait for the permit. Negative values are treated as zero.
 * @param unit the time unit of the timeout argument
 * @return {@code true} if the permit was acquired, {@code false} otherwise
 * @throws IllegalArgumentException if the requested number of permits is negative or zero
 */
public boolean tryAcquire(long timeout, TimeUnit unit) {
  return tryAcquire(1, timeout, unit);
}

/**
 * Acquires permits from this {@link RateLimiter} if it can be acquired immediately without delay.
 * This method is equivalent to {@code tryAcquire(permits, 0, anyUnit)}.
 * @param permits the number of permits to acquire
 * @return {@code true} if the permits were acquired, {@code false} otherwise
 * @throws IllegalArgumentException if the requested number of permits is negative or zero
 * @since 14.0
 */
public boolean tryAcquire(int permits) {
  return tryAcquire(permits, 0, MICROSECONDS);
}

/**
 * Acquires a permit from this {@link RateLimiter} if it can be acquired immediately without
 * delay.
 * This method is equivalent to {@code tryAcquire(1)}.
 * @return {@code true} if the permit was acquired, {@code false} otherwise
 * @since 14.0
 */
public boolean tryAcquire() {
  return tryAcquire(1, 0, MICROSECONDS);
}

/**
 * Acquires the given number of permits from this {@code RateLimiter} if it can be obtained
 * without exceeding the specified {@code timeout}, or returns {@code false}
 * immediately (without waiting) if the permits would not have been granted
 * before the timeout expired.
 * @param permits the number of permits to acquire
 * @param timeout the maximum time to wait for the permits. Negative values are treated as zero.
 * @param unit the time unit of the timeout argument
 * @return {@code true} if the permits were acquired, {@code false} otherwise
 * @throws IllegalArgumentException if the requested number of permits is negative or zero
 */
public boolean tryAcquire(int permits, long timeout, TimeUnit unit) {
  long timeoutMicros = max(unit.toMicros(timeout), 0);
  checkPermits(permits); // rejects zero or negative permit counts
  long microsToWait;
  synchronized (mutex()) {
    long nowMicros = stopwatch.readMicros();
    if (!canAcquire(nowMicros, timeoutMicros)) {
      return false;
    } else {
      microsToWait = reserveAndGetWaitLength(permits, nowMicros);
    }
  }
  // Sleep until the permits become available.
  stopwatch.sleepMicrosUninterruptibly(microsToWait);
  return true;
}

private boolean canAcquire(long nowMicros, long timeoutMicros) {
  return queryEarliestAvailable(nowMicros) - timeoutMicros <= nowMicros;
}

final long queryEarliestAvailable(long nowMicros) {
  return nextFreeTicketMicros;
}
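That wraps up this walkthrough of the Guava RateLimiter code and its rate-limiting logic, which focuses on the SmoothBursty implementation. I hope you find it useful. A follow-up will probably need to look at building a similar mechanism that works across multiple machines.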