Java实现数值型ID生成器


基于Twitter ID 生成策略

  1. 每秒能生成几十万条 ID

ID 生成要以一种非协作的(uncoordinated)的方式进行,例如不能利用一个全局的原子变量。

  1. ID 大致有序,就是说生成时间相近的两个ID,它们的值也应当相近

    按 ID 排序后满足 k-sorted 条件。如果序列 A 要满足 k-sorted,当且仅当对于任意的 p, q,如果 1 <= p <= q - k (1 <= p <= q <= n),则有 A[p] <= A[q]。换句话说,如果元素 p 排在 q 前面,且相差至少 k 个位置,那么 p 必然小于或等于 q。如果ID序列满足这个条件,要获取第 r 条ID之后的记录,只要从第 r - k 条开始查找即可。

  2. 解决方案
    Twitter 解决这两个问题的方案非常简单高效:每一个 ID 都是 64 位数字,由时间戳、节点号和序列编号组成。其中序列编号是每个节点本地生成的序号,而节点号则由 ZooKeeper 维护。

  3. 实现代码

/**
 * @author zhujuan
 * From: https://github.com/twitter/snowflake
 * An object that generates IDs.
 * This is broken into a separate class in case
 * we ever want to support multiple worker threads
 * per process
 */
public class IdWorker {
     
    protected static final Logger LOG = LoggerFactory.getLogger(IdWorker.class);
     
    private long workerId;
    private long datacenterId;
    private long sequence = 0L;
 
    private long workerIdBits = 5L;
    private long datacenterIdBits = 5L;
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    private long sequenceBits = 12L;
 
    private long workerIdShift = sequenceBits;
    private long datacenterIdShift = sequenceBits + workerIdBits;
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    private long sequenceMask = -1L ^ (-1L << sequenceBits);
 
    private long lastTimestamp = -1L;
 
    public IdWorker(long workerId, long datacenterId) {
        // sanity check for workerId
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
        LOG.info(String.format("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d", timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId));
    }
 
    public synchronized long nextId() {
        long timestamp = timeGen();
 
        if (timestamp < lastTimestamp) {
            LOG.error(String.format("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp));
            throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
 
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
 
        lastTimestamp = timestamp;
 
        return (timestamp << timestampLeftShift) | (datacenterId << datacenterIdShift) | (workerId << workerIdShift) | sequence;
    }
 
}

基于Instagram 的ID生成策略

  1. 生成的ID可以按时间排序
    与Twitter需求大致相同

  2. ID最好是64bit的
    为了索引更小且方便存储在像Redis这样的系统中

  3. 按照某种用户标识进行逻辑分片

  4. 解决方案

  • 41bits 存储毫秒格式的时间
  • 10bits 表示逻辑分片ID
    原方案是13bits(最多8192个逻辑分片),这里为了与基于Twitter的策略保持大致一致,改成了10bits
  • 12bits 存储自增序列值
    原方案是10bits(最多1024个序列),这里为了与基于Twitter的策略保持大致一致,改成了12bits(最多4096个序列)
  1. 代码实现
/**
 * 
 * @author Mr_Shang
 * 
 * @version 1.0
 *
 */
public class InstaIdGenerator {

 protected static final Logger LOG = LoggerFactory.getLogger(IdWorker.class);

 /**
  * 时间戳的位数,实际占41位,最高位保持为0,保证long值为正数
  */
 private int timestampBitCount = 42;

 /**
  * 逻辑分片位数
  */
 private int regionBitCount = 10;

 /**
  * 逻辑分片的最大数量
  */
 private int regionModelVal = 1 << regionBitCount;

 /**
  * 序列位数
  */
 private int sequenceBitCount = 12;

 /**
  * 总的位数
  */
 private int totalBitCount = timestampBitCount + regionBitCount + sequenceBitCount;

 /**
  * 当前序列值
  */
 private long sequence = 0;

 /**
  * 最后一次请求时间戳
  */
 private long lastTimestamp = -1L;

 /**
  * 序列的位板
  */
 private long sequenceMask = -1L ^ (-1L << sequenceBitCount);
 
 /**
  * 最后一次请求用户标识
  */
 private long lastTag=1L;

 public InstaIdGenerator() {

 }

 public InstaIdGenerator(long seq) {
  if (seq < 0) {
   seq = 0;
  }
  this.sequence = seq;
 }

 public synchronized long nextId(long tag) {
  long timestamp = timeGen();

  if(tag<0){
   tag=-tag;
  }
  
  if (timestamp < lastTimestamp) {
   LOG.error(String.format("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp));
   throw new RuntimeException(String.format(
     "Clock moved backwards.  Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
  }

  if (lastTimestamp == timestamp) {
   sequence = (sequence + 1) & sequenceMask;
   if (sequence == 0) {
    timestamp = tilNextMillis(lastTimestamp);
   }
  } else {
   sequence = 0L;
  }
  
  if(tag==lastTag){
   sequence = (sequence + 1) & sequenceMask;
   if (sequence == 0) {
    timestamp = tilNextMillis(lastTimestamp);
   }
  }
  lastTag=tag;
  
  lastTimestamp = timestamp;

  return (timestamp << (totalBitCount - timestampBitCount))
    | ((tag % regionModelVal) << (totalBitCount - timestampBitCount - regionBitCount)) | sequence;
 }
}

完整源码地址

项目已经托管到github,飞机在这里☞ 数值型id生成器

参考内容

  • Snowflake的Java实现
  • Twitter Snowflake
  • [Instagram 的ID生成策略[翻译]

你可能感兴趣的:(Java实现数值型ID生成器)