Snowflake is a network service for generating unique ID numbers at high scale with some simple guarantees.
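In a large ordering business, we often need to generate a globally unique order number. How can unique order numbers be generated efficiently across multiple clusters and multiple nodes? We took Twitter's Snowflake algorithm as a reference.
Snowflake was originally developed at Twitter in Scala. Twitter has to handle tens of thousands of messages per second and assign each one a globally unique ID, so the ID generation service must offer high performance (>10K ids/s), low latency (<2ms), and high availability. The generated IDs should also be roughly sortable, which makes sorting easier for clients.
Snowflake meets these requirements. Every ID it generates is a 64-bit integer, and its core algorithm is simple and efficient. The structure is as follows:
A 41-bit timestamp with millisecond precision; 41 bits are enough for about 69 years. The timestamp bits also serve another important purpose: IDs can be sorted by time.
A 10-bit machine identifier; 10 bits support deploying up to 1024 nodes.
A 12-bit sequence number, i.e. an auto-incrementing counter that lets one node generate several IDs within the same millisecond; 12 bits allow 4096 IDs per node per millisecond.
The highest bit is the sign bit; it is always 0 and cannot be used.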
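The capacity figures in the layout above follow directly from the bit widths. The sketch below checks that arithmetic; the class and method names (`SnowflakeLayout`, `maxValue`, `yearsCovered`) are made up for illustration and are not part of Snowflake itself.

```java
final class SnowflakeLayout {
    // Bit widths as described in the text above.
    static final int SIGN_BITS = 1;
    static final int TIMESTAMP_BITS = 41;
    static final int NODE_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    /** Largest value that fits in the given number of bits. */
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    /** Approximate number of years a millisecond counter of this width can cover. */
    static double yearsCovered(int bits) {
        double millisPerYear = 365.25 * 24 * 60 * 60 * 1000;
        return (1L << bits) / millisPerYear;
    }

    public static void main(String[] args) {
        int total = SIGN_BITS + TIMESTAMP_BITS + NODE_BITS + SEQUENCE_BITS;
        System.out.println("total bits: " + total);                                  // 64
        System.out.println("nodes: " + (maxValue(NODE_BITS) + 1));                   // 1024
        System.out.println("ids per node per ms: " + (maxValue(SEQUENCE_BITS) + 1)); // 4096
        System.out.println("timestamp years: " + yearsCovered(TIMESTAMP_BITS));      // a little under 70
    }
}
```

At 4096 IDs per node per millisecond, a single node tops out at roughly four million IDs per second, well above the >10K ids/s requirement quoted earlier.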
/**
 * Taken from a blog found online; I no longer remember the address.
 * @Project concurrency
 * Created by wgy on 16/7/19.
 */
public class IdGen {
    private long workerId;
    private long datacenterId;
    private long sequence = 0L;
    private long twepoch = 1288834974657L; // Thu, 04 Nov 2010 01:42:54 GMT
    private long workerIdBits = 5L; // width of the worker (node) ID
    private long datacenterIdBits = 5L; // width of the data center ID
    private long maxWorkerId = -1L ^ (-1L << workerIdBits); // largest worker ID: 31 (IDs 0..31, 32 in total)
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // largest data center ID: 31 (IDs 0..31, 32 in total)
    private long sequenceBits = 12L; // the sequence number is 12 bits wide
    private long workerIdShift = sequenceBits; // worker ID is shifted left by 12 bits
    private long datacenterIdShift = sequenceBits + workerIdBits; // data center ID is shifted left by 17 bits
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; // timestamp is shifted left by 22 bits
    private long sequenceMask = -1L ^ (-1L << sequenceBits); // 4095
    private long lastTimestamp = -1L;

    private static class IdGenHolder {
        private static final IdGen instance = new IdGen();
    }

    public static IdGen get() {
        return IdGenHolder.instance;
    }

    public IdGen() {
        this(0L, 0L);
    }

    public IdGen(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public synchronized long nextId() {
        long timestamp = timeGen(); // current time in milliseconds
        // If the server clock has moved backwards, refuse to generate an ID.
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format(
                    "Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
        // Same millisecond as the previous ID.
        if (lastTimestamp == timestamp) {
            // Increment the sequence; it is only 12 bits wide, so AND with sequenceMask to drop the high bits.
            sequence = (sequence + 1) & sequenceMask;
            // Overflow check: once the counter passes 4095 and reaches 4096, the mask wraps it back to 0.
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp); // spin until the next millisecond
            }
        } else {
            sequence = 0L; // a new millisecond has started, so the sequence restarts from 0
        }
        lastTimestamp = timestamp;
        // Finally assemble the ID according to the layout:
        // | 1 bit sign (0) | 41 bits time | 5 bits datacenterId | 5 bits workerId | 12 bits sequence |
        return ((timestamp - twepoch) << timestampLeftShift) | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift) | sequence;
    }

    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    protected long timeGen() {
        return System.currentTimeMillis();
    }
}
The comments are already fairly detailed, so no further explanation is given here.
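One way to check the bit layout is to take an ID apart again. The following is a minimal sketch that reverses the final expression of nextId(), assuming the same constants as IdGen (twepoch, 12 sequence bits, 5 worker bits, 5 data center bits). `SnowflakeIdDecoder` and its methods are hypothetical helpers, not part of the original class.

```java
final class SnowflakeIdDecoder {
    // Constants copied from the IdGen listing; adjust them if the generator changes.
    static final long TWEPOCH = 1288834974657L;
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_ID_BITS = 5;
    static final int DATACENTER_ID_BITS = 5;
    static final int WORKER_ID_SHIFT = SEQUENCE_BITS;
    static final int DATACENTER_ID_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS;
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS + DATACENTER_ID_BITS;

    /** A mask with the lowest {@code bits} bits set. */
    static long mask(int bits) {
        return ~(-1L << bits);
    }

    /** Same expression as the return statement of IdGen.nextId(). */
    static long compose(long timestampMillis, long datacenterId, long workerId, long sequence) {
        return ((timestampMillis - TWEPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_ID_SHIFT)
                | (workerId << WORKER_ID_SHIFT)
                | sequence;
    }

    static long timestampMillis(long id) {
        return (id >> TIMESTAMP_SHIFT) + TWEPOCH;
    }

    static long datacenterId(long id) {
        return (id >> DATACENTER_ID_SHIFT) & mask(DATACENTER_ID_BITS);
    }

    static long workerId(long id) {
        return (id >> WORKER_ID_SHIFT) & mask(WORKER_ID_BITS);
    }

    static long sequence(long id) {
        return id & mask(SEQUENCE_BITS);
    }

    public static void main(String[] args) {
        // Arbitrary sample values: a millisecond timestamp, data center 3, worker 17, sequence 42.
        long id = compose(1468886400000L, 3L, 17L, 42L);
        System.out.println("id         = " + id);
        System.out.println("timestamp  = " + timestampMillis(id));
        System.out.println("datacenter = " + datacenterId(id));
        System.out.println("worker     = " + workerId(id));
        System.out.println("sequence   = " + sequence(id));
    }
}
```

Because the timestamp occupies the highest usable bits, comparing two IDs as plain long values orders them by generation time first, which is where the rough sortability comes from.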
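For an ordering business, the creation time of an order can be recorded separately, but the order number itself is usually expected to carry an explicit timestamp. A single long can no longer meet this need, so the output is changed to a String: the first 17 characters hold the time in yyyyMMddHHmmssSSS format, and the remainder records the cluster (data center), the node, and the incrementing sequence.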
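The 17-character time prefix depends on getting the pattern letters right: in Java date patterns `mm` is minutes while `MM` is the month, so a pattern that writes `MM` in the minutes position silently repeats the month and loses the minute. The sketch below shows the difference using the JDK's `SimpleDateFormat` pinned to UTC; `TimestampPrefixDemo` is an illustrative name, and the real generator may use a different formatter and time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class TimestampPrefixDemo {
    /** Formats a millisecond timestamp with the given pattern, in UTC for reproducibility. */
    static String format(long millis, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(millis));
    }

    public static void main(String[] args) {
        long millis = 1468924245123L; // 2016-07-19 10:30:45.123 UTC

        // Correct: year, month, day, hour, minute, second, millisecond.
        System.out.println(format(millis, "yyyyMMddHHmmssSSS")); // 20160719103045123

        // Wrong: the second "MM" prints the month (07) where the minute (30) belongs.
        System.out.println(format(millis, "yyyyMMddHHMMssSSS")); // 20160719100745123
    }
}
```

With the wrong pattern, two IDs created in different minutes of the same hour but with the same second, millisecond, and sequence would get identical prefixes, which undermines uniqueness.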
import org.apache.commons.lang.time.DateFormatUtils;
import java.net.InetAddress;
import java.net.UnknownHostException;

/**
 * Unlike the Snowflake algorithm, this returns a String ID. It takes more bytes,
 * but the generation time can be read directly from the ID.
 *
 * @Project concurrency
 * Created by wgy on 16/7/19.
 */
public enum IdGenerator {
    INSTANCE;

    private long workerId; // identified by the last byte of the IP address
    private long datacenterId = 0L; // could be configured in a properties file and loaded at startup; defaults to 0 here
    private long sequence = 0L;
    private long workerIdBits = 8L; // width of the worker (node) ID
    private long datacenterIdBits = 2L; // width of the data center ID; size it according to actual needs
    private long sequenceBits = 12L; // the sequence number is 12 bits wide
    private long workerIdShift = sequenceBits; // worker ID is shifted left by 12 bits
    private long datacenterIdShift = sequenceBits + workerIdBits; // data center ID is shifted left by 20 bits
    private long sequenceMask = -1L ^ (-1L << sequenceBits); // 4095
    private long lastTimestamp = -1L;

    IdGenerator() {
        workerId = 0x000000FF & getLastIP();
    }

    public synchronized String nextId() {
        long timestamp = timeGen(); // current time in milliseconds
        // If the server clock has moved backwards, refuse to generate an ID.
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format(
                    "Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
        // Same millisecond as the previous ID.
        if (lastTimestamp == timestamp) {
            // Increment the sequence; it is only 12 bits wide, so AND with sequenceMask to drop the high bits.
            sequence = (sequence + 1) & sequenceMask;
            // Overflow check: once the counter passes 4095 and reaches 4096, the mask wraps it back to 0.
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp); // spin until the next millisecond
            }
        } else {
            sequence = 0L; // a new millisecond has started, so the sequence restarts from 0
        }
        lastTimestamp = timestamp;
        long suffix = (datacenterId << datacenterIdShift) | (workerId << workerIdShift) | sequence;
        // "mm" is minutes; the original pattern used "MM" (month) in the minutes position.
        String datePrefix = DateFormatUtils.format(timestamp, "yyyyMMddHHmmssSSS");
        return datePrefix + suffix;
    }

    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    protected long timeGen() {
        return System.currentTimeMillis();
    }

    // Returns the last byte of the local IP address, or 0 if the host cannot be resolved.
    private byte getLastIP() {
        byte lastip = 0;
        try {
            InetAddress ip = InetAddress.getLocalHost();
            byte[] ipByte = ip.getAddress();
            lastip = ipByte[ipByte.length - 1];
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
        return lastip;
    }
}
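A String ID produced this way can be split back into its parts: the first 17 characters are the time prefix, and the decimal suffix packs the data center, worker, and sequence using the widths from IdGenerator (2, 8, and 12 bits). The following is a rough sketch under those assumptions; `OrderIdParts` and its methods are hypothetical names introduced only for this illustration.

```java
final class OrderIdParts {
    // Widths copied from the IdGenerator listing; adjust them if the generator changes.
    static final int PREFIX_LENGTH = 17; // yyyyMMddHHmmssSSS
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_ID_BITS = 8;
    static final int DATACENTER_ID_BITS = 2;

    /** The leading 17 characters: the formatted generation time. */
    static String timePrefix(String id) {
        return id.substring(0, PREFIX_LENGTH);
    }

    /** Everything after the time prefix, parsed as a decimal number. */
    static long suffix(String id) {
        return Long.parseLong(id.substring(PREFIX_LENGTH));
    }

    static long sequence(String id) {
        return suffix(id) & ((1L << SEQUENCE_BITS) - 1);
    }

    static long workerId(String id) {
        return (suffix(id) >> SEQUENCE_BITS) & ((1L << WORKER_ID_BITS) - 1);
    }

    static long datacenterId(String id) {
        return (suffix(id) >> (SEQUENCE_BITS + WORKER_ID_BITS)) & ((1L << DATACENTER_ID_BITS) - 1);
    }

    public static void main(String[] args) {
        // Hand-built sample: data center 1, worker 200, sequence 7.
        long sampleSuffix = (1L << 20) | (200L << 12) | 7L; // 1867783
        String id = "20160719103045123" + sampleSuffix;

        System.out.println("time prefix = " + timePrefix(id));   // 20160719103045123
        System.out.println("datacenter  = " + datacenterId(id)); // 1
        System.out.println("worker      = " + workerId(id));     // 200
        System.out.println("sequence    = " + sequence(id));     // 7
    }
}
```

Note that the suffix is not zero-padded, so IDs vary in length; the fixed-width prefix is what keeps the split unambiguous.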
With 10 threads, each generating 50,000 IDs, the run takes about 2000 ms. The test code is as follows:
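import org.junit.Assert;
import org.junit.Test;

import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// The test class name is illustrative; the original shows only the method.
public class IdGeneratorTest {
    @Test
    public void testNextId() throws Exception {
        final IdGenerator idg = IdGenerator.INSTANCE;
        ExecutorService es = Executors.newFixedThreadPool(10);
        // Keep the synchronized wrapper; the original discarded the result of
        // Collections.synchronizedCollection(idSet), which made that call a no-op.
        final Set<String> idSet = Collections.synchronizedSet(new HashSet<String>());
        long start = System.currentTimeMillis();
        System.out.println("***** start generate id ******");
        for (int i = 0; i < 10; i++) {
            es.execute(new Runnable() {
                public void run() {
                    for (int j = 0; j < 50000; j++) {
                        idSet.add(idg.nextId());
                    }
                }
            });
        }
        es.shutdown();
        es.awaitTermination(10, TimeUnit.SECONDS);
        long end = System.currentTimeMillis();
        System.out.println("***** end generate id ******");
        System.out.println("***** cost " + (end - start) + " ms!");
        // Every generated ID must be distinct.
        Assert.assertEquals(10 * 50000, idSet.size());
    }
}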
***** start generate id ******
***** end generate id ******
***** cost 2091 ms!