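ZooKeeper is a distributed, open-source coordination service for distributed applications. Higher-level services for data synchronization, configuration maintenance, service grouping, and service naming can be built on top of it. ZooKeeper is designed for ease of use and simplicity, and its data model resembles the directory tree of a file system. Coordination services are prone to errors such as race conditions and deadlocks. The motivation behind ZooKeeper is to relieve distributed applications of the responsibility of implementing coordination services from scratch.
ZooKeeper design goals:
Simple / reliable / ordered / fast (for reads)
ZooKeeper data model:
Nodes come in four types: persistent, persistent sequential, ephemeral, and ephemeral sequential. Each node holds node attributes, an ACL, a quota, data, a TTL, and so on.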
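Open question: is a transaction reliable?
curator.inTransaction().create().forPath("/t2").and().create().forPath("/t4").and().create().forPath("/t3").and().commit();
On the ZooKeeper server, the request passes through PrepRequestProcessor --> SyncRequestProcessor --> FinalRequestProcessor. PrepRequestProcessor pre-validates every operation in the transaction, so in theory the transaction is reliable (if the server crashes abruptly the outcome cannot be predicted, since the snapshot may not have been taken yet).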
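1. The client sends a write request to a Follower.
2. The Follower forwards the request to the Leader.
3. The Leader receives the request, starts a vote, and notifies the Followers to vote.
4. Each Follower sends its vote back to the Leader.
5. The Leader tallies the votes; if the write should go ahead, it performs the write, notifies the Followers of the write operation, and then commits.
6. The Follower returns the result to the client.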
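The vote-counting step above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `QuorumWrite` class and its method names are hypothetical and not part of ZooKeeper, and the sketch shows nothing more than the majority rule a leader applies before committing.

```java
import java.util.List;

/** Hypothetical model of the leader's majority rule; not ZooKeeper source code. */
final class QuorumWrite {
    /** A proposal may commit once more than half of the ensemble has acknowledged it. */
    static boolean hasQuorum(int acks, int ensembleSize) {
        return acks > ensembleSize / 2;
    }

    /** Counts the followers' votes plus the leader's own and decides whether to commit. */
    static boolean commit(List<Boolean> followerAcks) {
        int ensembleSize = followerAcks.size() + 1; // followers plus the leader
        int acks = 1;                               // the leader's own vote
        for (boolean ack : followerAcks) {
            if (ack) {
                acks++;
            }
        }
        return hasQuorum(acks, ensembleSize);
    }

    public static void main(String[] args) {
        // 5 servers: leader + 2 of 4 followers agree, 3 of 5 is a majority.
        System.out.println(commit(List.of(true, true, false, false)));
        // 5 servers: leader + 1 of 4 followers agree, 2 of 5 is not a majority.
        System.out.println(commit(List.of(true, false, false, false)));
    }
}
```

The point of the majority rule is that any two quorums overlap in at least one server, so a committed write is always known to some member of the next quorum.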
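Registering watchers on the same node several times still yields a single event: one WatchedEvent arrives for the change and the client hands that same event to each registered Watcher object (the same Watcher object registered twice is invoked only once). Watches are one-shot, so they must be registered again after they fire.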
curator.getData().usingWatcher(new Watcher() {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeDataChanged) {
System.out.println(watchedEvent.hashCode() + ": accept data changed event1!!!");
}
}
}).forPath("/t1");
curator.getData().usingWatcher(new Watcher() {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeDataChanged) {
System.out.println(watchedEvent.hashCode() + ": accept data changed event2!!!");
}
}
}).forPath("/t1");
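The one-shot behaviour can be illustrated without a ZooKeeper server. The sketch below is a hypothetical in-memory model (`OneShotWatches` is a made-up class, not the ZooKeeper or Curator client), assuming watchers are kept in a set per path and dropped once they have fired.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical in-memory model of one-shot watches; not the ZooKeeper client. */
final class OneShotWatches {
    private final Map<String, Set<Consumer<String>>> watches = new HashMap<>();

    /** Registering the same watcher object twice keeps a single entry (set semantics). */
    void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new LinkedHashSet<>()).add(watcher);
    }

    /** Notifies every watcher on the path once, then forgets them; returns how many ran. */
    int fire(String path) {
        Set<Consumer<String>> registered = watches.remove(path);
        if (registered == null) {
            return 0;
        }
        for (Consumer<String> watcher : registered) {
            watcher.accept(path);
        }
        return registered.size();
    }

    public static void main(String[] args) {
        OneShotWatches model = new OneShotWatches();
        List<String> log = new ArrayList<>();
        Consumer<String> w1 = p -> log.add("w1 " + p);
        Consumer<String> w2 = p -> log.add("w2 " + p);
        model.watch("/t1", w1);
        model.watch("/t1", w1); // duplicate registration of the same object
        model.watch("/t1", w2);
        System.out.println(model.fire("/t1")); // first change: w1 and w2 each run once
        System.out.println(model.fire("/t1")); // second change: nothing is registered any more
        System.out.println(log);
    }
}
```

To keep receiving notifications, a watcher has to register itself again inside its callback, which is what the watcher classes later in these notes do.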
Event propagation (for example: when /t1/t11 is deleted, /t1 receives a children-changed event, but the root node / receives nothing).
curator.getData().usingWatcher(new Watcher() {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeDeleted) {
System.out.println(watchedEvent.hashCode() + ": /t1/t11 node deleted!!!");
}
}
}).forPath("/t1/t11");
curator.getChildren().usingWatcher(new Watcher() {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeChildrenChanged) {
System.out.println(watchedEvent.hashCode() + ": /t1 children changed !!!");
}
}
}).forPath("/t1");
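(1) Session timeout
Ephemeral nodes are deleted only once the session has timed out; a dropped connection alone does not remove them.
(2) Connection timeout
// register a ZooKeeper connection state listener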
zkUtils.getCuratorFramework().getConnectionStateListenable().addListener(new SessionConnectionListener());
class SessionConnectionListener implements ConnectionStateListener {
@Override
public void stateChanged(CuratorFramework client, ConnectionState newState) {
if (newState == ConnectionState.LOST) {
log.warn("connection lost, start to repair connection!!!");
while (true) {
try {
if (client.getZookeeperClient().blockUntilConnectedOrTimedOut()) {
log.warn("connection reconnected!!!");
register();
break;
}
} catch (InterruptedException e) {
break;
} catch (Exception e) {
// log instead of silently swallowing the failure, then retry
log.error(e.getMessage(), e);
}
}
}
}
}
private SnowFlake getSnowFlake(BusinessType resourceType) {
if (machineId == MACHINE_ID_REBOOT_V) {
throw new RuntimeException("zk id base path is deleted, please fix the issue and reboot the service!!!");
}
if (machineId == MACHINE_ID_RETRY_V) {
log.info("try to re-register machineId");
synchronized (this) {
machineId = getMachineId();
if (machineId == MACHINE_ID_RETRY_V) {
throw new RuntimeException("machineId register fail");
}
}
}
String cacheKey = String.format("%d-%d-%d", this.dataCenterId, machineId, resourceType.getValue());
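SnowFlake cached = snowFlakeMap.get(cacheKey);
if (cached != null) {
return cached;
}
synchronized (this) {
// re-check under the lock so two threads never create separate generators for one key
SnowFlake snowFlake = snowFlakeMap.get(cacheKey);
if (snowFlake == null) {
snowFlake = new SnowFlake(this.dataCenterId, machineId, (long) resourceType.getValue());
snowFlakeMap.put(cacheKey, snowFlake);
}
return snowFlake;
}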
}
private long getMachineId() {
try {
zkUtils.getCuratorFramework().getData().usingWatcher(new BasePathWatcher()).forPath(zkIdBasePath);
} catch (Exception e) {
log.error(e.getMessage(), e);
}
String ipAddr = NetUtil.getIp();
if (ipAddr == null) {
log.error("get host ip fail!");
return MACHINE_ID_RETRY_V;
}
for (String invalidIp : invalidIps) {
if (invalidIp.equals(ipAddr)) {
log.error("hostname and ip address are not configured correctly!!!!");
return MACHINE_ID_RETRY_V;
}
}
String currentNodeZkPath = zkIdBasePath + "/" + ipAddr;
// step 1: try to read the machine id this node registered earlier
long machineId = getZkMachineId(ipAddr);
if (machineId != MACHINE_ID_RETRY_V) {
log.info("machineId existed :{}", machineId);
return machineId;
}
// step 2: nothing registered before, so register a new machine id
try {
String path = zkUtils.getCuratorFramework().create().creatingParentsIfNeeded()
.withMode(CreateMode.PERSISTENT_SEQUENTIAL).forPath(currentNodeZkPath);
zkUtils.getCuratorFramework().getData().usingWatcher(new IdNodePathWatcher()).forPath(path);
machineId = Long.valueOf(path.substring(currentNodeZkPath.length()));
log.info("registered new machineId :{}", machineId);
return machineId;
} catch (Exception e) {
log.error(e.getMessage(), e);
}
return MACHINE_ID_RETRY_V;
}
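ZooKeeper appends a 10-digit, zero-padded counter to the name of a sequential node, which is why the method above can recover the machine id by stripping the path prefix. A minimal sketch of that parsing step follows; the `SequentialSuffix` helper and the example paths are hypothetical and only illustrate the idea.

```java
/** Hypothetical helper: extracts the counter appended to sequential node names. */
final class SequentialSuffix {
    private static final int SUFFIX_LENGTH = 10; // the counter is formatted as %010d

    /** Returns the sequence number, or -1 if createdPath is not prefix + 10 digits. */
    static long parse(String prefix, String createdPath) {
        if (!createdPath.startsWith(prefix)
                || createdPath.length() != prefix.length() + SUFFIX_LENGTH) {
            return -1;
        }
        String suffix = createdPath.substring(prefix.length());
        for (int i = 0; i < suffix.length(); i++) {
            if (!Character.isDigit(suffix.charAt(i))) {
                return -1;
            }
        }
        return Long.parseLong(suffix);
    }

    public static void main(String[] args) {
        // Made-up base path and IP address, for illustration only.
        System.out.println(parse("/ids/10.0.0.1", "/ids/10.0.0.10000000007"));
        // A longer IP that shares the prefix is rejected instead of being misparsed.
        System.out.println(parse("/ids/10.0.0.1", "/ids/10.0.0.120000000003"));
    }
}
```

Checking the suffix length matters when node names are IP addresses, because one address can be a string prefix of another.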
private class IdNodePathWatcher implements Watcher {
@Override
public void process(WatchedEvent watchedEvent) {
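boolean shouldResetMachineId = false;
switch (watchedEvent.getState()) {
case Expired:
case AuthFailed:
case Disconnected:
shouldResetMachineId = true;
break;
case SyncConnected:
if (watchedEvent.getType() == EventType.NodeDeleted) {
shouldResetMachineId = true;
}
break;
default:
break;
}
if (shouldResetMachineId && machineId != MACHINE_ID_REBOOT_V) {
machineId = MACHINE_ID_RETRY_V;
}
// connection-state events carry no path, and a deleted node cannot be watched again via getData()
if (watchedEvent.getPath() == null || watchedEvent.getType() == EventType.NodeDeleted) {
return;
}
try {
// watches are one-shot, so register again to keep receiving events
zkUtils.getCuratorFramework().getData().usingWatcher(new IdNodePathWatcher())
.forPath(watchedEvent.getPath());
} catch (Exception e) {
log.error(e.getMessage(), e);
}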
}
}
private class BasePathWatcher implements Watcher {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeDeleted) {
machineId = MACHINE_ID_REBOOT_V;
}
try {
zkUtils.getCuratorFramework().getData().usingWatcher(new BasePathWatcher())
.forPath(watchedEvent.getPath());
} catch (Exception e) {
log.error(e.getMessage(), e);
}
}
}
public long getZkMachineId(String nodePath) {
try {
List<String> childrenPaths = zkUtils.getCuratorFramework().getChildren().forPath(zkIdBasePath);
if (CollectionUtils.isEmpty(childrenPaths)) {
return MACHINE_ID_RETRY_V;
}
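for (String currNodePath : childrenPaths) {
// sequential nodes end with a 10-digit counter; the length check keeps an IP such as
// 10.0.0.1 from matching the node registered by 10.0.0.12
if (!currNodePath.startsWith(nodePath) || currNodePath.length() != nodePath.length() + 10) {
continue;
}
return Long.parseLong(currNodePath.substring(nodePath.length()));
}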
} catch (Exception e) {
log.error(e.getMessage(), e);
}
return MACHINE_ID_RETRY_V;
}
@Component
@Slf4j
public class OwnerCoordinator {
@Autowired
private ZkUtils zkUtils;
// 0: pending, 1: holding ownership, 2: waiting; volatile because the watcher thread also writes it
private volatile byte ownerStatus = 0;
private final String zkOwnerPath = "/cleaner/match-task-owner";
public boolean eligible() {
if (ownerStatus == 0) {
try {
zkUtils.getCuratorFramework().create().creatingParentsIfNeeded().withMode(CreateMode.EPHEMERAL)
.forPath(zkOwnerPath, NetUtil.getHostname().getBytes());
ownerStatus = 1;
return true;
} catch (Throwable throwable) {
ownerStatus = 2;
try {
zkUtils.getCuratorFramework().getData().usingWatcher(new NodeReleaseWatcher()).forPath(zkOwnerPath);
} catch (Exception e) {
log.error(e.getMessage(), e);
}
return false;
}
} else if (ownerStatus == 1) {
return true;
} else if (ownerStatus == 2) {
return false;
}
return false;
}
class NodeReleaseWatcher implements Watcher {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeDeleted
|| watchedEvent.getState() == KeeperState.Disconnected) {
log.info("has node offline!!!!, change this node to pending schedule");
ownerStatus = 0;
}
}
}
}
There are 6 shards in total and 2 machines; each machine handles part of them:
Node1:
Node2:
class RootNodeWatcher implements Watcher {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeChildrenChanged) {
// reassign routes across nodes; all nodes stop working (data already in the queue is still processed)
List<String> nodePaths = null;
try {
nodePaths = zkUtils.getCuratorFramework().getChildren().forPath(zkNodesRoot);
Map<String, List<Byte>> routesMap = new HashMap<>();
if (!CollectionUtils.isEmpty(nodePaths)) {
for (String nodePath : nodePaths) {
routesMap.put(nodePath, new ArrayList<>());
}
}
int pathIdx = 1;
byte routeIdx = 0;
int avgRoutesCount = shardingNum / routesMap.size();
int appendRoutesCount = shardingNum % routesMap.size();
for (String nodePath : routesMap.keySet()) {
List<Byte> routes = routesMap.get(nodePath);
for (int i = 0; i < avgRoutesCount; i++) {
routes.add(routeIdx++);
}
if (pathIdx++ <= appendRoutesCount) {
routes.add(routeIdx++);
}
}
for (String nodePath : routesMap.keySet()) {
String routeData = StringUtils.join(routesMap.get(nodePath), ",");
zkUtils.getCuratorFramework().setData()
.forPath(zkNodesRoot + "/" + nodePath, routeData.getBytes());
}
} catch (Exception e) {
log.error(e.getMessage(), e);
}
}
try {
zkUtils.getCuratorFramework().getChildren().usingWatcher(new RootNodeWatcher()).forPath(zkNodesRoot);
} catch (Exception e) {
log.error(e.getMessage(), e);
}
}
}
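The split performed in RootNodeWatcher can be tried out in isolation. The sketch below is a hypothetical stand-alone version (`ShardAssigner` is a made-up class that takes the node list as a parameter instead of reading it from ZooKeeper): every node gets shardingNum / n shards, and the first shardingNum % n nodes get one extra.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone version of the route split done in RootNodeWatcher. */
final class ShardAssigner {
    /** Assigns shard indexes 0..shardingNum-1 to the nodes in contiguous blocks. */
    static Map<String, List<Integer>> assign(int shardingNum, List<String> nodes) {
        Map<String, List<Integer>> routes = new LinkedHashMap<>();
        if (nodes.isEmpty()) {
            return routes; // nothing to assign, and avoids dividing by zero
        }
        int avg = shardingNum / nodes.size();
        int extra = shardingNum % nodes.size();
        int next = 0;
        for (int i = 0; i < nodes.size(); i++) {
            List<Integer> owned = new ArrayList<>();
            int count = avg + (i < extra ? 1 : 0); // the first `extra` nodes take one more
            for (int j = 0; j < count; j++) {
                owned.add(next++);
            }
            routes.put(nodes.get(i), owned);
        }
        return routes;
    }

    public static void main(String[] args) {
        System.out.println(assign(6, List.of("Node1", "Node2")));
        System.out.println(assign(7, List.of("Node1", "Node2", "Node3")));
    }
}
```

With 6 shards and 2 nodes this yields shards 0 to 2 for the first node and 3 to 5 for the second. The sketch iterates the nodes in list order; the watcher above iterates a HashMap key set, so which node receives which block is not fixed there.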
class NodeWatcher implements Watcher {
@Override
public void process(WatchedEvent watchedEvent) {
if (watchedEvent.getType() == EventType.NodeDataChanged) {
// routes were reassigned; all nodes stop working (data already in the queue is still processed)
try {
byte[] data = zkUtils.getCuratorFramework().getData().usingWatcher(new NodeWatcher())
.forPath(currNodePath);
if (data == null || data.length == 0) {
log.info("update {} route info to empty!", currNodePath);
routes.clear();
return;
}
String routeInfo = new String(data);
String[] routesStr = routeInfo.split(",");
routes.clear();
for (String route : routesStr) {
routes.add(Byte.valueOf(route));
}
log.info("update {} route info to :{}", currNodePath, routeInfo);
} catch (Exception e) {
log.error(e.getMessage(), e);
}
} else {
try {
zkUtils.getCuratorFramework().getData().usingWatcher(new NodeWatcher()).forPath(currNodePath);
} catch (Exception e) {
log.error(e.getMessage(), e);
}
}
}
}