最简单的实现服务高可用的方法就是集群化,也就是分布式部署,但是分布式部署会带来一些问题。比如:
1、各个实例之间的协同(锁)
2、负载均衡
3、热删除
这里通过一个简单的实例来说明如何解决注册发现和负载均衡。
1、先解决依赖,这里只给出zk相关的依赖,pom.xml如下
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
<version>3.4.8</version>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>2.9.1</version>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-client</artifactId>
<version>2.9.1</version>
</dependency>
2、ZkClient
这里使用的是curator,curator是对zookeeper的简单封装,提供了一些集成的方法,或者是提供了更优雅的api,举例来说
zk的create(path, mode, acl, data)方法 == curator create().withMode(mode).forPath(path)调用链
import org.apache.curator.RetryPolicy;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.List;
public class ZkClient {
private final Logger logger = LoggerFactory.getLogger(this.getClass());
private CuratorFramework client;
private String zookeeperServer;
private int sessionTimeoutMs;
private int connectionTimeoutMs;
private int baseSleepTimeMs;
private int maxRetries;
public void setZookeeperServer(String zookeeperServer) {
this.zookeeperServer = zookeeperServer;
}
public String getZookeeperServer() {
return zookeeperServer;
}
public void setSessionTimeoutMs(int sessionTimeoutMs) {
this.sessionTimeoutMs = sessionTimeoutMs;
}
public int getSessionTimeoutMs() {
return sessionTimeoutMs;
}
public void setConnectionTimeoutMs(int connectionTimeoutMs) {
this.connectionTimeoutMs = connectionTimeoutMs;
}
public int getConnectionTimeoutMs() {
return connectionTimeoutMs;
}
public void setBaseSleepTimeMs(int baseSleepTimeMs) {
this.baseSleepTimeMs = baseSleepTimeMs;
}
public int getBaseSleepTimeMs() {
return baseSleepTimeMs;
}
public void setMaxRetries(int maxRetries) {
this.maxRetries = maxRetries;
}
public int getMaxRetries() {
return maxRetries;
}
public void init() {
RetryPolicy retryPolicy = new ExponentialBackoffRetry(baseSleepTimeMs, maxRetries);
client = CuratorFrameworkFactory.builder().connectString(zookeeperServer).retryPolicy(retryPolicy)
.sessionTimeoutMs(sessionTimeoutMs).connectionTimeoutMs(connectionTimeoutMs).build();
client.start();
}
public void stop() {
client.close();
}
public CuratorFramework getClient() {
return client;
}
public void register() {
try {
String rootPath = "/" + "services";
String hostAddress = InetAddress.getLocalHost().getHostAddress();
String serviceInstance = "prometheus" + "-" + hostAddress + "-";
client.create().creatingParentsIfNeeded().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(rootPath + "/" + serviceInstance);
} catch (Exception e) {
logger.error("注册出错", e);
}
}
public List<String> getChildren(String path) {
List<String> childrenList = new ArrayList<>();
try {
childrenList = client.getChildren().forPath(path);
} catch (Exception e) {
logger.error("获取子节点出错", e);
}
return childrenList;
}
public int getChildrenCount(String path) {
return getChildren(path).size();
}
public List<String> getInstances() {
return getChildren("/services");
}
public int getInstancesCount() {
return getInstances().size();
}
}
2、configuration如下
import com.dqa.prometheus.client.zookeeper.ZkClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class ZkConfiguration {
@Value("${zookeeper.server}")
private String zookeeperServer;
@Value(("${zookeeper.sessionTimeoutMs}"))
private int sessionTimeoutMs;
@Value("${zookeeper.connectionTimeoutMs}")
private int connectionTimeoutMs;
@Value("${zookeeper.maxRetries}")
private int maxRetries;
@Value("${zookeeper.baseSleepTimeMs}")
private int baseSleepTimeMs;
@Bean(initMethod = "init", destroyMethod = "stop")
public ZkClient zkClient() {
ZkClient zkClient = new ZkClient();
zkClient.setZookeeperServer(zookeeperServer);
zkClient.setSessionTimeoutMs(sessionTimeoutMs);
zkClient.setConnectionTimeoutMs(connectionTimeoutMs);
zkClient.setMaxRetries(maxRetries);
zkClient.setBaseSleepTimeMs(baseSleepTimeMs);
return zkClient;
}
}
配置文件如下
#============== zookeeper ===================
zookeeper.server=10.93.21.21:2181,10.93.18.34:2181,10.93.18.35:2181
zookeeper.sessionTimeoutMs=6000
zookeeper.connectionTimeoutMs=6000
zookeeper.maxRetries=3
zookeeper.baseSleepTimeMs=1000
3、注册发现
是通过上面封装的ZkClient中的register方法实现的,调用如下。
import com.dqa.prometheus.client.zookeeper.ZkClient;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.orm.jpa.EntityScan;
import org.springframework.context.ApplicationContext;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.annotation.EnableScheduling;
@SpringBootApplication
@EnableAsync
@EnableScheduling
@EntityScan(basePackages="com.xiaoju.dqa.prometheus.model")
public class Application {
public static void main(String[] args) {
ApplicationContext context = SpringApplication.run(Application.class, args);
ZkClient zkClient = context.getBean(ZkClient.class);
zkClient.register();
}
}
1、zk中的注册路径
/services/prometheus-10.93.21.21-00000000001
2、CreateMode有四种,选择EPHEMERAL_SEQUENTIAL的原因是,服务关闭的时候session超时,zk节点会自动删除,同时自增id可以实现锁和负载均衡,下面再说
1、PERSISTENT
持久化目录节点,存储的数据不会丢失。
2、PERSISTENT_SEQUENTIAL
顺序自动编号的持久化目录节点,存储的数据不会丢失,并且根据当前已近存在的节点数自动加 1,然后返回给客户端已经成功创建的目录节点名。
3、EPHEMERAL
临时目录节点,一旦创建这个节点的客户端与服务器端口也就是session 超时,这种节点会被自动删除。
4、EPHEMERAL_SEQUENTIAL
临时自动编号节点,一旦创建这个节点的客户端与服务器端口也就是session 超时,这种节点会被自动删除,并且根据当前已近存在的节点数自动加 1,然后返回给客户端已经成功创建的目录节点名。
复制代码
4、负载均衡
/*
* 我是第几个实例, 做负载均衡
* */
List<String> instanceList = zkClient.getInstances();
Collections.sort(instanceList);
String hostAddress = NetFunction.getAddressHost();
int instanceNo = 0;
if (hostAddress != null) {
for (int i=0; i<instanceList.size(); i++) {
if (instanceList.get(i).split("-")[1].equals(hostAddress)) {
instanceNo = i;
}
}
} else {
logger.info("获取本地IP失败");
}
logger.info("[分发] 实例总数={}, 我是第{}个实例", instanceCount, instanceNo);
List<CheckTask> waitingTasks = checkTaskDao.getTasks(taskType, TaskStatus.WAITING.getValue());
Iterator<CheckTask> waitingIterator = waitingTasks.iterator();
while (waitingIterator.hasNext()) {
if (waitingIterator.next().getTaskId().hashCode() % instanceCount != instanceNo) {
waitingIterator.remove();
}
}
说明:
1、例如有3个实例(zkClient.getInstances()),那么通过IP我们把3个实例按照自增id排序分别标号为0,1,2
2、对第一个实例也就是instanceNo=0,只执行taskId.hashCode() % 3 == 0的任务,其他两个实例类似
3、当有一个实例挂掉,2个实例,instanceNo=0只执行taskId.hashCode() % 2 == 0的任务,实现热删除