1 服务器端整体概览图
- ServerCnxnFactory:负责与client之间的网络交互,支持NIO(默认)以及Netty
- SessionTrackerImpl:会话管理器
- DatadirCleanupManager:定期清理存在磁盘上的log文件和snapshot文件
- PreRequestProcessor,SyncRequestProcessor,FinalRequestProcessor:请求处理流程,责任链模式
- LearnerHandler:Leader与Learner之间的交互
- FileTxnSnapLog:存储在磁盘上的日志文件
- DataTree:体现在内存中的存储结构
- Sessions:Session的相关信息存储
2 单机版服务器启动流程
- 执行QuorumPeerMain的main方法,其中先创建一个
QuorumPeerMain
对象 - 调用
initializeAndRun
方法
protected void initializeAndRun(String[] args)
throws ConfigException, IOException
{
QuorumPeerConfig config = new QuorumPeerConfig();
if (args.length == 1) {
config.parse(args[0]);
}
// Start and schedule the the purge task
DatadirCleanupManager purgeMgr = new DatadirCleanupManager(config
.getDataDir(), config.getDataLogDir(), config
.getSnapRetainCount(), config.getPurgeInterval());
purgeMgr.start();
if (args.length == 1 && config.servers.size() > 0) {
runFromConfig(config);
} else {
LOG.warn("Either no config or no quorum defined in config, running "
+ " in standalone mode");
// there is only server in the quorum -- run as standalone
ZooKeeperServerMain.main(args);
}
}
2.1
- args实际上就是zoo.cfg中的配置,如下
2.2
- 创建
DatadirCleanupManager
实例,参数有snapDir,dataLogDir,snapRetainCount(要保存snapshot文件的个数),purgeInterval(定期清理的频率,单位为小时),snapRetainCount
与purgeInterval
在zoo.cfg中均可以配置 - 调用
DatadirCleanupManager
的start
方法,里面主要依赖PurgeTask
,这也是一个线程,其run方法
PurgeTxnLog
的purge
方法:
public static void purge(File dataDir, File snapDir, int num) throws IOException {
// snapshot文件保存的数量小于3,抛异常
if (num < 3) {
throw new IllegalArgumentException(COUNT_ERR_MSG);
}
// 根据dataDir和snapDir创建FileTxnSnapLog
FileTxnSnapLog txnLog = new FileTxnSnapLog(dataDir, snapDir);
// 根据给定数量获取最近的文件
List snaps = txnLog.findNRecentSnapshots(num);
// 获取数目
int numSnaps = snaps.size();
if (numSnaps > 0) {
// 第二个参数是最近的snapshot文件
purgeOlderSnapshots(txnLog, snaps.get(numSnaps - 1));
}
}
找最近n个snapshot文件:
public List findNRecentSnapshots(int n) throws IOException {
List files = Util.sortDataDir(snapDir.listFiles(), SNAPSHOT_FILE_PREFIX, false);
int count = 0;
List list = new ArrayList();
for (File f: files) {
if (count == n)
break;
if (Util.getZxidFromName(f.getName(), SNAPSHOT_FILE_PREFIX) != -1) {
count++;
list.add(f);
}
}
return list;
}
purgeOlderSnapshots
:
static void purgeOlderSnapshots(FileTxnSnapLog txnLog, File snapShot) {
// 从snapshot文件名中获取zxid
final long leastZxidToBeRetain = Util.getZxidFromName(
snapShot.getName(), PREFIX_SNAPSHOT);
/**
* 我们删除名称中带有zxid且小于leastZxidToBeRetain的所有文件。
* 该规则适用于快照文件和日志文件,
* zxid小于X的日志文件可能包含zxid大于X的事务。
* 更准确的说,命名为log.(X-a)的log文件可能包含比snapshot.X文件更新的事务,
* 如果在该间隔中没有其他以zxid开头即(X-a,X]的日志文件
*/
final Set retainedTxnLogs = new HashSet();
//获取快照日志,其中可能包含比给定zxid更新的事务,这些日志需要保留下来
retainedTxnLogs.addAll(Arrays.asList(txnLog.getSnapshotLogs(leastZxidToBeRetain)));
/**
* Finds all candidates for deletion, which are files with a zxid in their name that is less
* than leastZxidToBeRetain. There's an exception to this rule, as noted above.
*/
class MyFileFilter implements FileFilter{
private final String prefix;
MyFileFilter(String prefix){
this.prefix=prefix;
}
public boolean accept(File f){
if(!f.getName().startsWith(prefix + "."))
return false;
if (retainedTxnLogs.contains(f)) {
return false;
}
long fZxid = Util.getZxidFromName(f.getName(), prefix);
if (fZxid >= leastZxidToBeRetain) {
return false;
}
return true;
}
}
// add all non-excluded log files
List files = new ArrayList();
File[] fileArray = txnLog.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG));
if (fileArray != null) {
files.addAll(Arrays.asList(fileArray));
}
// add all non-excluded snapshot files to the deletion list
fileArray = txnLog.getSnapDir().listFiles(new MyFileFilter(PREFIX_SNAPSHOT));
if (fileArray != null) {
files.addAll(Arrays.asList(fileArray));
}
// remove the old files
for(File f: files)
{
final String msg = "Removing file: "+
DateFormat.getDateTimeInstance().format(f.lastModified())+
"\t"+f.getPath();
LOG.info(msg);
System.out.println(msg);
if(!f.delete()){
System.err.println("Failed to remove "+f.getPath());
}
}
}
先获取到那些需要保留的文件,之后再去删除这些不在保留文件之内的文件。
2.3
判断集群是单机启动还是集群启动,集群走runFromConfig(config)
,单机走ZooKeeperServerMain.main(args)
(其实单机版最终走的是ZooKeeperServerMain
的runFromConfig
,)
runFromConfig
方法:
public void runFromConfig(ServerConfig config) throws IOException {
LOG.info("Starting server");
FileTxnSnapLog txnLog = null;
try {
final ZooKeeperServer zkServer = new ZooKeeperServer();
// Registers shutdown handler which will be used to know the
// server error or shutdown state changes.
final CountDownLatch shutdownLatch = new CountDownLatch(1);
zkServer.registerServerShutdownHandler(
new ZooKeeperServerShutdownHandler(shutdownLatch));
// 构建FileTxnSnapLog对象
txnLog = new FileTxnSnapLog(new File(config.dataLogDir), new File(
config.dataDir));
zkServer.setTxnLogFactory(txnLog);
zkServer.setTickTime(config.tickTime);
zkServer.setMinSessionTimeout(config.minSessionTimeout);
zkServer.setMaxSessionTimeout(config.maxSessionTimeout);
// 构建与client之间的网络通信服务组件
// 这里可以通过zookeeper.serverCnxnFactory配置NIO还是Netty
cnxnFactory = ServerCnxnFactory.createFactory();
cnxnFactory.configure(config.getClientPortAddress(),
config.getMaxClientCnxns());
cnxnFactory.startup(zkServer);
// Watch status of ZooKeeper server. It will do a graceful shutdown
// if the server is not running or hits an internal error.
shutdownLatch.await();
shutdown();
cnxnFactory.join();
if (zkServer.canShutdown()) {
zkServer.shutdown(true);
}
} catch (InterruptedException e) {
// warn, but generally this is ok
LOG.warn("Server interrupted", e);
} finally {
if (txnLog != null) {
txnLog.close();
}
}
}
初始化网络服务组件后,cnxnFactory.startup(zkServer);
这里以默认网络服务组件为例NIOServerCnxnFactory
@Override
public void start() {
// ensure thread is started once and only once
if (thread.getState() == Thread.State.NEW) {
thread.start();
}
}
@Override
public void startup(ZooKeeperServer zks) throws IOException,
InterruptedException {
//调用上面的start方法,实际调用thread的start
//也就调用了该类的run方法,启动网络服务
start();
//这是ZooKeeperServer
setZooKeeperServer(zks);
zks.startdata();
zks.startup();
}
2.4
zks.startdata()
方法:
public void startdata()
throws IOException, InterruptedException {
//check to see if zkDb is not null
if (zkDb == null) {
zkDb = new ZKDatabase(this.txnLogFactory);
}
if (!zkDb.isInitialized()) {
//loadData进行初始化
loadData();
}
}
ZKDatabase在内存中维护了zookeeper的sessions, datatree和commit logs集合。 当zookeeper server启动的时候会将txnlogs和snapshots从磁盘读取到内存中
ZKDatabase的loadData()
方法:
//如果zkDb已经初始化,设置zxid(内存当中DataTree最新的zxid)
if(zkDb.isInitialized()){
setZxid(zkDb.getDataTreeLastProcessedZxid());
}
else {
// 没有初始化,就loadDataBase
setZxid(zkDb.loadDataBase());
}
public long loadDataBase() throws IOException {
long zxid = snapLog.restore(dataTree, sessionsWithTimeouts, commitProposalPlaybackListener);
initialized = true;
return zxid;
}
loadDataBase()
内部调用的是FileTxnSnapLog
的restore
方法
2.5
zks.startup()
方法:
public synchronized void startup() {
if (sessionTracker == null) {
// 创建会话管理器
createSessionTracker();
}
// 启动会话管理器
startSessionTracker();
// 设置请求处理器
setupRequestProcessors();
// 注册jmx
registerJMX();
setState(State.RUNNING);
notifyAll();
}