HBase RegionServer Architecture

 

Overall structure of the RegionServer

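A region server provides five groups of functionality:

1. ZooKeeper-related threads

    MasterAddressTracker tracks the state of the master node

    ClusterStatusTracker tracks the state of the HBase cluster

    CatalogTracker tracks the state of the ROOT table, the META table, and their regions

    SplitLogWorker competes for the split-log tasks published as znodes, splits the HLog, groups the edits by region,

        and writes them into the recovered.edits directory of the corresponding region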

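2. Region-related threads

    A region server holds a collection of regions; each operation is routed to one specific region for processing

    CompactionChecker periodically checks whether a compaction is needed and, if so, hands the work to CompactSplitThread

    CompactSplitThread is the thread that carries out compactions and splits

    MemStoreFlusher flushes a memstore to HDFS when it fills up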

3. WAL-related

    HLog: by HBase's design, each region server has exactly one HLog, shared by all of its regions

    LogRoller rolls the log

4. Client communication

    The RPC server module, which runs many threads: listener, select, and handler threads

    Leases checks lease timeouts

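5. Master communication and monitoring

    HMasterRegionInterface is used for HBase management (the region server's RPC interface to the master)

    HServerLoad measures the HBase load and reports it to the master

    HealthCheckChore performs health checks on the service

    RegionServerMetrics collects metrics data

    web server: starts a Jetty server that exposes region-related monitoring information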
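To make the SplitLogWorker description above more concrete, here is a minimal sketch of the regrouping step: edits from one shared log are bucketed by region, and each bucket would end up in that region's recovered.edits directory. The `Edit` class and `splitByRegion` method are hypothetical names for illustration only; the real HLog splitting code works on files in HDFS and is considerably more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of log splitting: one shared log, edits regrouped per region. */
class WalSplitSketch {
    /** A single log entry; both fields are hypothetical stand-ins. */
    static final class Edit {
        final String region;
        final String payload;

        Edit(String region, String payload) {
            this.region = region;
            this.payload = payload;
        }
    }

    /** Groups edits by region, preserving the order they had in the shared log. */
    static Map<String, List<String>> splitByRegion(List<Edit> sharedLog) {
        Map<String, List<String>> perRegion = new LinkedHashMap<>();
        for (Edit e : sharedLog) {
            perRegion.computeIfAbsent(e.region, k -> new ArrayList<>()).add(e.payload);
        }
        return perRegion;
    }

    public static void main(String[] args) {
        List<Edit> log = new ArrayList<>();
        log.add(new Edit("regionA", "put row1"));
        log.add(new Edit("regionB", "put row9"));
        log.add(new Edit("regionA", "delete row2"));
        // Each key stands for one region's recovered.edits directory.
        System.out.println(splitByRegion(log));
    }
}
```

Because all regions on a server share one log, this regrouping is what lets each region replay only its own edits after a failure.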

 

 

 

 

 

RegionServer configuration

Parameter                                                  Default     Meaning
hbase.client.retries.number                                10          Number of client retries
hbase.regionserver.msginterval                             3000        Interval between messages from the region server to the master, in milliseconds
hbase.regionserver.checksum.verify                         false       Whether to enable HBase's own checksum
hbase.server.thread.wakefrequency                          10 seconds  How often the checking threads wake up
hbase.regionserver.numregionstoreport                      10          Unknown
hbase.regionserver.handler.count                           10          Number of handler threads serving user tables
hbase.regionserver.metahandler.count                       10          Number of handler threads serving the META and ROOT tables
hbase.rpc.verbose                                          false       Unknown
hbase.regionserver.nbreservationblocks                     false       Unknown
hbase.regionserver.compactionChecker.majorCompactPriority  max int     Unknown
hbase.regionserver.executor.openregion.threads             3           Number of threads that open user-table regions
hbase.regionserver.executor.openroot.threads               1           Number of threads that open ROOT table regions
hbase.regionserver.executor.openmeta.threads               1           Number of threads that open META table regions
hbase.regionserver.executor.closeregion.threads            3           Number of threads that close user-table regions
hbase.regionserver.executor.closeroot.threads              1           Number of threads that close ROOT table regions
hbase.regionserver.executor.closemeta.threads              1           Number of threads that close META table regions

 

 

 

 

 

Entry class for starting HRegionServer

org.apache.hadoop.hbase.regionserver.HRegionServer

The hbase.regionserver.impl parameter in hbase-site.xml can name a custom implementation, which must extend HRegionServer

 

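Startup then goes through HRegionServerCommandLine (this class extends ServerCommandLine, so the master has a corresponding implementation as well)

HRegionServerCommandLine is run through the ToolRunner provided by Hadoop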

 

ToolRunner#run(Configuration,Tool,String[])

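ToolRunner calls GenericOptionsParser to parse a fixed set of generic options such as -conf, -D, -fs, and -files

Once they are parsed, it applies them to the Configuration object and passes the remaining startup arguments to the Tool implementation

In short, ToolRunner is a utility that parses startup arguments, populates the Configuration object, and then hands both to the Tool implementation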
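The division of labour described above can be sketched as follows. This is a simplified stand-in, not Hadoop's API: `MiniTool` is a hypothetical interface, the configuration is a plain `Map`, and only the `-D key=value` option is handled, whereas the real GenericOptionsParser supports several more.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToolRunnerSketch {
    /** Hypothetical stand-in for Hadoop's Tool interface (simplified). */
    interface MiniTool {
        int run(Map<String, String> conf, String[] remainingArgs);
    }

    /** Moves "-D key=value" pairs into conf, then hands the other args to the tool. */
    static int run(Map<String, String> conf, MiniTool tool, String[] args) {
        List<String> rest = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                rest.add(args[i]);
            }
        }
        return tool.run(conf, rest.toArray(new String[0]));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        MiniTool tool = (c, rest) -> {
            System.out.println("conf=" + c + " args=" + String.join(",", rest));
            return 0;
        };
        int exitCode = run(conf, tool,
                new String[] {"-D", "hbase.regionserver.handler.count=30", "start"});
        System.out.println("exit code " + exitCode);
    }
}
```

The tool itself never sees the generic options; it only receives an already populated configuration plus its own arguments.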

 

 

 

 

 

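Initialization: calling the HRegionServer constructor

HRegionServerCommandLine creates the HRegionServer (or its custom subclass) through reflection

1. Some client-connection settings are initialized here

2. Host and DNS settings are configured

3. HRegionServer calls HBaseRPC to create an RpcEngine implementation, here WritableRpcEngine

The hbase.rpc.engine parameter in hbase-site.xml can name a custom implementation, which must implement the RpcEngine interface

HBaseRPC calls getServer() to obtain a concrete RpcServer implementation, that is, RpcEngine --> RpcServer

What is returned here is WritableRpcEngine$Server, an inner class of WritableRpcEngine that extends HBaseServer

4. The metrics thread (for the JVM) and the LRU check thread are created

5. The server connects to ZooKeeper and performs authentication (Kerberos)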
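The reflective construction mentioned at the start of this section can be illustrated with a small sketch. `BaseServer` and `CustomServer` are hypothetical stand-ins for HRegionServer and a user-supplied subclass, and the configuration is a plain `Map`; only the `hbase.regionserver.impl` key comes from the text above.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

class ReflectiveStartSketch {
    /** Hypothetical stand-in for HRegionServer. */
    static class BaseServer {
        final Map<String, String> conf;

        public BaseServer(Map<String, String> conf) {
            this.conf = conf;
        }

        public String describe() {
            return "base";
        }
    }

    /** Hypothetical custom implementation selected through configuration. */
    static class CustomServer extends BaseServer {
        public CustomServer(Map<String, String> conf) {
            super(conf);
        }

        @Override
        public String describe() {
            return "custom";
        }
    }

    /** Instantiates the configured class, insisting that it extends BaseServer. */
    static BaseServer construct(Map<String, String> conf) {
        String name = conf.getOrDefault("hbase.regionserver.impl", BaseServer.class.getName());
        try {
            // asSubclass throws ClassCastException when the class is not a BaseServer.
            Class<? extends BaseServer> clazz = Class.forName(name).asSubclass(BaseServer.class);
            Constructor<? extends BaseServer> ctor = clazz.getConstructor(Map.class);
            return ctor.newInstance(conf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot construct " + name, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(construct(conf).describe());
        conf.put("hbase.regionserver.impl", CustomServer.class.getName());
        System.out.println(construct(conf).describe());
    }
}
```

The subclass check is what enforces the "must extend HRegionServer" requirement at startup rather than at first use.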

 

 

 

 

 

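Startup: HRegionServer#run (runs in a new thread)

The server then starts up, beginning in HRegionServer#run() on the newly started thread

1. Create the ZooKeeper listener threads

2. Create the thread that communicates with the master

3. Create the WAL-related threads

4. Create the metrics thread (for HBase)

5. Create the log-rolling, cache-flush, compaction, heartbeat-check, and lease-check threads

6. Create the Jetty threads

7. Create the responder thread, listener thread, handler threads, high-priority handler threads (for the META table), and replication handler threads

8. Create the log-splitting thread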

 

Thread pools are also defined here (in step 5); their threads may be started later while requests are being handled:

hbase.regionserver.executor.openregion.threads 3 (default)

hbase.regionserver.executor.openroot.threads 1

hbase.regionserver.executor.openmeta.threads 1

hbase.regionserver.executor.closeregion.threads 3

hbase.regionserver.executor.closeroot.threads 1

hbase.regionserver.executor.closemeta.threads 1

 

The log-splitting thread may have a lot of work to do at startup

After that, the region server has finished starting up
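As a rough illustration of the open/close pools listed above, the sketch below reads each pool size from a configuration map (falling back to the defaults given in the text) and creates pools whose thread names resemble the `RS_OPEN_REGION-...` names seen in thread dumps. The class, the helper methods, and the use of `Executors.newFixedThreadPool` are assumptions made for the example; HBase uses its own ExecutorService wrapper.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class OpenClosePoolsSketch {
    /** Reads an int setting, using the default when the key is absent. */
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String v = conf.get(key);
        return v == null ? defaultValue : Integer.parseInt(v.trim());
    }

    /** Fixed-size pool whose threads are named prefix-0, prefix-1, ... */
    static ExecutorService namedPool(String prefix, int threads) {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, prefix + "-" + seq.getAndIncrement());
            t.setDaemon(true);
            return t;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }

    /** Builds the six open/close pools with the defaults listed in the text. */
    static Map<String, ExecutorService> startPools(Map<String, String> conf, String serverName) {
        String p = "hbase.regionserver.executor.";
        Map<String, ExecutorService> pools = new LinkedHashMap<>();
        pools.put("RS_OPEN_REGION", namedPool("RS_OPEN_REGION-" + serverName, getInt(conf, p + "openregion.threads", 3)));
        pools.put("RS_OPEN_ROOT", namedPool("RS_OPEN_ROOT-" + serverName, getInt(conf, p + "openroot.threads", 1)));
        pools.put("RS_OPEN_META", namedPool("RS_OPEN_META-" + serverName, getInt(conf, p + "openmeta.threads", 1)));
        pools.put("RS_CLOSE_REGION", namedPool("RS_CLOSE_REGION-" + serverName, getInt(conf, p + "closeregion.threads", 3)));
        pools.put("RS_CLOSE_ROOT", namedPool("RS_CLOSE_ROOT-" + serverName, getInt(conf, p + "closeroot.threads", 1)));
        pools.put("RS_CLOSE_META", namedPool("RS_CLOSE_META-" + serverName, getInt(conf, p + "closemeta.threads", 1)));
        return pools;
    }

    /** Runs a trivial task on the pool and reports which thread executed it. */
    static String threadNameOf(ExecutorService pool) {
        Callable<String> task = () -> Thread.currentThread().getName();
        try {
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A fixed pool creates its threads lazily, which matches the remark that these threads may only be started once a request needs them.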

 

 

 

 

 

What HRegionServer contains

The collection of HRegions

Leases (lease timeout checking)

HMasterRegionInterface (HBase management)

HServerLoad (HBase load)

CompactSplitThread (compaction handling)

CompactionChecker (periodically checks whether a compaction is needed and, if so, hands the work to CompactSplitThread)

MemStoreFlusher (flushes memstores)

HLog (WAL-related)

LogRoller (log rolling)

ZooKeeperWatcher (ZooKeeper listener)

SplitLogWorker (log splitting)

ExecutorService (thread pools used to open and close HRegions)

ReplicationSourceService and ReplicationSinkService (replication-related)

HealthCheckChore (health checks)

RegionServerMetrics (monitoring)

 

Tracker (listener) classes

MasterAddressTracker

CatalogTracker

ClusterStatusTracker

 

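postOpenDeployTasks updates the ROOT table or the META table

CRUD, scanner, and increment operations

multi operations (for deletes and puts)

flush, close, and open of an HRegion (submitted to a thread pool)

split and compact operations, which are ultimately carried out by a specific HRegion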

 

 

 

 

 

RegionServer threads

Small-compaction thread

Daemon Thread [regionserver60020-smallCompactions-1392958977368] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

PriorityBlockingQueue.take() line: 220

ThreadPoolExecutor.getTask() line: 957

ThreadPoolExecutor$Worker.run() line: 917

Thread.run() line: 662

 

Thread that opens user-table regions

Thread [RS_OPEN_REGION-myhost,60020,1392868973177-0] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

LinkedBlockingQueue.take() line: 399

ExecutorService$TrackingThreadPoolExecutor(ThreadPoolExecutor).getTask() line: 957

ThreadPoolExecutor$Worker.run() line: 917

Thread.run() line: 662

 

Thread that communicates with ZooKeeper (the client event thread)

Daemon Thread [PostOpenDeployTasks:1028785192-EventThread] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

LinkedBlockingQueue.take() line: 399

ClientCnxn$EventThread.run() line: 491

 

Thread that communicates with ZooKeeper (the client send thread)

Daemon Thread [PostOpenDeployTasks:1028785192-SendThread(myhost:2181)] (Suspended)

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]

EPollArrayWrapper.poll(long) line: 210

EPollSelectorImpl.doSelect(long) line: 65

EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69

EPollSelectorImpl(SelectorImpl).select(long) line: 80

ClientCnxnSocketNIO.doTransport(int, List, LinkedList, ClientCnxn) line: 338

ClientCnxn$SendThread.run() line: 1068

 

Thread dedicated to the META table

Thread [RS_OPEN_META-myhost,60020,1392868973177-0] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

LinkedBlockingQueue.take() line: 399

ExecutorService$TrackingThreadPoolExecutor(ThreadPoolExecutor).getTask() line: 957

ThreadPoolExecutor$Worker.run() line: 917

Thread.run() line: 662

 

Thread dedicated to the ROOT table

Thread [RS_OPEN_ROOT-myhost,60020,1392868973177-0] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

LinkedBlockingQueue.take() line: 399

ExecutorService$TrackingThreadPoolExecutor(ThreadPoolExecutor).getTask() line: 957

ThreadPoolExecutor$Worker.run() line: 917

Thread.run() line: 662

 

Log-splitting thread

Thread [SplitLogWorker-myhost,60020,1392868973177] (Suspended)

Object.wait(long) line: not available [native method]

Object.wait() line: 485

SplitLogWorker.taskLoop() line: 219

SplitLogWorker.run() line: 179

Thread.run() line: 662

 

Jetty scanner thread

Daemon Thread [Timer-0] (Suspended)

Object.wait(long) line: not available [native method]

TimerThread.mainLoop() line: 509

TimerThread.run() line: 462

 

Jetty acceptor thread

Thread [1142818380@qtp-1463348369-1 - Acceptor0 [email protected]:60030] (Suspended)

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]

EPollArrayWrapper.poll(long) line: 210

EPollSelectorImpl.doSelect(long) line: 65

EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69

EPollSelectorImpl(SelectorImpl).select(long) line: 80

SelectorManager$SelectSet.doSelect() line: 498

SelectChannelConnector$1(SelectorManager).doSelect(int) line: 192

SelectChannelConnector.accept(int) line: 124

AbstractConnector$Acceptor.run() line: 708

QueuedThreadPool$PoolThread.run() line: 582

 

Jetty worker thread

Thread [869247333@qtp-1463348369-0] (Suspended)

Object.wait(long) line: not available [native method]

QueuedThreadPool$PoolThread.run() line: 626

 

Lease-check thread

Daemon Thread [regionserver60020.leaseChecker] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.parkNanos(Object, long) line: 196

AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) line: 2025

DelayQueue.poll(long, TimeUnit) line: 201

Leases.run() line: 83

Thread.run() line: 662

 

Compaction-check thread

Daemon Thread [regionserver60020.compactionChecker] (Suspended)

Object.wait(long) line: not available [native method]

Sleeper.sleep(long) line: 91

HRegionServer$CompactionChecker(Chore).run() line: 75

Thread.run() line: 662
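
The trace above shows CompactionChecker running as a Chore that sleeps between checks. The sketch below illustrates that periodic pattern under simplifying assumptions: `PeriodicChore` and `runChore` are hypothetical names, the sleep uses `Thread.sleep` rather than HBase's Sleeper, and the chore body only counts invocations where the real checker would inspect each store and request a compaction.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ChoreSketch {
    /** Sleeps for a fixed period, then runs one unit of work, until stopped. */
    abstract static class PeriodicChore implements Runnable {
        private final long periodMillis;
        private volatile boolean stopped = false;

        PeriodicChore(long periodMillis) {
            this.periodMillis = periodMillis;
        }

        /** One round of checking; a compaction checker would scan its stores here. */
        protected abstract void chore();

        void stop() {
            stopped = true;
        }

        @Override
        public void run() {
            while (!stopped) {
                try {
                    Thread.sleep(periodMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (!stopped) {
                    chore();
                }
            }
        }
    }

    /** Runs a counting chore on a daemon thread until it fired `times` times. */
    static int runChore(int times, long periodMillis) {
        AtomicInteger fired = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(times);
        PeriodicChore checker = new PeriodicChore(periodMillis) {
            @Override
            protected void chore() {
                fired.incrementAndGet();
                latch.countDown();
            }
        };
        Thread t = new Thread(checker, "compactionChecker-sketch");
        t.setDaemon(true);
        t.start();
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        checker.stop();
        return fired.get();
    }

    public static void main(String[] args) {
        System.out.println("chore fired " + runChore(3, 10) + " times");
    }
}
```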

 

Memstore flush check thread

Daemon Thread [regionserver60020.cacheFlusher] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.parkNanos(Object, long) line: 196

AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) line: 2025

DelayQueue.poll(long, TimeUnit) line: 201

DelayQueue.poll(long, TimeUnit) line: 39

MemStoreFlusher.run() line: 220

Thread.run() line: 662

 

Log-rolling thread

Daemon Thread [regionserver60020.logRoller] (Suspended)

Object.wait(long) line: not available [native method]

LogRoller.run() line: 77

Thread.run() line: 662

 

Metrics thread

Daemon Thread [Timer thread for monitoring jvm] (Suspended)

Object.wait(long) line: not available [native method]

TimerThread.mainLoop() line: 509

TimerThread.run() line: 462

 

Log sync thread

Daemon Thread [regionserver60020.logSyncer] (Suspended)

Object.wait(long) line: not available [native method]

HLog$LogSyncer.run() line: 1265

Thread.run() line: 662

 

HDFS client lease-check thread

Daemon Thread [LeaseChecker] (Suspended)

Thread.sleep(long) line: not available [native method]

DFSClient$LeaseChecker.run() line: 1379

Daemon(Thread).run() line: 662

 

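Unknown (judging by the trace, the main region server thread sleeping inside the HRegionServer#run loop)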

Thread [regionserver60020] (Suspended)

Object.wait(long) line: not available [native method]

Sleeper.sleep(long) line: 91

Sleeper.sleep() line: 55

HRegionServer.run() line: 787

Thread.run() line: 662

 

LRU-related thread

Daemon Thread [LRU Statistics #0] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.parkNanos(Object, long) line: 196

AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) line: 2025

DelayQueue.take() line: 164

ScheduledThreadPoolExecutor$DelayedWorkQueue.take() line: 609

ScheduledThreadPoolExecutor$DelayedWorkQueue.take() line: 602

ScheduledThreadPoolExecutor(ThreadPoolExecutor).getTask() line: 957

ThreadPoolExecutor$Worker.run() line: 917

Thread.run() line: 662

 

LRU cache check (eviction) thread

Daemon Thread [main.LruBlockCache.EvictionThread] (Suspended)

Object.wait(long) line: not available [native method]

LruBlockCache$EvictionThread(Object).wait() line: 485

LruBlockCache$EvictionThread.run() line: 612

Thread.run() line: 662

 

RPC monitoring thread

Daemon Thread [Timer thread for monitoring rpc] (Suspended)

Object.wait(long) line: not available [native method]

TimerThread.mainLoop() line: 509

TimerThread.run() line: 462

 

Reader thread

Daemon Thread [IPC Reader 0 on port 60020] (Suspended)

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]

EPollArrayWrapper.poll(long) line: 210

EPollSelectorImpl.doSelect(long) line: 65

EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69

EPollSelectorImpl(SelectorImpl).select(long) line: 80

EPollSelectorImpl(SelectorImpl).select() line: 84

HBaseServer$Listener$Reader.doRunLoop() line: 528

HBaseServer$Listener$Reader.run() line: 514

ThreadPoolExecutor$Worker.runTask(Runnable) line: 895

ThreadPoolExecutor$Worker.run() line: 918

Thread.run() line: 662

 

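Handler (worker) threads; the number is configurable. REPL handlers serve replication, PRI handlers serve the META table, and plain IPC handlers are the ordinary workers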

Daemon Thread [REPL IPC Server handler 0 on 60020] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

LinkedBlockingQueue.take() line: 399

HBaseServer$Handler.run() line: 1398

 

Daemon Thread [PRI IPC Server handler 0 on 60020] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

LinkedBlockingQueue.take() line: 399

HBaseServer$Handler.run() line: 1398

 

Daemon Thread [IPC Server handler 1 on 60020] (Suspended)

Unsafe.park(boolean, long) line: not available [native method]

LockSupport.park(Object) line: 156

AbstractQueuedSynchronizer$ConditionObject.await() line: 1987

LinkedBlockingQueue.take() line: 399

HBaseServer$Handler.run() line: 1398
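
All three handler traces above are parked in `LinkedBlockingQueue.take()`, waiting for a call to arrive. The sketch below models that arrangement with two queues, one for ordinary calls and one for high-priority calls. The class, the `dispatch` method, and the routing rule (catalog table names go to the priority queue) are assumptions for illustration; the real server decides priority differently and also has a replication queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

class HandlerQueueSketch {
    /** A queued request; the future reports which handler thread served it. */
    static final class Call {
        final String table;
        final CompletableFuture<String> handledBy = new CompletableFuture<>();

        Call(String table) {
            this.table = table;
        }
    }

    private final BlockingQueue<Call> callQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<Call> priorityCallQueue = new LinkedBlockingQueue<>();

    HandlerQueueSketch(int handlers, int priorityHandlers) {
        for (int i = 0; i < handlers; i++) {
            startHandler("IPC Server handler " + i, callQueue);
        }
        for (int i = 0; i < priorityHandlers; i++) {
            startHandler("PRI IPC Server handler " + i, priorityCallQueue);
        }
    }

    private static void startHandler(String name, BlockingQueue<Call> queue) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Call c = queue.take(); // idle handlers park here
                    c.handledBy.complete(Thread.currentThread().getName());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        t.start();
    }

    /** Assumed routing rule: catalog tables go to the priority queue. */
    CompletableFuture<String> dispatch(String table) {
        Call c = new Call(table);
        boolean catalog = table.equals("-ROOT-") || table.equals(".META.");
        (catalog ? priorityCallQueue : callQueue).add(c);
        return c.handledBy;
    }

    public static void main(String[] args) {
        HandlerQueueSketch server = new HandlerQueueSketch(2, 1);
        System.out.println(server.dispatch(".META.").join());
        System.out.println(server.dispatch("usertable").join());
    }
}
```

Keeping catalog traffic on its own queue means a flood of user requests cannot starve the lookups that every client depends on.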

 

Listener (select) thread

Daemon Thread [IPC Server listener on 60020] (Suspended)

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]

EPollArrayWrapper.poll(long) line: 210

EPollSelectorImpl.doSelect(long) line: 65

EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69

EPollSelectorImpl(SelectorImpl).select(long) line: 80

EPollSelectorImpl(SelectorImpl).select() line: 84

HBaseServer$Listener.run() line: 636

 

Responder thread

Daemon Thread [IPC Server Responder] (Suspended)

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]

EPollArrayWrapper.poll(long) line: 210

EPollSelectorImpl.doSelect(long) line: 65

EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69

EPollSelectorImpl(SelectorImpl).select(long) line: 80

HBaseServer$Responder.doRunLoop() line: 825

HBaseServer$Responder.run() line: 808

 

Unknown (judging by the trace, an HBase client table-pool thread waiting on a multi RPC call)

Daemon Thread [hbase-tablepool-1-thread-1] (Suspended)

Object.wait(long) line: not available [native method]

HBaseClient$Call(Object).wait() line: 485

HBaseClient.call(Writable, InetSocketAddress, Class, User, int) line: 981

WritableRpcEngine$Invoker.invoke(Object, Method, Object[]) line: 86

$Proxy14.multi(MultiAction) line: not available

HConnectionManager$HConnectionImplementation$3$1.call() line: 1427

HConnectionManager$HConnectionImplementation$3$1.call() line: 1425

HConnectionManager$HConnectionImplementation$3$1(ServerCallable).withoutRetries() line: 215

HConnectionManager$HConnectionImplementation$3.call() line: 1434

HConnectionManager$HConnectionImplementation$3.call() line: 1422

FutureTask$Sync.innerRun() line: 303

FutureTask.run() line: 138

ThreadPoolExecutor$Worker.runTask(Runnable) line: 895

ThreadPoolExecutor$Worker.run() line: 918

Thread.run() line: 662

 

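In summary, the threads fall into these groups:

1. Log rolling, log sync, and log splitting

2. Large-compaction and small-compaction threads

3. LRU cache check and memstore check

4. HDFS client, HDFS client timeout check, and ZooKeeper communication

5. Threads dedicated to the ROOT table and to the META table

6. Jetty threads

7. Listener thread, reader threads, handler threads, responder thread, handler threads for META, and handler threads for replication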

 

 

 

 

 

postOpenDeployTask thread (updates the META table)

The logic is as follows:

// PostOpenDeployTasksThread is the thread that updates the META table
OpenRegionHandler$PostOpenDeployTasksThread#run() {
	HRegionServer#postOpenDeployTasks();
}

// First request a compaction for every Store that has reference files or needs compacting,
// then record the new location depending on whether this is the ROOT region, a META region, or a user region
HRegionServer#postOpenDeployTasks() {
	for (Store s : r.getStores().values()) {
		if (s.hasReferences() || s.needsCompaction()) {
			getCompactionRequester().requestCompaction(r, s, "Opening Region", null);
		}
	}
	// Update ZK, ROOT or META
	if (r.getRegionInfo().isRootRegion()) {
		RootLocationEditor.setRootLocation(getZooKeeper(), this.serverNameFromMasterPOV);
	} else if (r.getRegionInfo().isMetaRegion()) {
		MetaEditor.updateMetaLocation(ct, r.getRegionInfo(), this.serverNameFromMasterPOV);
	} else {
		if (daughter) {
			// If daughter of a split, update whole row, not just location.
			MetaEditor.addDaughter(ct, r.getRegionInfo(),
				this.serverNameFromMasterPOV);
		} else {
			MetaEditor.updateRegionLocation(ct, r.getRegionInfo(),
				this.serverNameFromMasterPOV);
		}
	}
}

// Update the ROOT region's location stored in ZooKeeper
RootLocationEditor#setRootLocation() {
	ZKUtil.createAndWatch(zookeeper, zookeeper.rootServerZNode,
		Bytes.toBytes(location.toString()));
}

// Update a catalog row: build a Put carrying the new location and write it
// to the ROOT table (for META region rows) or to the META table (for user region rows)
MetaEditor#updateLocation() {
	Put put = new Put(regionInfo.getRegionName());
	put.add("info", "server", Bytes.toBytes(sn.getHostAndPort()));
	put.add("info", "serverstartcode", Bytes.toBytes(sn.getStartcode()));
	HTable table = isRootTableRow(put.getRow()) ? getRootHTable(catalogTracker)
		: getMetaHTable(catalogTracker);
	table.put(put);
}

// For a daughter of a split, write every KeyValue of the row (regioninfo plus location);
// otherwise only the server and serverstartcode KeyValues need updating
MetaEditor#addDaughter() {
	Put put = new Put(regionInfo.getRegionName());
	put.add("info", "regioninfo", Writables.getBytes(regionInfo));
	if (sn != null) {
		put.add("info", "server", Bytes.toBytes(sn.getHostAndPort()));
		put.add("info", "serverstartcode", Bytes.toBytes(sn.getStartcode()));
	}
	putToMetaTable(catalogTracker, put);
}
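
The branching in postOpenDeployTasks can be summarized in a small, self-contained model. The enum, class, and method names below are hypothetical; the sketch only restates which store receives the new location and which `info` columns are written in each case, as read from the pseudocode above.

```java
import java.util.ArrayList;
import java.util.List;

class PostOpenSketch {
    enum RegionKind { ROOT, META, USER }

    /** Where the new location is recorded for each kind of region. */
    static String locationStore(RegionKind kind) {
        switch (kind) {
            case ROOT:
                return "ZooKeeper";
            case META:
                return "-ROOT- table";
            default:
                return ".META. table";
        }
    }

    /** Columns written to the catalog row; empty for ROOT, which lives in ZooKeeper. */
    static List<String> columnsWritten(RegionKind kind, boolean daughter) {
        List<String> cols = new ArrayList<>();
        if (kind == RegionKind.ROOT) {
            return cols;
        }
        if (kind == RegionKind.USER && daughter) {
            cols.add("info:regioninfo"); // a split daughter rewrites the whole row
        }
        cols.add("info:server");
        cols.add("info:serverstartcode");
        return cols;
    }

    public static void main(String[] args) {
        for (RegionKind k : RegionKind.values()) {
            System.out.println(k + " -> " + locationStore(k) + " " + columnsWritten(k, false));
        }
        System.out.println("USER daughter -> " + columnsWritten(RegionKind.USER, true));
    }
}
```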

 

 

 

 

 

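leaseChecker thread (cleans up operations after they time out)

This class handles operations that time out, such as a get or a scan, by releasing the associated scanner, row lock, and so on

This work runs in an asynchronous thread

The logic is as follows: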

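// Lease timeout check: when an operation's lease expires,
// its listener is called so that the operation can be released
Leases#run() {
	Lease lease = leaseQueue.poll(leaseCheckFrequency, TimeUnit.MILLISECONDS);
	// poll returns null when no lease expired within the wait
	if (lease != null) {
		lease.getListener().leaseExpired();
	}
}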

// When a row operation times out, release its row lock
RowLockListener#leaseExpired() {
	Integer r = rowlocks.remove(this.lockName);
	if (r != null) {
		region.releaseRowLock(r);
	}
}

// When a scan times out, close its scanner
ScannerListener#leaseExpired() {
	RegionScanner s = scanners.remove(this.scannerName);
	if (s != null) {
		HRegion region = getRegion(s.getRegionInfo().getRegionName());
		s.close();
	}
}
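
The lease check above relies on a delay queue: a lease only becomes visible to `poll` once its timeout has passed. The following sketch reproduces that mechanism with `java.util.concurrent.DelayQueue`. The `Lease`, `LeaseListener`, and `checkOnce` names are simplified stand-ins chosen for the example, not the HBase classes themselves.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class LeaseSketch {
    /** Callback fired when a lease expires (for example, to close a scanner). */
    interface LeaseListener {
        void leaseExpired();
    }

    /** A lease that becomes available from the queue once its timeout elapses. */
    static final class Lease implements Delayed {
        final String name;
        final long expiresAtNanos;
        final LeaseListener listener;

        Lease(String name, long timeoutMillis, LeaseListener listener) {
            this.name = name;
            this.expiresAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            this.listener = listener;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(expiresAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    /** One pass of the checker: waits up to waitMillis, returns the expired lease name or null. */
    static String checkOnce(DelayQueue<Lease> queue, long waitMillis) {
        Lease lease;
        try {
            lease = queue.poll(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        if (lease == null) {
            return null; // nothing expired during this wait
        }
        lease.listener.leaseExpired();
        return lease.name;
    }

    public static void main(String[] args) {
        DelayQueue<Lease> queue = new DelayQueue<>();
        queue.add(new Lease("scanner-1", 50, () -> System.out.println("closing scanner-1")));
        System.out.println("expired: " + checkOnce(queue, 1000));
    }
}
```

A real checker would call `checkOnce` in a loop on its own thread, which is the role of the leaseChecker thread described above.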

 

 

 

 

 


 

 

 

 

 

 
