The Region Splitting Process

Official documentation: http://hbase.apache.org/book.html#regionserver_splitting_implementation


  1. The RegionServer decides locally to split the region, and prepares the split. THE SPLIT TRANSACTION IS STARTED. As a first step, the RegionServer acquires a shared read lock on the table to prevent schema modifications during the splitting process. Then it creates a znode in zookeeper under /hbase/region-in-transition/region-name, and sets the znode’s state to SPLITTING.

    The RegionServer decides locally to split the region, and the split transaction starts. It first takes a shared read lock on the table being split, so that the table's schema cannot be modified during the split. It then creates the znode /hbase/region-in-transition/region-name in ZooKeeper and sets that znode's state to SPLITTING.

  2. The Master learns about this znode, since it has a watcher for the parent region-in-transition znode.

    The Master watches the parent znode /hbase/region-in-transition, so it is notified of the state of /hbase/region-in-transition/region-name.

  3. The RegionServer creates a sub-directory named .splits under the parent’s region directory in HDFS.

    The RegionServer creates a .splits sub-directory inside the parent region's directory in HDFS.

  4. The RegionServer closes the parent region and marks the region as offline in its local data structures. THE SPLITTING REGION IS NOW OFFLINE. At this point, client requests coming to the parent region will throw NotServingRegionException. The client will retry with some backoff. The closing region is flushed.

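    The RegionServer closes the parent region, flushing it, and marks it offline in its local data structures. From this moment, client requests to the parent region fail with NotServingRegionException, and the client retries after a backoff.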

  5. The RegionServer creates region directories under the .splits directory, for daughter regions A and B, and creates necessary data structures. Then it splits the store files, in the sense that it creates two Reference files per store file in the parent region. Those reference files will point to the parent region’s files.

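    The RegionServer creates the directories for daughter regions A and B under .splits, along with the data structures they need. It then splits the store files: for every store file in the parent region it creates two reference files, one for each daughter. These reference files hold no data of their own and point to the parent region's files.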

  6. The RegionServer creates the actual region directory in HDFS, and moves the reference files for each daughter.

    The RegionServer creates the actual daughter region directories in HDFS and moves each daughter's reference files into them.

  7. The RegionServer sends a Put request to the .META. table, to set the parent as offline in the .META. table and add information about daughter regions. At this point, there won’t be individual entries in .META. for the daughters. Clients will see that the parent region is split if they scan .META., but won’t know about the daughters until they appear in .META.. Also, if this Put to .META. succeeds, the parent will be effectively split. If the RegionServer fails before this RPC succeeds, Master and the next Region Server opening the region will clean dirty state about the region split. After the .META. update, though, the region split will be rolled-forward by Master.

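    The RegionServer sends a Put to the .META. table that marks the parent region offline and records information about the daughter regions. A client scanning .META. can see that the parent has been split, but cannot see the daughters until their own entries are added. Once this Put succeeds, the parent is effectively split. If the RegionServer fails before the RPC succeeds, the Master and the next RegionServer to open the region clean up the dirty state left by the split. After the .META. update, the split is not undone: the Master rolls it forward.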

  8. The RegionServer opens daughters A and B in parallel.

    The RegionServer opens the two daughter regions A and B in parallel.

  9. The RegionServer adds the daughters A and B to .META., together with information that it hosts the regions. THE SPLIT REGIONS (DAUGHTERS WITH REFERENCES TO PARENT) ARE NOW ONLINE. After this point, clients can discover the new regions and issue requests to them. Clients cache the .META. entries locally, but when they make requests to the RegionServer or .META., their caches will be invalidated, and they will learn about the new regions from .META..

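    The RegionServer adds daughters A and B to .META., together with the information that it hosts them, and the daughters are now online. From this point, clients can discover the new regions and send requests to them. Clients cache .META. entries locally, but those cached entries are invalidated when they make requests to the RegionServer or to .META., at which point they fetch the new regions' locations from .META. again.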

  10. The RegionServer updates znode /hbase/region-in-transition/region-name in ZooKeeper to state SPLIT, so that the master can learn about it. The balancer can freely re-assign the daughter regions to other region servers if necessary. THE SPLIT TRANSACTION IS NOW FINISHED.

    The RegionServer updates the znode /hbase/region-in-transition/region-name in ZooKeeper to state SPLIT, and the Master learns of the change through its watcher. If necessary, the balancer can re-assign the daughter regions to other RegionServers. The split transaction is now finished.

  11. After the split, .META. and HDFS will still contain references to the parent region. Those references will be removed when compactions in daughter regions rewrite the data files. Garbage collection tasks in the master periodically check whether the daughter regions still refer to the parent region’s files. If not, the parent region will be removed.

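    After the split, .META. and HDFS still contain references to the parent region. These references are removed when compactions in the daughter regions rewrite the data files. A garbage collection task in the Master periodically checks whether the daughter regions still refer to the parent region's files; if they do not, the parent region is removed.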
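
The recovery rule in step 7 is easier to see as a small model. The sketch below is not HBase code: the names Step, run_split, and recovery_action are made up for illustration, and the real transaction does much more work per step. It only shows the idea that a failure before the .META. Put is cleaned up, while a failure after it is rolled forward.

```python
from enum import Enum, auto


class Step(Enum):
    """Hypothetical names for the steps of the split transaction."""
    SET_SPLITTING_IN_ZK = auto()    # step 1
    CREATE_SPLITS_DIR = auto()      # step 3
    CLOSE_PARENT = auto()           # step 4
    CREATE_REFERENCES = auto()      # step 5
    MOVE_DAUGHTER_DIRS = auto()     # step 6
    UPDATE_META = auto()            # step 7: the Put that marks the parent offline
    OPEN_DAUGHTERS = auto()         # step 8
    ADD_DAUGHTERS_TO_META = auto()  # step 9
    SET_SPLIT_IN_ZK = auto()        # step 10


# Assumption for this sketch: the .META. Put of step 7 is the point of no return.
POINT_OF_NO_RETURN = Step.UPDATE_META


def run_split(fail_at=None):
    """Run the steps in order and return the journal of completed steps.

    If fail_at is given, the simulated RegionServer crashes before
    completing that step.
    """
    journal = []
    for step in Step:
        if step is fail_at:
            break
        journal.append(step)
    return journal


def recovery_action(journal):
    """Decide what recovery should do, given the completed steps."""
    if not journal:
        return "nothing to do"
    if len(journal) == len(Step):
        return "split finished"
    if POINT_OF_NO_RETURN in journal:
        return "roll forward"  # parent is already offline in .META.
    return "roll back"         # clean up the dirty state left by the split


if __name__ == "__main__":
    for crash in (Step.CREATE_REFERENCES, Step.UPDATE_META, Step.OPEN_DAUGHTERS, None):
        print(crash, "->", recovery_action(run_split(fail_at=crash)))
```

A crash at UPDATE_META itself still rolls back, because the journal only records steps that completed, which matches "if the RegionServer fails before this RPC succeeds".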

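Steps 5 and 11 together explain why a split is cheap and why the parent lingers for a while afterwards. Here is a rough toy model of that, again not HBase code: Region, split, compact, and parent_can_be_removed are hypothetical names, and the "bottom"/"top" labels for the two halves are an assumption added for illustration. Each daughter starts with nothing but references to the parent's store files, and the parent may only be removed once compactions have replaced every reference.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    name: str
    store_files: list = field(default_factory=list)  # files holding real data
    references: list = field(default_factory=list)   # (parent file, half) pairs


def split(parent, name_a, name_b):
    """Create two daughters: two reference files per parent store file,
    one in each daughter. No data is copied."""
    a = Region(name_a, references=[(f, "bottom") for f in parent.store_files])
    b = Region(name_b, references=[(f, "top") for f in parent.store_files])
    return a, b


def compact(daughter):
    """Model a compaction: the referenced data is rewritten into a file
    owned by the daughter, and the references are dropped."""
    if daughter.references:
        daughter.store_files.append(f"{daughter.name}-compacted")
        daughter.references.clear()


def parent_can_be_removed(daughters):
    """Model the Master's periodic check: the parent may go only when
    no daughter still refers to its files."""
    return all(not d.references for d in daughters)


if __name__ == "__main__":
    parent = Region("parent", store_files=["hfile1", "hfile2"])
    a, b = split(parent, "A", "B")
    print(parent_can_be_removed([a, b]))  # both daughters still hold references
    compact(a)
    compact(b)
    print(parent_can_be_removed([a, b]))  # references gone after compaction
```

In this model the parent has to wait for both daughters: compacting only A is not enough, because B still points at the parent's files.
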