HBase configuration changes:

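(A split happens because there are too many HFiles; after the split, a compaction runs.

Some readers will object that too many HFiles should lead to a compaction, not a split. Here is the code from 0.98.1. Roughly: if the region has no blocked compaction (priority greater than or equal to 1), it is split.)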

private boolean flushRegion(final FlushRegionEntry fqe) {
  HRegion region = fqe.region;
  if (!region.getRegionInfo().isMetaRegion() &&
      isTooManyStoreFiles(region)) { // this check uses the blockingStoreFiles parameter
    if (fqe.isMaximumWait(this.blockingWaitTime)) {
      LOG.info("Waited " + (System.currentTimeMillis() - fqe.createTime) +
        "ms on a compaction to clean up 'too many store files'; waited " +
        "long enough... proceeding with flush of " +
        region.getRegionNameAsString());
    } else {
      // If this is first time we've been put off, then emit a log message.
      if (fqe.getRequeueCount() <= 0) {
        // Note: We don't impose blockingStoreFiles constraint on meta regions
        LOG.warn("Region " + region.getRegionNameAsString() + " has too many " +
          "store files; delaying flush up to " + this.blockingWaitTime + "ms");
        // Key logic: if the region has no blocked compaction (priority >= 1),
        // split it; otherwise request a compaction.
        if (!this.server.compactSplitThread.requestSplit(region)) {
          try {
            this.server.compactSplitThread.requestSystemCompaction(
                region, Thread.currentThread().getName());
          } catch (IOException e) {
            LOG.error(
              "Cache flush failed for region " + Bytes.toStringBinary(region.getRegionName()),
              RemoteExceptionHandler.checkIOException(e));
          }
        }
      }
      // Put back on the queue.  Have it come back out of the queue
      // after a delay of this.blockingWaitTime / 100 ms.
      this.flushQueue.add(fqe.requeue(this.blockingWaitTime / 100));
      // Tell a lie, it's not flushed but it's ok
      return true;
    }
  }
  return flushRegion(region, false);
}


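hbase.hstore.blockingStoreFiles: upper limit on the number of HFiles in a store. Once it is exceeded, writes (flushes) are blocked and a split or compaction is requested.

hbase.hstore.blockingWaitTime: upper limit on how long writes stay blocked. If no split or compaction has completed by then (that is, the lock was not obtained), the flush proceeds anyway.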


Maximum region size 500 GB, which disables regular (size-based) splits

 

 hbase.hregion.max.filesize

 536870912000
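A quick sanity check of the value above. This is only a sketch: the helper name `gib_to_bytes` is made up for illustration, and it assumes the setting is a plain byte count with 1 GB taken as 1024^3 bytes.

```python
def gib_to_bytes(gib: int) -> int:
    """Convert a size in GiB (1024**3 bytes) to a byte count."""
    return gib * 1024 ** 3


# 500 GiB expressed in bytes, as used for hbase.hregion.max.filesize
print(gib_to_bytes(500))  # 536870912000
```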

 


 Upper limit of 30 HFiles per store

 

hbase.hstore.blockingStoreFiles

30

 


Writes are blocked for at most one and a half minutes


hbase.hstore.blockingWaitTime

90000
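To see what 90000 ms means for the flushRegion code quoted earlier, here is a rough sketch. It only mirrors the `blockingWaitTime / 100` requeue delay from that code; the function name `flush_delay_plan` is hypothetical, and the retry count is an approximation that ignores time spent waiting in the queue.

```python
from typing import Tuple


def flush_delay_plan(blocking_wait_time_ms: int) -> Tuple[int, int]:
    """Return (requeue_delay_ms, approx_max_requeues) for a delayed flush.

    The delay mirrors the integer division blockingWaitTime / 100 in
    flushRegion; the requeue count is an estimate, not an HBase guarantee.
    """
    requeue_delay_ms = blocking_wait_time_ms // 100
    if requeue_delay_ms == 0:
        # Wait time below 100 ms: no meaningful delay between retries.
        return 0, 0
    approx_max_requeues = blocking_wait_time_ms // requeue_delay_ms
    return requeue_delay_ms, approx_max_requeues


delay_ms, requeues = flush_delay_plan(90000)
print(delay_ms, requeues)  # 900 100
```

So with this setting a delayed flush is retried about every 0.9 s until the 90 s limit is reached.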



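hbase.regionserver.regionSplitLimit: maximum number of regions on a region server. Before a split, the current number of regions is checked and must not exceed this value.

Compact priority logic

 Initialized to Integer.MIN_VALUE; a user-requested compaction has priority 1; a store counts as blocked when its priority is below 1 (a split is only requested when the priority is >= 1)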


 -----------------------------------------------


  DEBUG [LruStats #0] hfile.LruBlockCache: Total=11.78 GB, free=1.01 GB, max=12.79 GB,

 Memstore flush size set to 256 MB

 Memstore uses MSLAB


Enable MSLAB


hbase.hregion.memstore.mslab.enabled

true


Memstore flush threshold: 256 MB

hbase.hregion.memstore.flush.size

268435456

 

Safety check: fraction of the region server heap used by memstores; exceeding it forces a flush

    hbase.regionserver.global.memstore.lowerLimit

    0.36

        Once the total size of all memstores in a region server exceeds this fraction of the heap, flushes to disk are forced.

        Maximum size of all memstores in a region server before flushes are forced. Defaults to 35% of heap. Setting this value equal to

        hbase.regionserver.global.memstore.upperLimit reduces the chance of flushes being triggered by reaching this value, because such flushes block write requests.

 


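Safety check: fraction of the region server heap used by memstores; exceeding it forces a flush and blocks write requests

    hbase.regionserver.global.memstore.upperLimit

    0.4

        When the total size of all memstores in a region server exceeds this fraction of the heap, flushes are forced and updates are blocked.

        Defaults to 40% of heap. Updates stay blocked and flushes are forced until the total memstore size drops to

        the hbase.regionserver.global.memstore.lowerLimit threshold.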

     

 


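When a memstore reaches the given multiple of the flush size, a flush is forced and requests are blocked

    hbase.hregion.memstore.block.multiplier

    2

        When a region's memstore reaches hbase.hregion.memstore.block.multiplier * hbase.hregion.memstore.flush.size, write requests are blocked.

        This is a useful guard against the memstore growing without bound during write spikes. Without an upper bound, the memstore fills up and, when it is finally flushed,

        the resulting files take a long time to compact and split, and the server may even run out of memory (OOM).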
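With the values used in these notes the blocking threshold works out as follows. This is just the arithmetic from the description above, not HBase code; `memstore_block_threshold` is a made-up helper name.

```python
def memstore_block_threshold(flush_size_bytes: int, multiplier: int) -> int:
    """Per-region memstore size (bytes) at which writes get blocked."""
    return flush_size_bytes * multiplier


flush_size = 268435456  # hbase.hregion.memstore.flush.size (256 MB)
threshold = memstore_block_threshold(flush_size, 2)
print(threshold, threshold // (1024 * 1024))  # 536870912 512
```

So a single region stops accepting writes once its memstore reaches 512 MB.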

     

 


----------------------------------------------------


 GC problems

  Died after 3.5 minutes

  11 minutes

 (70) Start GC earlier so that each GC takes less time

 In hbase-env.sh:

 export HBASE_REGIONSERVER_OPTS="-Xmx16g -Xms16g -Xmn128m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"
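A back-of-the-envelope view of what the 70 means with these flags. This sketch assumes the old generation is roughly the heap minus the young generation (-Xmx minus -Xmn) and that CMSInitiatingOccupancyFraction is a percentage of old-generation occupancy; the helper name `cms_trigger_mb` is hypothetical, and real JVM sizing differs slightly.

```python
def cms_trigger_mb(heap_mb: int, young_mb: int, occupancy_pct: int) -> float:
    """Approximate old-gen occupancy (MB) at which a CMS cycle starts.

    Assumption: old generation ~= heap - young generation.
    """
    old_gen_mb = heap_mb - young_mb
    return old_gen_mb * occupancy_pct / 100


# -Xmx16g -Xmn128m -XX:CMSInitiatingOccupancyFraction=70
print(cms_trigger_mb(16 * 1024, 128, 70))  # 11379.2
```

Under these assumptions CMS starts collecting at roughly 11 GB of old-generation usage, leaving about 30% headroom while the concurrent cycle runs.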


 ----------------------------------------------------


 Is compaction taking too long?

 GC pauses are too long during compaction

2015-02-25 11:54:50,670 WARN  [regionserver60020.periodicFlusher] util.Sleeper: We slept 565427ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-02-25 11:54:50,670 WARN  [DataStreamer for file /hbase/WALs/host64,60020,1422073827259/host64%2C60020%2C1422073827259.1424835178059 block BP-1540478979-192.168.5.117-1409220943611:blk_1097821214_24084365] hdfs.DFSClient: Error Recovery for block BP-1540478979-192.168.5.117-1409220943611:blk_1097821214_24084365 in pipeline 192.168.5.64:50010, 192.168.5.95:50010: bad datanode 192.168.5.64:50010

2015-02-25 11:54:50,670 WARN  [regionserver60020.compactionChecker] util.Sleeper: We slept 565427ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-02-25 11:54:50,670 INFO  [regionserver60020-SendThread(host141:42181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 577669ms for sessionid 0x44add78c8664fdb, closing socket connection and attempting reconnect
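To pull the pause length out of log lines like these, a small sketch such as the following can help. The regular expression is written against the "We slept ... instead of ..." wording shown in the log above; the function name `overshoot_seconds` is made up for illustration.

```python
import re
from typing import Optional

# Matches the util.Sleeper warning text as it appears in the log above.
SLEEPER_RE = re.compile(r"We slept (\d+)ms instead of (\d+)ms")


def overshoot_seconds(line: str) -> Optional[float]:
    """Return how many seconds longer than planned the thread slept.

    Returns None when the line is not a Sleeper warning.
    """
    match = SLEEPER_RE.search(line)
    if match is None:
        return None
    slept_ms, expected_ms = int(match.group(1)), int(match.group(2))
    return (slept_ms - expected_ms) / 1000.0


line = ("2015-02-25 11:54:50,670 WARN  [regionserver60020.periodicFlusher] "
        "util.Sleeper: We slept 565427ms instead of 10000ms, this is likely "
        "due to a long garbage collecting pause")
pause = overshoot_seconds(line)
print(round(pause / 60, 1))  # 9.3 (minutes)
```

For the line above that is an overshoot of more than nine minutes, far beyond any reasonable ZooKeeper session timeout.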


 Switched to manual compaction; major compactions now have to be run by hand, step by step

 

hbase.hregion.majorcompaction

0

 


------------------------------------------------------


Number of handlers in the region server: 50

hbase.regionserver.handler.count

50

hbase-site.xml


--------------------------------------------------


WAL size, which affects memstore flushes

Currently hbase.regionserver.hlog.blocksize * hbase.regionserver.maxlogs   128 MB * 32 = 4 GB

But hbase.regionserver.global.memstore.lowerLimit * HBASE_HEAPSIZE   0.38 * 32 GB = 12.16 GB

    hbase.regionserver.global.memstore.upperLimit * HBASE_HEAPSIZE   0.4 * 32 GB = 12.8 GB

    Note: make sure hbase.regionserver.hlog.blocksize * hbase.regionserver.maxlogs is only slightly higher than hbase.regionserver.global.memstore.lowerLimit * HBASE_HEAPSIZE.

Changed to

hbase.regionserver.maxlogs

104


hbase.regionserver.global.memstore.lowerLimit

0.36



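128 MB * 104 = 13 GB

0.36 * 32 GB = 11.52 GB

0.4 * 32 GB = 12.8 GB

The principle is to keep memstore flushes from blocking and to avoid flushes triggered by the WAL. A WAL-triggered flush flushes many regions at once and blocks, which is the worst case.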
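The comparison behind this change can be written down as a small sketch. It assumes the 128 MB WAL block size and 32 GB heap used in these notes; the helper names are hypothetical and nothing here calls HBase.

```python
def wal_capacity_gb(block_size_mb: int, max_logs: int) -> float:
    """Total WAL size (GB) before log count forces memstore flushes."""
    return block_size_mb * max_logs / 1024


def memstore_limit_gb(fraction: float, heap_gb: int) -> float:
    """Global memstore limit (GB) for a given heap fraction."""
    return fraction * heap_gb


def wal_triggers_first(wal_gb: float, lower_limit_gb: float) -> bool:
    """True if the WAL limit is hit before the memstore lower limit."""
    return wal_gb < lower_limit_gb


# Before: maxlogs=32, lowerLimit=0.38 -> the WAL limit is reached first.
before = wal_triggers_first(wal_capacity_gb(128, 32), memstore_limit_gb(0.38, 32))
# After: maxlogs=104, lowerLimit=0.36 -> the memstore limit is reached first.
after = wal_triggers_first(wal_capacity_gb(128, 104), memstore_limit_gb(0.36, 32))
print(before, after)  # True False
```

With the new values the WAL capacity (13 GB) sits just above both memstore limits, so flushes are driven by memstore size rather than by the number of WAL files.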


---------------------------------------------------


The block cache is the memory used for reads


hfile.block.cache.size

0.4
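A rough check of how this fraction lines up with the rest of the notes, assuming the same 32 GB heap used in the WAL calculation. The helper name `heap_share_gb` is made up; the combined figure is only arithmetic on the two fractions in this document.

```python
def heap_share_gb(fraction: float, heap_gb: int) -> float:
    """Size in GB of a heap fraction."""
    return fraction * heap_gb


block_cache_gb = heap_share_gb(0.4, 32)   # hfile.block.cache.size
memstore_upper_gb = heap_share_gb(0.4, 32)  # global.memstore.upperLimit
combined_fraction = 0.4 + 0.4

print(block_cache_gb, memstore_upper_gb, combined_fraction)  # 12.8 12.8 0.8
```

The 12.8 GB figure is close to the max=12.79 GB reported by the LruBlockCache log line earlier in these notes. Together, block cache and memstores may claim up to 80% of the heap, which leaves little room for everything else.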


----------------------------------------------------


Timeouts still need to be verified; the values set here may be too long

       hbase.rowlock.wait.duration

       90000

        

        Timeout for each attempt to acquire a row lock; the default is 30 s

        

hbase.regionserver.lease.period

180000

 

Lease time for each client socket connection to the region server

 


       hbase.rpc.timeout

       180000

 

RPC timeout

 


       hbase.client.scanner.timeout.period

       180000

 

Client-side timeout for each scan or get

 


        hbase.client.scanner.caching

        100

 

Number of rows the client fetches per next() call during a scan; the default is 1
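Why this setting matters can be shown with a simple estimate. This is a sketch, not HBase client code: it assumes one round trip per batch of rows and ignores size limits and the final end-of-scan call; `scan_round_trips` and the 10000-row figure are made up for illustration.

```python
def scan_round_trips(total_rows: int, caching: int) -> int:
    """Estimated number of next() round trips needed to read total_rows."""
    if caching <= 0:
        raise ValueError("caching must be positive")
    # Ceiling division without floating point.
    return -(-total_rows // caching)


print(scan_round_trips(10000, 1))    # 10000
print(scan_round_trips(10000, 100))  # 100
```

Going from 1 to 100 rows per call cuts the round trips for the same scan by a factor of 100, at the cost of more memory per call on both client and server.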