RegionServer keeps creating new, empty hlogs

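        We ran into a strange problem on our live 0.90.2 cluster. A regionserver kept creating new, empty hlogs until there were about 1,000,000 of them; the test cluster's HDFS could not take the load and went down. Things only returned to normal after we disabled and re-enabled the affected table. Judging by the symptoms it was triggered by write operations, but why it happens still needed to be tracked down.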

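        The cause has been found. Before writing to a new hlog, HBase checks the replica count of the file it just created. If HDFS holds fewer than 3 replicas of that file (the number is configurable), HBase treats the hlog as failed and retries by creating a new one. The new file has the same problem, so HBase keeps creating empty hlogs. The relevant code follows: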

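  /**
   * Checks whether the current hlog has fewer HDFS replicas than expected
   * and, if so, requests a log roll.
   *
   * @return true if log roll requested
   */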
  private boolean checkLowReplication() {
    // if the number of replicas in HDFS has fallen below the initial
    // value, then roll logs.
    try {
      int numCurrentReplicas = getLogReplication();
      if (numCurrentReplicas != 0 &&
          numCurrentReplicas < this.initialReplication) {
        LOG.warn("HDFS pipeline error detected. " +
            "Found " + numCurrentReplicas + " replicas but expecting " +
            this.initialReplication + " replicas. " +
            " Requesting close of hlog.");
        requestLogRoll();
        logRollRequested = true;
        return true;
      }
    } catch (Exception e) {
      LOG.warn("Unable to invoke DFSOutputStream.getNumCurrentReplicas" + e +
          " still proceeding ahead...");
    }
    return false;
  }
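
To make the loop easier to see, here is a small stand-alone simulation of the check above. It is only a sketch, not HBase code: `FakeLog`, `needsRoll`, `countRolls` and `LowReplicationDemo` are hypothetical names, and it assumes that every newly created hlog gets at most as many replicas as there are live datanodes. Under that assumption, a cluster with fewer datanodes than the expected replica count rolls the log on every check, and each roll leaves an empty file behind.

```java
class LowReplicationDemo {

    /** Hypothetical stand-in for an hlog file; not an HBase class. */
    static final class FakeLog {
        final int replicas;

        FakeLog(int replicas) {
            this.replicas = replicas;
        }
    }

    /** Same condition as in checkLowReplication() above. */
    static boolean needsRoll(int numCurrentReplicas, int initialReplication) {
        return numCurrentReplicas != 0
            && numCurrentReplicas < initialReplication;
    }

    /** Assumption: a new file never gets more replicas than live datanodes. */
    static FakeLog createLog(int expectedReplicas, int liveDataNodes) {
        return new FakeLog(Math.min(expectedReplicas, liveDataNodes));
    }

    /**
     * Runs the replication check once per simulated write and returns
     * how many times the log was rolled.
     */
    static int countRolls(int expectedReplicas, int liveDataNodes, int writes) {
        int rolls = 0;
        FakeLog current = createLog(expectedReplicas, liveDataNodes);
        for (int i = 0; i < writes; i++) {
            if (needsRoll(current.replicas, expectedReplicas)) {
                // The replacement log lands in the same short-handed
                // cluster, so the next check fails in exactly the same way.
                current = createLog(expectedReplicas, liveDataNodes);
                rolls++;
            }
        }
        return rolls;
    }

    public static void main(String[] args) {
        // 3 replicas expected, 3 datanodes alive: no rolls.
        System.out.println("healthy:  " + countRolls(3, 3, 1000));
        // 3 replicas expected, only 2 datanodes alive: a roll on every write.
        System.out.println("degraded: " + countRolls(3, 2, 1000));
    }
}
```

In this model the roll itself can never fix the condition that triggered it, which matches the behaviour we saw: the number of empty hlogs grows with the number of writes until something outside the loop changes.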
