Let's start with writing data.
The put method ultimately calls HTable's doPut method:
    private void doPut(final List<Put> puts) throws IOException {
      int n = 0;
      for (Put put : puts) {
        validatePut(put);
        writeBuffer.add(put);
        currentWriteBufferSize += put.heapSize();

        // we need to periodically see if the writebuffer is full instead of
        // waiting until the end of the List
        n++;
        if (n % DOPUT_WB_CHECK == 0 && currentWriteBufferSize > writeBufferSize) {
          flushCommits();
        }
      }
      if (autoFlush || currentWriteBufferSize > writeBufferSize) {
        flushCommits();
      }
    }
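It is easy to see that the HBase client flushes under two conditions: either currentWriteBufferSize exceeds the configured writeBufferSize, or autoFlush is enabled.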
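The two triggers can be tried out without an HBase cluster. The following is a minimal sketch, not the HBase class: WriteBufferSketch, its fields, and the CHECK_INTERVAL value are hypothetical stand-ins, and byte arrays play the role of Put objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for HTable's client-side write buffer. */
final class WriteBufferSketch {
    // Plays the role of DOPUT_WB_CHECK; the value here is an assumption.
    static final int CHECK_INTERVAL = 10;

    final List<byte[]> buffer = new ArrayList<>();
    final boolean autoFlush;
    final long maxBytes;     // plays the role of writeBufferSize
    long currentBytes = 0;   // plays the role of currentWriteBufferSize
    int flushCount = 0;      // only here so the behaviour can be observed

    WriteBufferSketch(boolean autoFlush, long maxBytes) {
        this.autoFlush = autoFlush;
        this.maxBytes = maxBytes;
    }

    void put(List<byte[]> records) {
        int n = 0;
        for (byte[] record : records) {
            buffer.add(record);
            currentBytes += record.length;
            n++;
            // Periodic check, so a very long list cannot grow the buffer unbounded.
            if (n % CHECK_INTERVAL == 0 && currentBytes > maxBytes) {
                flush();
            }
        }
        // Final check: flush when autoFlush is on or the buffer is over its limit.
        if (autoFlush || currentBytes > maxBytes) {
            flush();
        }
    }

    void flush() {
        // A real client would send the buffered records to the servers here.
        buffer.clear();
        currentBytes = 0;
        flushCount++;
    }
}
```

With autoFlush on, every call to put ends in a flush. With it off, records pile up until the byte limit is crossed, which is why turning autoFlush off is the usual way to batch writes.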
Next, let's look at the flushCommits method:
    public void flushCommits() throws IOException {
      try {
        Object[] results = new Object[writeBuffer.size()];
        try {
          this.connection.processBatch(writeBuffer, tableName, pool, results);
        } catch (InterruptedException e) {
          throw new IOException(e);
        } finally {
          // mutate list so that it is empty for complete success, or contains
          // only failed records results are returned in the same order as the
          // requests in list walk the list backwards, so we can remove from list
          // without impacting the indexes of earlier members
          for (int i = results.length - 1; i>=0; i--) {
            if (results[i] instanceof Result) {
              // successful Puts are removed from the list here.
              writeBuffer.remove(i);
            }
          }
        }
      } finally {
        if (clearBufferOnFail) {
          writeBuffer.clear();
          currentWriteBufferSize = 0;
        } else {
          // the write buffer was adjusted by processBatchOfPuts
          currentWriteBufferSize = 0;
          for (Put aPut : writeBuffer) {
            currentWriteBufferSize += aPut.heapSize();
          }
        }
      }
    }
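The actual submission is done in connection.processBatch. That method is handed an Object array, results, in which it stores the outcome of each request.
Afterwards the array is checked: if results[i] is a Result, meaning that Put was submitted successfully, the corresponding entry is removed from writeBuffer (a list holding the data waiting to be submitted). Entries that failed stay in the list.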
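The reason for walking the list backwards is that removing index i only shifts the entries after i, so the indexes still to be visited stay valid. A minimal sketch of that idea follows; BackwardRemovalSketch and dropSucceeded are hypothetical names, and "success" is simplified here to "non-null and not a Throwable" instead of the instanceof Result check in HBase.

```java
import java.util.List;

/** Hypothetical helper illustrating index-safe removal of succeeded entries. */
final class BackwardRemovalSketch {
    /**
     * Removes from buffer every entry whose slot in results marks a success.
     * results[i] describes buffer.get(i); both must have the same length.
     */
    static <T> void dropSucceeded(List<T> buffer, Object[] results) {
        // Walk from the end: removing index i never moves entries below i.
        for (int i = results.length - 1; i >= 0; i--) {
            boolean succeeded = results[i] != null && !(results[i] instanceof Throwable);
            if (succeeded) {
                buffer.remove(i);
            }
        }
    }
}
```

Walking forwards instead would skip or misplace entries, because every removal shifts the remaining items one position to the left while the results array keeps its original indexes.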
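Pay attention to the last finally block. If clearBufferOnFail is true, the buffer is cleared even when some Puts failed, so the failed data is not submitted again. clearBufferOnFail defaults to true.
It is set in setAutoFlush: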
    public void setAutoFlush(boolean autoFlush, boolean clearBufferOnFail) {
      this.autoFlush = autoFlush;
      this.clearBufferOnFail = autoFlush || clearBufferOnFail;
    }

    public void setAutoFlush(boolean autoFlush) {
      setAutoFlush(autoFlush, autoFlush);
    }
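How the two flags combine is easy to misread, so here is a tiny sketch that mirrors just the flag logic of the two setters above. AutoFlushFlagsSketch is a hypothetical class, and the initial values of true are an assumption based on the default mentioned in the text.

```java
/** Hypothetical holder that reproduces only the flag logic of setAutoFlush. */
final class AutoFlushFlagsSketch {
    boolean autoFlush = true;          // assumed default
    boolean clearBufferOnFail = true;  // assumed default

    void setAutoFlush(boolean autoFlush, boolean clearBufferOnFail) {
        this.autoFlush = autoFlush;
        // With autoFlush on, the buffer is always cleared, whatever was requested.
        this.clearBufferOnFail = autoFlush || clearBufferOnFail;
    }

    void setAutoFlush(boolean autoFlush) {
        // The one-argument form reuses the same value for both flags.
        setAutoFlush(autoFlush, autoFlush);
    }
}
```

Two consequences follow from this logic: setAutoFlush(false) also turns clearBufferOnFail off, so failed Puts stay in the buffer for a later flush, and setAutoFlush(true, false) still leaves clearBufferOnFail on.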
Now for the harder part, the connection.processBatch method.
The connection is instantiated in the HConnectionManager class, and the actual submission happens in its processBatchCallback method.
This method is fairly long...
    public <R> void processBatchCallback(
        List<? extends Row> list,
        byte[] tableName,
        ExecutorService pool,
        Object[] results,
        Batch.Callback<R> callback)
    throws IOException, InterruptedException {

      // results must be the same size as list
      if (results.length != list.size()) {
        throw new IllegalArgumentException(
            "argument results must be the same size as argument list");
      }
      if (list.isEmpty()) {
        return;
      }

      // Keep track of the most recent servers for any given item for better
      // exceptional reporting. We keep HRegionLocation to save on parsing.
      // Later below when we use lastServers, we'll pull what we need from
      // lastServers.
      HRegionLocation [] lastServers = new HRegionLocation[results.length];
      List<Row> workingList = new ArrayList<Row>(list);
      boolean retry = true;
      // count that helps presize actions array
      int actionCount = 0;
      Throwable singleRowCause = null;

      for (int tries = 0; tries < numRetries && retry; ++tries) {

        // sleep first, if this is a retry
        if (tries >= 1) {
          long sleepTime = getPauseTime(tries);
          LOG.debug("Retry " +tries+ ", sleep for " +sleepTime+ "ms!");
          Thread.sleep(sleepTime);
        }

        // step 1: break up into regionserver-sized chunks and build the data structs
        Map<HRegionLocation, MultiAction<R>> actionsByServer =
            new HashMap<HRegionLocation, MultiAction<R>>();
        for (int i = 0; i < workingList.size(); i++) {
          Row row = workingList.get(i);
          if (row != null) {
            HRegionLocation loc = locateRegion(tableName, row.getRow(), true);
            byte[] regionName = loc.getRegionInfo().getRegionName();

            MultiAction<R> actions = actionsByServer.get(loc);
            if (actions == null) {
              actions = new MultiAction<R>();
              actionsByServer.put(loc, actions);
            }

            Action<R> action = new Action<R>(row, i);
            lastServers[i] = loc;
            actions.add(regionName, action);
          }
        }

        // step 2: make the requests
        Map<HRegionLocation, Future<MultiResponse>> futures =
            new HashMap<HRegionLocation, Future<MultiResponse>>(
                actionsByServer.size());

        for (Entry<HRegionLocation, MultiAction<R>> e : actionsByServer.entrySet()) {
          futures.put(e.getKey(),
              pool.submit(createCallable(e.getKey(), e.getValue(), tableName)));
        }

        // step 3: collect the failures and successes and prepare for retry
        for (Entry<HRegionLocation, Future<MultiResponse>> responsePerServer
            : futures.entrySet()) {
          HRegionLocation loc = responsePerServer.getKey();

          try {
            Future<MultiResponse> future = responsePerServer.getValue();
            MultiResponse resp = future.get();

            if (resp == null) {
              // Entire server failed
              LOG.debug("Failed all for server: " + loc.getHostnamePort() +
                  ", removing from cache");
              continue;
            }

            for (Entry<byte[], List<Pair<Integer,Object>>> e
                : resp.getResults().entrySet()) {
              byte[] regionName = e.getKey();
              List<Pair<Integer, Object>> regionResults = e.getValue();
              for (Pair<Integer, Object> regionResult : regionResults) {
                if (regionResult == null) {
                  // if the first/only record is 'null' the entire region failed.
                  LOG.debug("Failures for region: " +
                      Bytes.toStringBinary(regionName) +
                      ", removing from cache");
                } else {
                  // Result might be an Exception, including DNRIOE
                  results[regionResult.getFirst()] = regionResult.getSecond();
                  if (callback != null &&
                      !(regionResult.getSecond() instanceof Throwable)) {
                    callback.update(e.getKey(),
                        list.get(regionResult.getFirst()).getRow(),
                        (R)regionResult.getSecond());
                  }
                }
              }
            }
          } catch (ExecutionException e) {
            LOG.warn("Failed all from " + loc, e);
          }
        }

        // step 4: identify failures and prep for a retry (if applicable).

        // Find failures (i.e. null Result), and add them to the workingList (in
        // order), so they can be retried.
        retry = false;
        workingList.clear();
        actionCount = 0;
        for (int i = 0; i < results.length; i++) {
          // if null (fail) or instanceof Throwable && not instanceof DNRIOE
          // then retry that row. else dont.
          if (results[i] == null ||
              (results[i] instanceof Throwable &&
                  !(results[i] instanceof DoNotRetryIOException))) {

            retry = true;
            actionCount++;
            Row row = list.get(i);
            workingList.add(row);
            deleteCachedLocation(tableName, row.getRow());
          } else {
            if (results[i] != null && results[i] instanceof Throwable) {
              actionCount++;
            }
            // add null to workingList, so the order remains consistent with the
            // original list argument.
            workingList.add(null);
          }
        }
      }

      if (retry) {
        // Simple little check for 1 item failures.
        if (singleRowCause != null) {
          throw new IOException(singleRowCause);
        }
      }

      List<Throwable> exceptions = new ArrayList<Throwable>(actionCount);
      List<Row> actions = new ArrayList<Row>(actionCount);
      List<String> addresses = new ArrayList<String>(actionCount);

      for (int i = 0 ; i < results.length; i++) {
        if (results[i] == null || results[i] instanceof Throwable) {
          exceptions.add((Throwable)results[i]);
          actions.add(list.get(i));
          addresses.add(lastServers[i].getHostnamePort());
        }
      }

      if (!exceptions.isEmpty()) {
        throw new RetriesExhaustedWithDetailsException(exceptions,
            actions,
            addresses);
      }
    }
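Thankfully the comments help... The work is split into four main steps.
Step 1
Regroup the data that needs to be submitted.
It gets turned into this: Map<HRegionLocation, MultiAction<R>>. That looks rather awkward (sigh...). Put more plainly, it is Map<server info (host:port), Map<regionName, actions (each one a single Put operation)>>.
In other words, the Puts are first grouped by server, and within one server they are grouped again by regionName.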
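The two-level grouping can be sketched with plain collections. Everything in this block is hypothetical: GroupByServerSketch, the Location record, and especially locate, which fakes the region lookup by the first character of the row key, and the host:port strings are made up. Only the shape of the result mirrors step 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of grouping rows by server, then by region. */
final class GroupByServerSketch {
    /** Made-up location: which server hosts which region. */
    static final class Location {
        final String hostPort;
        final String regionName;
        Location(String hostPort, String regionName) {
            this.hostPort = hostPort;
            this.regionName = regionName;
        }
    }

    /** Fake lookup standing in for locateRegion; the key ranges are invented. */
    static Location locate(String rowKey) {
        char c = rowKey.isEmpty() ? 0 : rowKey.charAt(0);
        if (c < 'h') return new Location("host1:60020", "region-a");
        if (c < 'p') return new Location("host1:60020", "region-b");
        return new Location("host2:60020", "region-c");
    }

    /** Returns server -> region -> indexes of the rows in the original list. */
    static Map<String, Map<String, List<Integer>>> group(List<String> rowKeys) {
        Map<String, Map<String, List<Integer>>> byServer = new HashMap<>();
        for (int i = 0; i < rowKeys.size(); i++) {
            String row = rowKeys.get(i);
            if (row == null) {
                continue; // null marks a row that needs no (further) work
            }
            Location loc = locate(row);
            byServer.computeIfAbsent(loc.hostPort, k -> new HashMap<>())
                    .computeIfAbsent(loc.regionName, k -> new ArrayList<>())
                    .add(i); // keep the original index so results can be mapped back
        }
        return byServer;
    }
}
```

Keeping the original index with each action is what later allows every response to be written back into the right slot of the results array.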
Step 2
Actually submit the requests.
What comes back is a Map<HRegionLocation, Future<MultiResponse>>, again keyed by server...
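The pattern in step 2 is one task per server, submitted to a thread pool, with the Future kept under the server's key. A minimal sketch using only java.util.concurrent follows; PerServerSubmitSketch and its methods are hypothetical, and the "RPC" is faked by a task that just reports how many rows it was given.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical illustration of fanning out one request per server. */
final class PerServerSubmitSketch {
    /** Submits one task per server and returns the futures keyed by server. */
    static Map<String, Future<Integer>> submitAll(
            Map<String, List<String>> rowsByServer, ExecutorService pool) {
        Map<String, Future<Integer>> futures = new HashMap<>(rowsByServer.size());
        for (Map.Entry<String, List<String>> e : rowsByServer.entrySet()) {
            final List<String> rows = e.getValue();
            // Stand-in for the remote call: pretend every row was written.
            Callable<Integer> call = () -> rows.size();
            futures.put(e.getKey(), pool.submit(call));
        }
        return futures;
    }

    /** Waits for one future; null means the whole request to that server failed. */
    static Integer await(Future<Integer> future) {
        try {
            return future.get();
        } catch (ExecutionException e) {
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

Because all tasks are submitted before any result is awaited, the requests to different servers run in parallel, and the slowest server bounds the round instead of the sum of all of them.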
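Step 3
Collect the successful and failed results, and prepare for a retry.
Reading the code closely, the order is the same as before: iterate over each region server, then within each region server fetch the return values per regionName, and put every return value into the results array at the index of the original request.
For each result that is not a Throwable, the Batch.Callback (used by coprocessors) is invoked, if one was passed in.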
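Writing the responses back by original index can be sketched as below. ScatterResultsSketch and IndexedResult are hypothetical; IndexedResult plays the role of the Pair<Integer, Object> entries in the code above, and a null entry stands for a region that failed as a whole.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of scattering per-region responses into one array. */
final class ScatterResultsSketch {
    /** Pairs a response value with the index of the request that produced it. */
    static final class IndexedResult {
        final int originalIndex;
        final Object value; // a result object, or a Throwable for a failed row
        IndexedResult(int originalIndex, Object value) {
            this.originalIndex = originalIndex;
            this.value = value;
        }
    }

    /** Copies every response into results at the position of its original request. */
    static void scatter(Map<String, List<IndexedResult>> responsesByRegion, Object[] results) {
        for (List<IndexedResult> regionResults : responsesByRegion.values()) {
            for (IndexedResult r : regionResults) {
                if (r == null) {
                    continue; // whole region failed: its slots stay null and get retried
                }
                results[r.originalIndex] = r.value;
            }
        }
    }
}
```

Slots that are still null after this pass are exactly the rows for which no answer arrived, which is what the next step uses to decide what to retry.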
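Step 4
Identify the failures and prepare for a retry.
If results[i] is null, or results[i] instanceof Throwable && !(results[i] instanceof DoNotRetryIOException), the row is put back into workingList. Every other position gets a null, so the order stays consistent with the original list.
Finally the outer loop runs again, until the retry limit (numRetries) is reached.
If there are still errors after that, the submission throws a RetriesExhaustedWithDetailsException up to the caller.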
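The retry bookkeeping is the subtle part, so here is a minimal sketch of it under simplifying assumptions: rows are processed one by one instead of per server, the sleep between rounds is left out, and RetryLoopSketch, DoNotRetry and the attempt function are hypothetical (DoNotRetry plays the role of DoNotRetryIOException).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of the retry loop around a batch of rows. */
final class RetryLoopSketch {
    /** Marker for failures that must not be retried. */
    static final class DoNotRetry extends RuntimeException {
        DoNotRetry(String message) {
            super(message);
        }
    }

    /**
     * Runs attempt on every row, retrying failed rows for up to maxTries rounds.
     * attempt returns a result object, a Throwable, or null for "no answer".
     */
    static Object[] process(List<String> rows, int maxTries, Function<String, Object> attempt) {
        Object[] results = new Object[rows.size()];
        List<String> working = new ArrayList<>(rows);
        boolean retry = true;
        for (int tries = 0; tries < maxTries && retry; tries++) {
            // Only non-null entries still need work in this round.
            for (int i = 0; i < working.size(); i++) {
                String row = working.get(i);
                if (row != null) {
                    results[i] = attempt.apply(row);
                }
            }
            // Rebuild the working list: failed rows again, null everywhere else,
            // so that position i always refers to rows.get(i).
            retry = false;
            working.clear();
            for (int i = 0; i < results.length; i++) {
                boolean retriable = results[i] == null
                        || (results[i] instanceof Throwable && !(results[i] instanceof DoNotRetry));
                if (retriable) {
                    retry = true;
                    working.add(rows.get(i));
                } else {
                    working.add(null);
                }
            }
        }
        return results;
    }
}
```

After the loop, any slot that is still null or holds a Throwable is a final failure; in the HBase code those are collected and reported together in one exception.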
That's as far as we'll read for today...
To be continued...