just recovered from a disease,i should finish the retain part of work now...
yes,as u can see,the Scan/Get oper from hbase is some more tricky,as the kvs multi-dimensions and related to storefile and memstore.
----Part 1:Abstract
first ,have a glance at the query flow below:
there are two main loops in a Scan oper:
1. iterate rowkeys -controls how to filter per row kvs and limit the number of rows returned to client.
2.iterate qualifiers of the same row-determine which qulifiers to be matched and how many verions of a same qualifier.
second,here is a Heap Tree Search Model of hbase
there are three level scanners in hbase :
1.region scanner-use to combine all kvs from store scanners
2. store scanner-integrate all underlying storefile scanners and one memstore scanner
3.storefile /memstore scanner-the leafe scanners in hbase,these are the data sources
every child lifts the min references to parent,for the later ,will contains a heap called KeyValueHeap which implements "heap merge",that is choosed a scanner whcih has the min kv as current sacnner.the heap also has a in-built "PriorityQueue" to achieve the min scanner.
----Part 2:some utilities classes
class | usage | |
RegionScannerImpl | implement of region scanner | |
ScanQueryMatcher | whether a qualifier is to filter out or remain or see to next row/col; and columns amount checking ,number of verions... |
|
ColumnTracker | tracks the expected qualifiers and versions,used with above | |
HFileReaderV2 | how to read a kv from a hfile or a data block | |
MemStore | the writing cache of a store |
----Part 3:Implementions
below loop is corresponding the first loop above:
public Result[] next(final long scannerId, int nbRows) throws IOException { ... //2 -- FIRST LOOP:rowKeys for (int i = 0; i < nbRows //limit by client size && currentScanResultSize < maxScannerResultSize; i++) { //limit by return byte size requestCount.incrementAndGet(); // Collect values to be returned here boolean moreRows = s.next(values, SchemaMetrics.METRIC_NEXTSIZE); //-go into RegionScannerImpl if (!values.isEmpty()) { for (KeyValue kv : values) { currentScanResultSize += kv.heapSize(); } results.add(new Result(values)); } if (!moreRows) { break; } values.clear(); } .... }
second loop:
/**依据scannerid记录的历史scan偏移,取出limit(实际是batch) fields/cols of a row;此方法是对底层(memstore,hfiles)scan操作的包装类 * @return true if exists more rows(only use for real Scan to stop remain iterations) */ private boolean nextInternal(int limit, String metric) throws IOException { RpcCallContext rpcCall = HBaseServer.getCurrentCall(); while (true) { //outer loop:skip unmatched rows if (rpcCall != null) { // If a user specifies a too-restrictive or too-slow scanner, the // client might time out and disconnect while the server side // is still processing the request. We should abort aggressively/竭力地 // in that case. rpcCall.throwExceptionIfCallerDisconnected(); } //--NOTE this is rowKey only instead of composite key(used in heap to switch scanners) to compare! byte [] currentRow = peekRow(); //-so the second loop to will need to compare qualifies in fact if (isStopRow(currentRow)) { //--NOTE case in providing a stop row,this will avoid iterating the remain rows if (filter != null && filter.hasFilterRow()) { filter.filterRow(results); } if (filter != null && filter.filterRow()) { //filterRow:true to exclude row results.clear(); } return false; } else if (filterRowKey(currentRow)) { //-NOTE ignore needless row,eg. PrefixFilter if currentRow is less then the prefix nextRow(currentRow); } else { byte [] nextRow; do {//-1.how to locate the expected row,2 how to retrieve all kvs about this row? see StoreScanner#next(...) @A this.storeHeap.next(results, limit - results.size(), metric); if (limit > 0 && results.size() == limit) { if (this.filter != null && filter.hasFilterRow()) { throw new IncompatibleFilterException(//client processed also,@see Scan#setBatch() "Filter with filterRow(List<KeyValue>) incompatible with scan with limit!"); } return true; // we are expecting more yes, but also limited to how many we can return. } //-this loop is for:if one store scanner iterate completely,go to next one } while (Bytes.equals(currentRow, nextRow = peekRow())); //-note:use row key only to compare;retrieve same rowkey's fields(cols) final boolean stopRow = isStopRow(nextRow); //-ture for Get oper;but maybe false for real Scan // now that we have an entire row, lets process with a filters: // first filter with the filterRow(List) if (filter != null && filter.hasFilterRow()) { filter.filterRow(results); //--exclude or adjust the results before returning to clent for improving perf? } if (results.isEmpty() || filterRow()) { // this seems like a redundant step - we already consumed the row // there're no left overs. // the reasons for calling this method are: // 1. reset the filters. // 2. provide a hook to fast forward the row (used by subclasses) nextRow(currentRow); // This row was totally filtered out, if this is NOT the last row, // we should continue on.-so continue the next round seeking the expect row if (!stopRow) continue; //-not Get oper(ie. Scan),iterrate the next row } return !stopRow; //-if this is a stoprow(no more rows),return false }//else }//while }
KeyValueHeap#next(xxx)
/** * Gets the next row of keys from the top-most scanner.--will be invoked in loop for the same rowKey * <p> * This method takes care of updating the heap. * <p> * This can ONLY be called when you are using Scanners that implement * InternalScanner as well as KeyValueScanner (a {@link StoreScanner}). * @param result output result list * @param limit limit on row count to get * @param metric the metric name * @return true if there are more keys, false if all scanners are done */ public boolean next(List<KeyValue> result, int limit, String metric) throws IOException { if (this.current == null) { return false; } InternalScanner currentAsInternal = (InternalScanner)this.current; //-StoreScanner if invoked from region level; boolean mayContainMoreRows = currentAsInternal.next(result, limit, metric); KeyValue pee = this.current.peek(); //-whether this scanner exists more data?if false then destroy it by close() below /* * By definition, any InternalScanner must return false only when it has no * further rows to be fetched. So, we can close a scanner if it returns * false. All existing implementations seem to be fine with this. It is much * more efficient to close scanners which are not needed than keep them in * the heap. This is also required for certain optimizations.--this descriptions maybe focus older versions as below has closed */ if (pee == null || !mayContainMoreRows) { //-no more data this.current.close(); } else { this.heap.add(this.current); //-put back to priority queue , usage:switch StoreScanner to seek smallest kv?yes } this.current = pollRealKV(); //---then acquire the smallest-kv scanner in current again return (this.current != null); }
KeyValueHeap#next()---switch the min sanner per invoking
/** --return the next kv and update the current scanner if any * --if this scanner is the region level,this method *maybe* switch the current smallest kv among multi-memstores/storefiles. * example,same row with certain col updates: * ------------------------------------- * order | ts | ms1 || order | ts |ms2 * ------------------------------------- * 1 | 5 | q1 || 1 | 4 | q2 * 2 | 3 | q1 || 2 | 2 | q2 --switch MemStoreScanners/StoreFileScanners by ts as 3 < 4,so this time will use ms2 scanner as current */ public KeyValue next() throws IOException { if(this.current == null) { //maybe null if closed or some cases below pollReaKV() return null; } KeyValue kvReturn = this.current.next(); //-@A retrieve kv of current scanner;this position if not changed if it is from put back to heap KeyValue kvNext = this.current.peek(); //-prepare:also probe the next scanner for next time to invoke this method if (kvNext == null) { this.current.close(); this.current = pollRealKV(); } else { KeyValueScanner topScanner = this.heap.peek(); if (topScanner == null || //-in fact is needless here this.comparator.compare(kvNext, topScanner.peek()) >= 0) { //-NOTE multi-memstores/storefiles case:switch the smallest(latest) scanners by kv this.heap.add(this.current); //put back to heap .NOTE the next time to invoke next() will seek to next kv correctly,see @A this.current = pollRealKV(); //-then acquire the kv-least(newest) scanner } } return kvReturn; }
StoreScanner#next()
/**--second loop for retrieve * --most of methods here are synchronized,but note that this class is a instance per Get,so i think this will not decrease the perf * Get the next row of values from this Store.- how to guarantee all kvs are belong same row?see below @A * @param outResult * @param limit ---1 means all kvs to be returned * @return true if there are more rows, false if scanner is done */ @Override public synchronized boolean next(List<KeyValue> outResult, int limit, String metric) throws IOException { //--some preconditions every time invoke this method for the same/different row key-- if (checkReseek()) { return true; } // if the heap was left null, then the scanners had previously run out anyways, close and // return. if (this.heap == null) { close(); return false; } KeyValue peeked = this.heap.peek(); //-actually use this.current to do if (peeked == null) { close(); return false; } // only call setRow if the row changes; avoids confusing the query matcher--NOTE keep the resulting kvs all belong this row // if scanning intra-row/行内-that is same rowKey @A if ((matcher.row == null) || !peeked.matchingRow(matcher.row)) { matcher.setRow(peeked.getRow()); } KeyValue kv; KeyValue prevKV = null; // Only do a sanity-check if store and comparator are available. KeyValue.KVComparator comparator = store != null ? store.getComparator() : null; long cumulativeMetric = 0; int count = 0; try { //--NOTE SECOND LOOP:qualifies LOOP: while((kv = this.heap.peek()) != null) { //-iterate throughout current row's kvs // Check that the heap gives us KVs in an increasing order. assert prevKV == null || comparator == null || comparator.compare(prevKV, kv) <= 0 : "Key " + prevKV + " followed by a " + "smaller key " + kv + " in cf " + store; prevKV = kv; ScanQueryMatcher.MatchCode qcode = matcher.match(kv); switch(qcode) { case INCLUDE: //-all are backed to client below cases case INCLUDE_AND_SEEK_NEXT_ROW: case INCLUDE_AND_SEEK_NEXT_COL: Filter f = matcher.getFilter(); outResult.add(f == null ? kv : f.transform(kv)); //-maybe do a simplified conversion of kv,eg. KeyOnlyFilter use key only count++; if (qcode == ScanQueryMatcher.MatchCode.INCLUDE_AND_SEEK_NEXT_ROW) { if (!matcher.moreRowsMayExistAfter(kv)) { //--a simple,effect mean to return directly without doing a loop to check again return false; } reseek(matcher.getKeyForNextRow(kv)); //-construct a fake kv to quickly locate to the next row position, } else if (qcode == ScanQueryMatcher.MatchCode.INCLUDE_AND_SEEK_NEXT_COL) { reseek(matcher.getKeyForNextColumn(kv)); //---do it here is more effect than outer loop } else { //-case INCLUDE this.heap.next(); //--here will not miss the retunred kv as we will use peak() to use this kv before any next() } cumulativeMetric += kv.getLength(); if (limit > 0 && (count == limit)) { //-reach the limit,return break LOOP; //-same as break } continue; //-for include case DONE: return true; case DONE_SCAN: //-this scan is complete close(); return false; case SEEK_NEXT_ROW: //-a bit like appropriate include one; // This is just a relatively simple end of scan fix, to short-cut end // us if there is an endKey in the scan. if (!matcher.moreRowsMayExistAfter(kv)) { //--quick effect way to terminate this get/scan oper return false; } reseek(matcher.getKeyForNextRow(kv)); break; case SEEK_NEXT_COL: //-a bit like appropriate include one reseek(matcher.getKeyForNextColumn(kv)); break; case SKIP: //-return's value is checked by count(number of versions);ignore this col and seek to next one this.heap.next(); break; case SEEK_NEXT_USING_HINT: KeyValue nextKV = matcher.getNextKeyHint(kv); if (nextKV != null) { reseek(nextKV); } else { heap.next(); } break; default: throw new RuntimeException("UNEXPECTED"); }//switch }//while } finally { if (cumulativeMetric > 0 && metric != null) { RegionMetricsStorage.incrNumericMetric(this.metricNamePrefix + metric, cumulativeMetric); } } if (count > 0) { return true; } // No more keys close(); return false; }
----Part 4: FAQs
ref: