HBase Version: hbase-0.94.6-cdh4.3.0
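HBase Scan offers one optimization: scannerCaching and caching.
It controls how many rows the HBase client fetches from the HBase server in a single request. Fetching more rows per request reduces the number of round trips to the server; each request brings back scannerCaching (or caching) rows.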
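To get a feel for why this matters, here is a minimal sketch of the arithmetic. It is a simplified model, not HBase code: the class `ScanRoundTrips` and the method `roundTrips` are hypothetical names, and the model assumes every fetch returns exactly `caching` rows until the data runs out. It ignores region boundaries, the maximum result size limit, and the calls needed to open and close a scanner.

```java
// Simplified model (not HBase code): estimates how many fetches a scan needs.
final class ScanRoundTrips {

    /** Number of fetches needed to pull totalRows rows at caching rows per fetch. */
    static long roundTrips(long totalRows, int caching) {
        if (totalRows < 0 || caching <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and caching > 0");
        }
        // Ceiling division: a final partial batch still costs one fetch.
        return (totalRows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 100_000;
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.println("caching=" + caching + " -> "
                    + roundTrips(rows, caching) + " fetches");
        }
    }
}
```

Under this model, scanning 100,000 rows takes 100,000 fetches with caching=1 but only 100 with caching=1000. The trade-off is that the client holds more rows in memory per fetch.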
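Here scannerCaching is a property of HTable, while caching is a property of Scan.
The HTable source shows that both HTable methods, getScannerCaching() and setScannerCaching(int), are already deprecated.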
public class HTable implements HTableInterface {
  protected int scannerCaching;

  /**
   * Gets the number of rows that a scanner will fetch at once.
   *
   * The default value comes from {@code hbase.client.scanner.caching}.
   * @deprecated Use {@link Scan#setCaching(int)} and {@link Scan#getCaching()}
   */
  public int getScannerCaching() {
    return scannerCaching;
  }

  /**
   * Sets the number of rows that a scanner will fetch at once.
   *
   * This will override the value specified by
   * {@code hbase.client.scanner.caching}.
   * Increasing this value will reduce the amount of work needed each time
   * {@code next()} is called on a scanner, at the expense of memory use
   * (since more rows will need to be maintained in memory by the scanners).
   * @param scannerCaching the number of rows a scanner will fetch at once.
   * @deprecated Use {@link Scan#setCaching(int)}
   */
  public void setScannerCaching(int scannerCaching) {
    this.scannerCaching = scannerCaching;
  }
}
The Javadoc tells us to use Scan#setCaching(int) and Scan#getCaching() instead.
The HTable source also shows that a scan goes through getScanner(Scan), which returns a ResultScanner; the caller then processes the results from that ResultScanner.
/**
 * {@inheritDoc}
 */
@Override
public ResultScanner getScanner(final Scan scan) throws IOException {
  if (scan.getCaching() <= 0) {
    scan.setCaching(getScannerCaching());
  }
  return new ClientScanner(getConfiguration(), scan, getTableName(),
      this.connection);
}
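The code above shows that the scannerCaching set on the HTable is copied onto the Scan, but only when the Scan has no positive caching value of its own.
1. The ClientScanner code picks up the scannerCaching that HTable passed along on the Scan;
2. If the HTable's scannerCaching was not set either (scannerCaching = 0), the caching value in ClientScanner keeps its original value;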
/**
 * Create a new ClientScanner for the specified table
 * Note that the passed {@link Scan}'s start row maybe changed changed.
 *
 * @param conf The {@link Configuration} to use.
 * @param scan {@link Scan} to use in this scanner
 * @param tableName The table that we wish to scan
 * @param connection Connection identifying the cluster
 * @throws IOException
 */
public ClientScanner(final Configuration conf, final Scan scan,
    final byte[] tableName, HConnection connection) throws IOException {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Creating scanner over "
        + Bytes.toString(tableName)
        + " starting at key '" + Bytes.toStringBinary(scan.getStartRow()) + "'");
  }
  this.scan = scan;
  this.tableName = tableName;
  this.lastNext = System.currentTimeMillis();
  this.connection = connection;
  this.maxScannerResultSize = conf.getLong(
      HConstants.HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE_KEY,
      HConstants.DEFAULT_HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE);
  this.scannerTimeout = (int) conf.getLong(
      HConstants.HBASE_REGIONSERVER_LEASE_PERIOD_KEY,
      HConstants.DEFAULT_HBASE_REGIONSERVER_LEASE_PERIOD);

  // check if application wants to collect scan metrics
  byte[] enableMetrics = scan.getAttribute(
      Scan.SCAN_ATTRIBUTES_METRICS_ENABLE);
  if (enableMetrics != null && Bytes.toBoolean(enableMetrics)) {
    scanMetrics = new ScanMetrics();
  }

  // Use the caching from the Scan. If not set, use the default cache setting for this table.
  if (this.scan.getCaching() > 0) {
    this.caching = this.scan.getCaching();
  } else {
    this.caching = conf.getInt("hbase.client.scanner.caching", 1);
  }

  // initialize the scanner
  nextScanner(this.caching, false);
}
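3. Finally, if neither 1 nor 2 applies (neither the HTable nor the Scan has scannerCaching or caching set),
the trouble starts: the default hbase.client.scanner.caching=1 is used, so the scanner fetches one row at a time.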
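The three cases above can be summarized as a small precedence rule. The following is a minimal sketch that models the logic of the two quoted snippets, not the HBase classes themselves: `CachingResolution`, `effectiveCaching`, and the parameter names are hypothetical, and a null `configuredCaching` is assumed to stand for a configuration that does not define hbase.client.scanner.caching.

```java
// Models the caching precedence seen in HTable#getScanner and the
// ClientScanner constructor quoted above. Not HBase code; names are illustrative.
final class CachingResolution {

    /** Fallback used by conf.getInt("hbase.client.scanner.caching", 1). */
    static final int DEFAULT_CACHING = 1;

    /**
     * @param scanCaching         value from Scan#setCaching, or <= 0 if unset
     * @param tableScannerCaching value from HTable#setScannerCaching, or <= 0 if unset
     * @param configuredCaching   hbase.client.scanner.caching, or null if absent (assumption)
     * @return the number of rows fetched per request under this model
     */
    static int effectiveCaching(int scanCaching, int tableScannerCaching,
                                Integer configuredCaching) {
        // HTable#getScanner: copy the table-level value onto an unset Scan.
        if (scanCaching <= 0) {
            scanCaching = tableScannerCaching;
        }
        // ClientScanner constructor: a positive Scan value wins.
        if (scanCaching > 0) {
            return scanCaching;
        }
        // Otherwise fall back to the configuration, which defaults to 1.
        return configuredCaching != null ? configuredCaching : DEFAULT_CACHING;
    }

    public static void main(String[] args) {
        System.out.println(effectiveCaching(500, 100, null)); // Scan value wins
        System.out.println(effectiveCaching(0, 100, null));   // HTable value is copied
        System.out.println(effectiveCaching(0, 0, null));     // default of 1
    }
}
```

In practice this suggests calling Scan#setCaching(int) explicitly on every scan that reads more than a handful of rows, rather than relying on the fallback.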