This source-code walkthrough is based on HBase 0.20.6.
Let's start with the code of the get() method in the HTable class:
HTable.java
/**
 * Extracts certain cells from a given row.
 * @param get The object that specifies what data to fetch and from which row.
 * @return The data coming from the specified row, if it exists. If the row
 * specified doesn't exist, the {@link Result} instance returned won't
 * contain any {@link KeyValue}, as indicated by {@link Result#isEmpty()}.
 * @throws IOException if a remote or network exception occurs.
 * @since 0.20.0
 */
public Result get(final Get get) throws IOException {
  return connection.getRegionServerWithRetries(
      new ServerCallable<Result>(connection, tableName, get.getRow()) {
        public Result call() throws IOException {
          return server.get(location.getRegionInfo().getRegionName(), get);
        }
      }
  );
}
This code is a bit convoluted, but at least it tells us where to look next: the connection's getRegionServerWithRetries method. So what exactly is connection?
It is a field defined in HTable:
private final HConnection connection;
When is it instantiated? In the HTable constructor:
this.connection = HConnectionManager.getConnection(conf);
Here conf is an HBaseConfiguration object, passed in as a parameter of the HTable constructor. OK, let's move on to HConnectionManager to see where this connection comes from:
HConnectionManager.java
/**
 * Get the connection object for the instance specified by the configuration
 * If no current connection exists, create a new connection for that instance
 * @param conf
 * @return HConnection object for the instance specified by the configuration
 */
public static HConnection getConnection(HBaseConfiguration conf) {
  TableServers connection;
  synchronized (HBASE_INSTANCES) {
    connection = HBASE_INSTANCES.get(conf);
    if (connection == null) {
      connection = new TableServers(conf);
      HBASE_INSTANCES.put(conf, connection);
    }
  }
  return connection;
}
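The get-or-create logic in getConnection is a common caching pattern. Below is a minimal standalone sketch of it, not the HBase code itself: ConnectionCacheSketch, FakeConnection and the string key are hypothetical stand-ins for HConnectionManager, TableServers and HBaseConfiguration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the real HBase classes; only the caching pattern matters here.
class ConnectionCacheSketch {
  /** Plays the role of TableServers: one instance per configuration key. */
  static final class FakeConnection {
    final String confKey;
    FakeConnection(String confKey) { this.confKey = confKey; }
  }

  /** Plays the role of HBASE_INSTANCES: configuration key -> cached connection. */
  private static final Map<String, FakeConnection> INSTANCES = new HashMap<>();

  /** Returns the cached connection for confKey, creating it on first use. */
  static FakeConnection getConnection(String confKey) {
    synchronized (INSTANCES) {
      FakeConnection connection = INSTANCES.get(confKey);
      if (connection == null) {
        connection = new FakeConnection(confKey);
        INSTANCES.put(confKey, connection);
      }
      return connection;
    }
  }

  public static void main(String[] args) {
    FakeConnection a = getConnection("cluster-A");
    FakeConnection b = getConnection("cluster-A");
    FakeConnection c = getConnection("cluster-B");
    System.out.println(a == b); // true: same key, same cached instance
    System.out.println(a == c); // false: different key, different instance
  }
}
```

The lookup and the insertion sit inside one synchronized block, so two threads asking for the same key at the same time cannot both create a connection.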
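Now we know that each conf maps to one connection, concretely a TableServers object (which implements the HConnection interface), and that all connections are kept in a pool (HBASE_INSTANCES). So what is a connection actually for? Let's look at the definition of the HConnection interface.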
HConnection.java
/**
 * Cluster connection.
 * {@link HConnectionManager} manages instances of this class.
 */
public interface HConnection {
  /**
   * Retrieve ZooKeeperWrapper used by the connection.
   * @return ZooKeeperWrapper handle being used by the connection.
   * @throws IOException
   */
  public ZooKeeperWrapper getZooKeeperWrapper() throws IOException;

  /**
   * @return proxy connection to master server for this instance
   * @throws MasterNotRunningException
   */
  public HMasterInterface getMaster() throws MasterNotRunningException;

  /** @return - true if the master server is running */
  public boolean isMasterRunning();

  /**
   * Checks if <code>tableName</code> exists.
   * @param tableName Table to check.
   * @return True if table exists already.
   * @throws MasterNotRunningException
   */
  public boolean tableExists(final byte [] tableName)
  throws MasterNotRunningException;

  /**
   * A table that isTableEnabled == false and isTableDisabled == false
   * is possible. This happens when a table has a lot of regions
   * that must be processed.
   * @param tableName
   * @return true if the table is enabled, false otherwise
   * @throws IOException
   */
  public boolean isTableEnabled(byte[] tableName) throws IOException;

  /**
   * @param tableName
   * @return true if the table is disabled, false otherwise
   * @throws IOException
   */
  public boolean isTableDisabled(byte[] tableName) throws IOException;

  /**
   * @param tableName
   * @return true if all regions of the table are available, false otherwise
   * @throws IOException
   */
  public boolean isTableAvailable(byte[] tableName) throws IOException;

  /**
   * List all the userspace tables. In other words, scan the META table.
   *
   * If we wanted this to be really fast, we could implement a special
   * catalog table that just contains table names and their descriptors.
   * Right now, it only exists as part of the META table's region info.
   *
   * @return - returns an array of HTableDescriptors
   * @throws IOException
   */
  public HTableDescriptor[] listTables() throws IOException;

  /**
   * @param tableName
   * @return table metadata
   * @throws IOException
   */
  public HTableDescriptor getHTableDescriptor(byte[] tableName)
  throws IOException;

  /**
   * Find the location of the region of <i>tableName</i> that <i>row</i>
   * lives in.
   * @param tableName name of the table <i>row</i> is in
   * @param row row key you're trying to find the region of
   * @return HRegionLocation that describes where to find the reigon in
   * question
   * @throws IOException
   */
  public HRegionLocation locateRegion(final byte [] tableName,
      final byte [] row)
  throws IOException;

  /**
   * Find the location of the region of <i>tableName</i> that <i>row</i>
   * lives in, ignoring any value that might be in the cache.
   * @param tableName name of the table <i>row</i> is in
   * @param row row key you're trying to find the region of
   * @return HRegionLocation that describes where to find the reigon in
   * question
   * @throws IOException
   */
  public HRegionLocation relocateRegion(final byte [] tableName,
      final byte [] row)
  throws IOException;

  /**
   * Establishes a connection to the region server at the specified address.
   * @param regionServer - the server to connect to
   * @return proxy for HRegionServer
   * @throws IOException
   */
  public HRegionInterface getHRegionConnection(HServerAddress regionServer)
  throws IOException;

  /**
   * Establishes a connection to the region server at the specified address.
   * @param regionServer - the server to connect to
   * @param getMaster - do we check if master is alive
   * @return proxy for HRegionServer
   * @throws IOException
   */
  public HRegionInterface getHRegionConnection(
      HServerAddress regionServer, boolean getMaster)
  throws IOException;

  /**
   * Find region location hosting passed row
   * @param tableName
   * @param row Row to find.
   * @param reload If true do not use cache, otherwise bypass.
   * @return Location of row.
   * @throws IOException
   */
  HRegionLocation getRegionLocation(byte [] tableName, byte [] row,
      boolean reload)
  throws IOException;

  /**
   * Pass in a ServerCallable with your particular bit of logic defined and
   * this method will manage the process of doing retries with timed waits
   * and refinds of missing regions.
   *
   * @param <T> the type of the return value
   * @param callable
   * @return an object of type T
   * @throws IOException
   * @throws RuntimeException
   */
  public <T> T getRegionServerWithRetries(ServerCallable<T> callable)
  throws IOException, RuntimeException;

  /**
   * Pass in a ServerCallable with your particular bit of logic defined and
   * this method will pass it to the defined region server.
   * @param <T> the type of the return value
   * @param callable
   * @return an object of type T
   * @throws IOException
   * @throws RuntimeException
   */
  public <T> T getRegionServerForWithoutRetries(ServerCallable<T> callable)
  throws IOException, RuntimeException;

  /**
   * Process a batch of Puts. Does the retries.
   * @param list A batch of Puts to process.
   * @param tableName The name of the table
   * @return Count of committed Puts. On fault, < list.size().
   * @throws IOException
   */
  public int processBatchOfRows(ArrayList<Put> list, byte[] tableName)
  throws IOException;

  /**
   * Process a batch of Deletes. Does the retries.
   * @param list A batch of Deletes to process.
   * @return Count of committed Deletes. On fault, < list.size().
   * @param tableName The name of the table
   * @throws IOException
   */
  public int processBatchOfDeletes(ArrayList<Delete> list, byte[] tableName)
  throws IOException;
}
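The code above is the complete interface definition. We can now see that it encapsulates the client-side request handling: operations such as put and delete are executed through the method
public <T> T getRegionServerWithRetries(ServerCallable<T> callable), with the put, delete, and so on wrapped inside the callable. This is exactly what we just saw in HTable.get().
Next we need to look at TableServers.getRegionServerWithRetries(ServerCallable<T> callable). On to the code: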
public <T> T getRegionServerWithRetries(ServerCallable<T> callable)
throws IOException, RuntimeException {
  List<Throwable> exceptions = new ArrayList<Throwable>();
  for(int tries = 0; tries < numRetries; tries++) {
    try {
      callable.instantiateServer(tries!=0);
      return callable.call();
    } catch (Throwable t) {
      t = translateException(t);
      exceptions.add(t);
      if (tries == numRetries - 1) {
        throw new RetriesExhaustedException(callable.getServerName(),
            callable.getRegionName(), callable.getRow(), tries, exceptions);
      }
    }
    try {
      Thread.sleep(getPauseTime(tries));
    } catch (InterruptedException e) {
      // continue
    }
  }
  return null;
}
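The shape of this method (try the callable, collect failures, pause, try again, give up after numRetries) can be reproduced in a few lines. The following is a simplified sketch under my own assumptions rather than the HBase implementation: RetrySketch and callWithRetries are hypothetical names, a fixed pause replaces getPauseTime(tries), there is no region re-lookup, and the interrupt flag is restored instead of being swallowed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical, simplified version of the retry loop; not the HBase implementation.
class RetrySketch {
  /** Runs callable up to numRetries times, pausing pauseMillis between attempts. */
  static <T> T callWithRetries(Callable<T> callable, int numRetries, long pauseMillis)
      throws Exception {
    List<Throwable> failures = new ArrayList<>();
    for (int tries = 0; tries < numRetries; tries++) {
      try {
        return callable.call();
      } catch (Exception e) {
        failures.add(e);
        if (tries == numRetries - 1) {
          // Last attempt failed: report every failure seen so far.
          throw new Exception("Failed after " + numRetries + " tries: " + failures, e);
        }
      }
      try {
        Thread.sleep(pauseMillis);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
      }
    }
    throw new IllegalArgumentException("numRetries must be positive");
  }

  public static void main(String[] args) throws Exception {
    int[] attempts = {0};
    // Like HTable.get(), the caller supplies the actual work as a callable.
    String result = callWithRetries(() -> {
      if (++attempts[0] < 3) {
        throw new java.io.IOException("region not found yet");
      }
      return "ok";
    }, 5, 10L);
    System.out.println(result + " after " + attempts[0] + " attempts"); // ok after 3 attempts
  }
}
```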
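The core is just two statements: first, callable.instantiateServer() does the work of locating the RegionServer; then callable.call() issues the request. Note that this call method is the one overridden in the anonymous inner class back at the very beginning, in HTable.get. Here is part of the ServerCallable class: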
public abstract class ServerCallable<T> implements Callable<T> {
  protected final HConnection connection;
  protected final byte [] tableName;
  protected final byte [] row;
  protected HRegionLocation location;
  protected HRegionInterface server;

  /**
   * @param connection
   * @param tableName
   * @param row
   */
  public ServerCallable(HConnection connection, byte [] tableName, byte [] row) {
    this.connection = connection;
    this.tableName = tableName;
    this.row = row;
  }

  /**
   *
   * @param reload set this to true if connection should re-find the region
   * @throws IOException
   */
  public void instantiateServer(boolean reload) throws IOException {
    this.location = connection.getRegionLocation(tableName, row, reload);
    this.server = connection.getHRegionConnection(location.getServerAddress());
  }

  // (rest of the class omitted)
}
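So a ServerCallable object holds tableName, row, and so on, receives a connection reference through its constructor, and calls that connection's getHRegionConnection method to obtain a handle for talking to the RegionServer. (Honestly, I don't know what to call it. Calling it a connection as well would clash with the other one. The naming in the HBase code really frustrates me because it is so easy to misread.)
Let's see how this new thing is obtained:
HConnectionManager.java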
public HRegionInterface getHRegionConnection(
    HServerAddress regionServer, boolean getMaster)
throws IOException {
  if (getMaster) {
    getMaster();
  }
  HRegionInterface server;
  synchronized (this.servers) {
    // See if we already have a connection
    server = this.servers.get(regionServer.toString());
    if (server == null) { // Get a connection
      try {
        server = (HRegionInterface)HBaseRPC.waitForProxy(
            serverInterfaceClass, HBaseRPCProtocolVersion.versionID,
            regionServer.getInetSocketAddress(), this.conf,
            this.maxRPCAttempts, this.rpcTimeout);
      } catch (RemoteException e) {
        throw RemoteExceptionHandler.decodeRemoteException(e);
      }
      this.servers.put(regionServer.toString(), server);
    }
  }
  return server;
}
Digging further, here is how this server is created (in the HBaseRPC class, where waitForProxy ends up calling getProxy):
public static VersionedProtocol getProxy(Class<?> protocol,
    long clientVersion, InetSocketAddress addr, UserGroupInformation ticket,
    Configuration conf, SocketFactory factory)
throws IOException {
  VersionedProtocol proxy =
      (VersionedProtocol) Proxy.newProxyInstance(
          protocol.getClassLoader(), new Class[] { protocol },
          new Invoker(addr, ticket, conf, factory));
  long serverVersion = proxy.getProtocolVersion(protocol.getName(),
      clientVersion);
  if (serverVersion == clientVersion) {
    return proxy;
  }
  throw new VersionMismatch(protocol.getName(), clientVersion,
      serverVersion);
}
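These two pieces of code show that Java's dynamic proxy mechanism is in use: server is a dynamic proxy object that implements the interface named by the variable serverInterfaceClass. Here that is HRegionInterface, so server implements that interface. Which methods does the interface define?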
public interface HRegionInterface extends HBaseRPCProtocolVersion {
  /**
   * Get metainfo about an HRegion
   *
   * @param regionName name of the region
   * @return HRegionInfo object for region
   * @throws NotServingRegionException
   */
  public HRegionInfo getRegionInfo(final byte [] regionName)
  throws NotServingRegionException;

  /**
   * Return all the data for the row that matches <i>row</i> exactly,
   * or the one that immediately preceeds it.
   *
   * @param regionName region name
   * @param row row key
   * @param family Column family to look for row in.
   * @return map of values
   * @throws IOException
   */
  public Result getClosestRowBefore(final byte [] regionName,
      final byte [] row, final byte [] family)
  throws IOException;

  /**
   *
   * @return the regions served by this regionserver
   */
  public HRegion [] getOnlineRegionsAsArray();

  /**
   * Perform Get operation.
   * @param regionName name of region to get from
   * @param get Get operation
   * @return Result
   * @throws IOException
   */
  public Result get(byte [] regionName, Get get) throws IOException;

  /**
   * Perform exists operation.
   * @param regionName name of region to get from
   * @param get Get operation describing cell to test
   * @return true if exists
   * @throws IOException
   */
  public boolean exists(byte [] regionName, Get get) throws IOException;

  /**
   * Put data into the specified region
   * @param regionName
   * @param put the data to be put
   * @throws IOException
   */
  public void put(final byte [] regionName, final Put put)
  throws IOException;

  /**
   * Put an array of puts into the specified region
   *
   * @param regionName
   * @param puts
   * @return The number of processed put's. Returns -1 if all Puts
   * processed successfully.
   * @throws IOException
   */
  public int put(final byte[] regionName, final Put [] puts)
  throws IOException;

  /**
   * Deletes all the KeyValues that match those found in the Delete object,
   * if their ts <= to the Delete. In case of a delete with a specific ts it
   * only deletes that specific KeyValue.
   * @param regionName
   * @param delete
   * @throws IOException
   */
  public void delete(final byte[] regionName, final Delete delete)
  throws IOException;

  /**
   * Put an array of deletes into the specified region
   *
   * @param regionName
   * @param deletes
   * @return The number of processed deletes. Returns -1 if all Deletes
   * processed successfully.
   * @throws IOException
   */
  public int delete(final byte[] regionName, final Delete [] deletes)
  throws IOException;

  /**
   * Atomically checks if a row/family/qualifier value match the expectedValue.
   * If it does, it adds the put.
   *
   * @param regionName
   * @param row
   * @param family
   * @param qualifier
   * @param value the expected value
   * @param put
   * @throws IOException
   * @return true if the new put was execute, false otherwise
   */
  public boolean checkAndPut(final byte[] regionName, final byte [] row,
      final byte [] family, final byte [] qualifier, final byte [] value,
      final Put put)
  throws IOException;

  /**
   * Atomically increments a column value. If the column value isn't long-like,
   * this could throw an exception.
   *
   * @param regionName
   * @param row
   * @param family
   * @param qualifier
   * @param amount
   * @param writeToWAL whether to write the increment to the WAL
   * @return new incremented column value
   * @throws IOException
   */
  public long incrementColumnValue(byte [] regionName, byte [] row,
      byte [] family, byte [] qualifier, long amount, boolean writeToWAL)
  throws IOException;

  //
  // remote scanner interface
  //

  /**
   * Opens a remote scanner with a RowFilter.
   *
   * @param regionName name of region to scan
   * @param scan configured scan object
   * @return scannerId scanner identifier used in other calls
   * @throws IOException
   */
  public long openScanner(final byte [] regionName, final Scan scan)
  throws IOException;

  /**
   * Get the next set of values
   * @param scannerId clientId passed to openScanner
   * @return map of values; returns null if no results.
   * @throws IOException
   */
  public Result next(long scannerId) throws IOException;

  /**
   * Get the next set of values
   * @param scannerId clientId passed to openScanner
   * @param numberOfRows the number of rows to fetch
   * @return Array of Results (map of values); array is empty if done with this
   * region and null if we are NOT to go to the next region (happens when a
   * filter rules that the scan is done).
   * @throws IOException
   */
  public Result [] next(long scannerId, int numberOfRows) throws IOException;

  /**
   * Close a scanner
   *
   * @param scannerId the scanner id returned by openScanner
   * @throws IOException
   */
  public void close(long scannerId) throws IOException;

  /**
   * Opens a remote row lock.
   *
   * @param regionName name of region
   * @param row row to lock
   * @return lockId lock identifier
   * @throws IOException
   */
  public long lockRow(final byte [] regionName, final byte [] row)
  throws IOException;

  /**
   * Releases a remote row lock.
   *
   * @param regionName
   * @param lockId the lock id returned by lockRow
   * @throws IOException
   */
  public void unlockRow(final byte [] regionName, final long lockId)
  throws IOException;

  /**
   * Method used when a master is taking the place of another failed one.
   * @return All regions assigned on this region server
   * @throws IOException
   */
  public HRegionInfo[] getRegionsAssignment() throws IOException;

  /**
   * Method used when a master is taking the place of another failed one.
   * @return The HSI
   * @throws IOException
   */
  public HServerInfo getHServerInfo() throws IOException;
}
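As we can see, HRegionInterface defines the concrete methods for querying a RegionServer.
Now let's step back. Once the dynamic proxy object server has been instantiated, ServerCallable.call() eventually invokes server.get(). Under Java's proxy mechanism, that call is forwarded to the new Invoker(addr, ticket, conf, factory) object we passed in when building the proxy, and the Invoker carries out the actual method.
Put simply, the Invoker object uses HBase's RPC client to talk to the RegionServer, sending the request and receiving the result.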
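For readers who have not used java.lang.reflect.Proxy, here is a small standalone illustration of the forwarding described above. It is only a sketch: RegionOps and EchoInvoker are hypothetical stand-ins for HRegionInterface and HBase's Invoker, and the handler simply returns a string where the real Invoker would perform an RPC.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical illustration of a dynamic proxy; not the HBase classes.
class DynamicProxySketch {
  /** Stand-in for HRegionInterface: the interface the proxy implements. */
  interface RegionOps {
    String get(String regionName, String row);
  }

  /** Stand-in for Invoker: every call on the proxy lands in invoke(). */
  static final class EchoInvoker implements InvocationHandler {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
      // A real invoker would serialize the method name and arguments and send them over RPC.
      return method.getName() + Arrays.toString(args);
    }
  }

  /** Builds a proxy that implements RegionOps and forwards calls to EchoInvoker. */
  static RegionOps newProxy() {
    return (RegionOps) Proxy.newProxyInstance(
        RegionOps.class.getClassLoader(),
        new Class<?>[] { RegionOps.class },
        new EchoInvoker());
  }

  public static void main(String[] args) {
    RegionOps server = newProxy();
    // Looks like a local call, but is routed through EchoInvoker.invoke().
    System.out.println(server.get("region-1", "row-42")); // get[region-1, row-42]
  }
}
```

The caller only ever sees the interface, which is why HTable.get() can write server.get(...) without knowing anything about sockets or serialization.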
Let's see what this RPC client looks like:
public Invoker(InetSocketAddress address, UserGroupInformation ticket,
    Configuration conf, SocketFactory factory) {
  this.address = address;
  this.ticket = ticket;
  this.client = CLIENTS.getClient(conf, factory); // client is the RPC client
}
This client is an HBaseClient object; HBaseClient is the client class HBase uses for RPC. HBaseClient instances are pooled here as well, which I don't quite understand. The comment in the code reads:
// Construct & cache client. The configuration is only used for timeout,
// and Clients have connection pools. So we can either (a) lose some
// connection pooling and leak sockets, or (b) use the same timeout for all
// configurations. Since the IPC is usually intended globally, not
// per-job, we choose (a).
Moving on, here is how such a client completes the final request:
public Writable call(Writable param, InetSocketAddress addr,
    UserGroupInformation ticket)
throws IOException {
  Call call = new Call(param);
  Connection connection = getConnection(addr, ticket, call);
  connection.sendParam(call); // send the parameter
  boolean interrupted = false;
  synchronized (call) {
    while (!call.done) {
      try {
        call.wait(); // wait for the result
      } catch (InterruptedException ie) {
        // save the fact that we were interrupted
        interrupted = true;
      }
    }

    if (interrupted) {
      // set the interrupt flag now that we are done waiting
      Thread.currentThread().interrupt();
    }

    if (call.error != null) {
      if (call.error instanceof RemoteException) {
        call.error.fillInStackTrace();
        throw call.error;
      }
      // local exception
      throw wrapException(addr, call.error);
    }
    return call.value;
  }
}
Yet another connection. This time the Connection is a thread used to send and receive data. Judging from getConnection(addr, ticket, call), this is yet another pool, and sure enough:
/** Get a connection from the pool, or create a new one and add it to the
 * pool. Connections to a given host/port are reused. */
private Connection getConnection(InetSocketAddress addr,
    UserGroupInformation ticket, Call call)
throws IOException {
  if (!running.get()) {
    // the client is stopped
    throw new IOException("The client is stopped");
  }
  Connection connection;
  /* we could avoid this allocation for each RPC by having a
   * connectionsId object and with set() method. We need to manage the
   * refs for keys in HashMap properly. For now its ok.
   */
  ConnectionId remoteId = new ConnectionId(addr, ticket);
  do {
    synchronized (connections) {
      connection = connections.get(remoteId);
      if (connection == null) {
        connection = new Connection(remoteId);
        connections.put(remoteId, connection);
      }
    }
  } while (!connection.addCall(call));

  //we don't invoke the method below inside "synchronized (connections)"
  //block above. The reason for that is if the server happens to be slow,
  //it will take longer to establish a connection and that will slow the
  //entire system down.
  connection.setupIOstreams();
  return connection;
}
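In other words, requests whose target RegionServer addr and user/group information are the same share one connection. Once the connection is obtained, the current call is added to a structure inside the connection (a map from call id to call). When the call completes, its state is updated: mainly the completion flag Call.done is set and the result is stored in Call.value.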
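The two ideas in this step, a pool keyed by (address, user) and a per-connection table of pending calls that a waiting thread is notified about, can be sketched as follows. This is an illustrative model under my own assumptions, not the HBaseClient code: all class names are hypothetical, the ticket is reduced to a user string, and receive() stands in for the connection thread reading a response off the socket.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of the connection pool and pending-call table; not the HBaseClient code.
class ConnectionPoolSketch {
  /** Pool key: connections are shared when both address and user match. */
  static final class ConnectionId {
    final String address;
    final String user;
    ConnectionId(String address, String user) { this.address = address; this.user = user; }
    @Override public boolean equals(Object o) {
      if (!(o instanceof ConnectionId)) return false;
      ConnectionId other = (ConnectionId) o;
      return address.equals(other.address) && user.equals(other.user);
    }
    @Override public int hashCode() { return Objects.hash(address, user); }
  }

  /** A pending request: the caller blocks in await() until complete() is called. */
  static final class Call {
    final int id;
    private boolean done;
    private String value;
    Call(int id) { this.id = id; }
    synchronized void complete(String result) { value = result; done = true; notifyAll(); }
    synchronized String await() throws InterruptedException {
      while (!done) wait();
      return value;
    }
  }

  /** Holds the id -> call table for one remote server. */
  static final class Connection {
    private final Map<Integer, Call> calls = new HashMap<>();
    synchronized void addCall(Call call) { calls.put(call.id, call); }
    /** Simulates the connection thread receiving the response for call id. */
    synchronized void receive(int id, String result) {
      Call call = calls.remove(id);
      if (call != null) call.complete(result);
    }
  }

  private final Map<ConnectionId, Connection> connections = new HashMap<>();

  Connection getConnection(String address, String user) {
    synchronized (connections) {
      return connections.computeIfAbsent(new ConnectionId(address, user), k -> new Connection());
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ConnectionPoolSketch pool = new ConnectionPoolSketch();
    Connection c1 = pool.getConnection("rs1:60020", "alice");
    Connection c2 = pool.getConnection("rs1:60020", "alice");
    System.out.println(c1 == c2); // true: same address and user share a connection

    Call call = new Call(7);
    c1.addCall(call);
    new Thread(() -> c1.receive(7, "result")).start(); // response arrives on another thread
    System.out.println(call.await()); // result
  }
}
```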
With that in place, let's see how the connection sends the request data.
/** Initiates a call by sending the parameter to the remote server.
 * Note: this is not called from the Connection thread, but by other
 * threads.
 * @param call
 */
public void sendParam(Call call) {
  if (shouldCloseConnection.get()) {
    return;
  }

  DataOutputBuffer d=null;
  try {
    synchronized (this.out) {
      if (LOG.isDebugEnabled())
        LOG.debug(getName() + " sending #" + call.id);

      //for serializing the
      //data to be written
      d = new DataOutputBuffer();
      d.writeInt(call.id);
      call.param.write(d);
      byte[] data = d.getData();
      int dataLength = d.getLength();
      out.writeInt(dataLength);      //first put the data length
      out.write(data, 0, dataLength);//write the data
      out.flush();
    }
  } catch(IOException e) {
    markClosed(e);
  } finally {
    //the buffer is just an in-memory buffer, but it is still polite to
    // close early
    IOUtils.closeStream(d);
  }
}
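The code shows that sending a request is synchronized (on this.out), so requests sharing one connection are written out one at a time. That is the source of the problem mentioned in the previous post.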
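The wire format used by sendParam (a 4-byte length, then the call id, then the serialized parameter) can be tried out with plain java.io streams. The sketch below is an assumption-laden illustration rather than HBase code: FramingSketch, sendFrame and readFrame are hypothetical names, the payload is a UTF-8 string instead of a Writable, and an in-memory byte array replaces the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of length-prefixed framing; not the HBase implementation.
class FramingSketch {
  /** Writes one frame: [int length][int callId][payload bytes]. */
  static void sendFrame(DataOutputStream out, int callId, byte[] payload) throws IOException {
    // Only one thread at a time may write, so frames never interleave on the shared stream.
    synchronized (out) {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      DataOutputStream d = new DataOutputStream(buffer);
      d.writeInt(callId);
      d.write(payload);
      d.flush();
      byte[] data = buffer.toByteArray();
      out.writeInt(data.length); // first the data length
      out.write(data);           // then the data itself
      out.flush();
    }
  }

  /** Reads one frame and returns it as "callId:payload". */
  static String readFrame(DataInputStream in) throws IOException {
    int length = in.readInt();
    int callId = in.readInt();
    byte[] payload = new byte[length - 4]; // length covers the 4-byte call id plus the payload
    in.readFully(payload);
    return callId + ":" + new String(payload, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream wire = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(wire);
    sendFrame(out, 1, "get row-1".getBytes(StandardCharsets.UTF_8));
    sendFrame(out, 2, "get row-2".getBytes(StandardCharsets.UTF_8));

    DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
    System.out.println(readFrame(in)); // 1:get row-1
    System.out.println(readFrame(in)); // 2:get row-2
  }
}
```

Because the whole frame is written while holding the lock, a large request delays every other request queued on the same connection, which matches the serialization effect described above.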
That's it for the HBase client code for now.
The figure below helps make sense of the various pools described above.