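RPC stands for Remote Procedure Call. Its purpose is to let a program on one machine call certain procedures on another machine as if they were local. Note that RPC does not cover every procedure on the remote side: only procedures designated in advance can be called, and the program has to be prepared for it. Of the two parties in an RPC exchange, one actively initiates the communication and is the consumer of some service; the other responds passively and is the provider of that service. So at least one party plays the role of the Server. If the communication is peer-to-peer, each party has a Server side. In Hadoop's architecture, nodes are divided into master and slave. The master node usually plays the Server role: communication between master and slave is always initiated by the slave, and the master acts like a "public servant". Communication among the other nodes is peer-to-peer and either side may initiate it, so every slave node also has a Server side.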
Server is an abstract class; no object can be created from it until a subclass supplies the missing implementation.
public abstract class Server {
//Map from RPC kind to its RpcInvoker
static Map<RPC.RpcKind, RpcKindMapValue> rpcKindMap = new
HashMap<RPC.RpcKind, RpcKindMapValue>(4);
//Cache of RPC protocol classes
private static final Map<String, Class<?>> PROTOCOL_CACHE =
new ConcurrentHashMap<String, Class<?>>();
//RPC calls: the call being handled by the current thread, and the queue of pending calls
private static final ThreadLocal<Call> CurCall = new ThreadLocal<Call>();
private CallQueueManager<Call> callQueue;
//Manages remote connections
private ConnectionManager connectionManager;
//Listens for remote connections
private Listener listener = null;
//Sends responses back over remote connections
private Responder responder = null;
//Group of threads that handle remote calls
private Handler[] handlers = null;
//Register a protocol engine
public static void registerProtocolEngine(RPC.RpcKind rpcKind,
Class<? extends Writable> rpcRequestWrapperClass,
RpcInvoker rpcInvoker) {
RpcKindMapValue old =
rpcKindMap.put(rpcKind, new RpcKindMapValue(rpcRequestWrapperClass, rpcInvoker));
if (old != null) {
rpcKindMap.put(rpcKind, old);
throw new IllegalArgumentException("ReRegistration of rpcKind: " +
rpcKind);
}
}
//Look up the RpcInvoker for an RPC kind
public static RpcInvoker getRpcInvoker(RPC.RpcKind rpcKind) {
RpcKindMapValue val = rpcKindMap.get(rpcKind);
return (val == null) ? null : val.rpcInvoker;
}
//Base class of a remote RPC call
public static class Call implements Schedulable,
PrivilegedExceptionAction<Void> {}
//A remote RPC call
private class RpcCall extends Call {
}
@Deprecated
public Writable call(Writable param, long receiveTime) throws Exception {
return call(RPC.RpcKind.RPC_BUILTIN, null, param, receiveTime);
}
//The base class cannot know which Invoker to call,
//so it leaves an abstract method for subclasses to implement
/** Called for each call. */
public abstract Writable call(RPC.RpcKind rpcKind, String protocol,
Writable param, long receiveTime) throws Exception;
//Listens for socket connections
private class Listener extends Thread {}
//Returns the result to the caller once a request has completed
private class Responder extends Thread {}
//Handles a connection
public class Connection {}
//Processes remote calls
private class Handler extends Thread {
......
call.run()
......
}
//Reads data from the socket
private class Reader extends Thread {}
}
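The key to RPC is calling a function on a remote machine, but exactly how the call is made depends on the protocol in use. Different protocols need different underlying code, so the call() method cannot be implemented independently of a concrete protocol. Clearly, any Server that can actually be created must at least supply a concrete implementation of call(). Java offers two ways to do this. One is static: define a concrete class that extends the abstract class and provides the code for call(), then instantiate it. The other is dynamic: supply call() on the spot when the Server object is created with new. Here, another class, RPC, defines a nested abstract class with the same name, Server, that is, RPC.Server, which extends the abstract class Server.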
public abstract static class Server extends org.apache.hadoop.ipc.Server {
@Override
public Writable call(RPC.RpcKind rpcKind, String protocol,
Writable rpcRequest, long receiveTime) throws Exception {
return getRpcInvoker(rpcKind).call(this, protocol, rpcRequest,
receiveTime);
}
}
}
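The call() method of RPC.Server overrides the abstract call() of Server. It obtains the concrete RpcInvoker for the given RpcKind and performs the actual remote call inside that RpcInvoker's call(). getRpcInvoker is implemented in Server, and the concrete RpcInvoker is also registered in Server, through registerProtocolEngine.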
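The register-then-dispatch pattern described above can be sketched in miniature. This is only an illustration under simplifying assumptions: MiniServer, DispatchingServer, Kind and Invoker are hypothetical names, requests are plain strings instead of Writable objects, and none of this is Hadoop's actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical miniature of the pattern; these are not Hadoop's classes.
abstract class MiniServer {
    enum Kind { WRITABLE, PROTOBUF }

    // Stand-in for RpcInvoker: performs the actual call for one RPC kind.
    interface Invoker {
        String call(MiniServer server, String protocol, String request);
    }

    // Stand-in for rpcKindMap: one invoker per RPC kind.
    private static final Map<Kind, Invoker> INVOKERS = new ConcurrentHashMap<>();

    // Registers an invoker and rejects a second registration for the same kind.
    static void register(Kind kind, Invoker invoker) {
        if (INVOKERS.putIfAbsent(kind, invoker) != null) {
            throw new IllegalArgumentException("ReRegistration of kind: " + kind);
        }
    }

    static Invoker getInvoker(Kind kind) {
        return INVOKERS.get(kind);
    }

    // The base class leaves the call itself to subclasses.
    abstract String call(Kind kind, String protocol, String request);
}

// Plays the part of RPC.Server: delegates to the registered invoker.
class DispatchingServer extends MiniServer {
    @Override
    String call(Kind kind, String protocol, String request) {
        Invoker invoker = getInvoker(kind);
        if (invoker == null) {
            throw new IllegalStateException("No invoker registered for " + kind);
        }
        return invoker.call(this, protocol, request);
    }
}

class MiniServerDemo {
    public static void main(String[] args) {
        MiniServer.register(MiniServer.Kind.PROTOBUF,
            (server, protocol, request) -> protocol + ":" + request.toUpperCase());
        MiniServer server = new DispatchingServer();
        System.out.println(server.call(MiniServer.Kind.PROTOBUF, "echo", "hello"));
    }
}
```

The sketch uses putIfAbsent so that a duplicate registration never overwrites the first one; the listing above reaches the same result by putting the new value and then restoring the old one.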
public class RPC {
final static int RPC_SERVICE_CLASS_DEFAULT = 0;
//There are three RPC kinds
public enum RpcKind {
RPC_BUILTIN ((short) 1), // Used for built in calls by tests
RPC_WRITABLE ((short) 2), // Use WritableRpcEngine
RPC_PROTOCOL_BUFFER ((short) 3); // Use ProtobufRpcEngine
}
//RpcInvoker
interface RpcInvoker {
//The actual call, to be implemented by each engine
public Writable call(Server server, String protocol,
Writable rpcRequest, long receiveTime) throws Exception ;
}
//Map from protocol to RpcEngine
private static final Map<Class<?>,RpcEngine> PROTOCOL_ENGINES
= new HashMap<Class<?>,RpcEngine>();
//Associate a protocol with an RpcEngine
public static void setProtocolEngine(Configuration conf,
Class<?> protocol, Class<?> engine) {}
//The client side works through a proxy; T is the protocol, e.g. LocalizationProtocolPB
public static <T> T getProxy(Class<T> protocol,
long clientVersion,
InetSocketAddress addr,
UserGroupInformation ticket,
Configuration conf,
SocketFactory factory,
int rpcTimeout) throws IOException {
return getProtocolProxy(protocol, clientVersion, addr, ticket,
conf, factory, rpcTimeout, null).getProxy();
}
//The server side is created through a Builder
public static class Builder {}
//Extension of Server, still abstract
public abstract static class Server extends org.apache.hadoop.ipc.Server {}
}
RPC provides the calling interfaces for both Server and Client, and also manages the mapping between protocols and concrete RpcEngines.
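The party that requests a service is called the client (or user) side. This side defines a Client class corresponding to Server; every outgoing service request passes through a Client object.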
public class Client implements AutoCloseable {
//Connection management
private ConcurrentMap<ConnectionId, Connection> connections = new ConcurrentHashMap<>();
//A remote call
static class Call {}
//Handles one socket connection
private class Connection extends Thread {
@Override
public void run() {
try {
while (waitForWork()) {//wait here for work - read or close connection
//Receive the response
receiveRpcResponse();
}
} catch (Throwable t) { ...... }
}
//Send the RPC request
public void sendRpcRequest(final Call call){}
}
//Called by the RpcEngine
public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
ConnectionId remoteId, AtomicBoolean fallbackToSimpleAuth)
throws IOException {
return call(rpcKind, rpcRequest, remoteId, RPC.RPC_SERVICE_CLASS_DEFAULT,fallbackToSimpleAuth);
}
Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
ConnectionId remoteId, int serviceClass,
AtomicBoolean fallbackToSimpleAuth) throws IOException {
//Create the Call
final Call call = createCall(rpcKind, rpcRequest);
//Obtain a connection to the remote side
final Connection connection = getConnection(remoteId, call, serviceClass,fallbackToSimpleAuth);
try {
checkAsyncCall();
try {
//Send the request
connection.sendRpcRequest(call); // send the rpc request
} catch (RejectedExecutionException e) {
throw new IOException("connection has been closed", e);
}
} catch (Exception e) { ...... }
if (isAsynchronousMode()) {
final AsyncGet<Writable, IOException> asyncGet
= new AsyncGet<Writable, IOException>() {
@Override
public Writable get(long timeout, TimeUnit unit)
throws IOException, TimeoutException{
boolean done = true;
try {
//Get the response
final Writable w = getRpcResponse(call, connection, timeout, unit);
if (w == null) {
done = false;
throw new TimeoutException(call + " timed out "
+ timeout + " " + unit);
}
return w;
} finally { ...... }
}
};
ASYNC_RPC_RESPONSE.set(asyncGet);
return null;
} else {
return getRpcResponse(call, connection, -1, null);
}
}
//Get a cached connection or create a new one
private Connection getConnection(ConnectionId remoteId,
Call call, int serviceClass, AtomicBoolean fallbackToSimpleAuth)
throws IOException {
Connection connection;
while (true) {
// These lines below can be shorten with computeIfAbsent in Java8
connection = connections.get(remoteId);
if (connection == null) {
connection = new Connection(remoteId, serviceClass);
Connection existing = connections.putIfAbsent(remoteId, connection);
if (existing != null) {
connection = existing;
}
}
if (connection.addCall(call)) {
break;
} else {
connections.remove(remoteId, connection);
}
}
connection.setupIOstreams(fallbackToSimpleAuth);
return connection;
}
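//Sets up the information needed for communication
public Connection(ConnectionId remoteId, int serviceClass) throws IOException {}
//Establishes the connection and its I/O streams
private synchronized void setupIOstreams(AtomicBoolean fallbackToSimpleAuth){}
//Sends the request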
public void sendRpcRequest(final Call call){}
}
The Client issues a request through call(), which obtains a Connection (creating one if necessary); the response is finally received by receiveRpcResponse().
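The get-or-create loop in getConnection() can be illustrated with a small sketch. It is a simplification under stated assumptions: ConnectionCache and Conn are hypothetical names, the remote id is a plain string, and the sketch uses computeIfAbsent, the shortening hinted at by the comment in the listing, rather than the get/putIfAbsent sequence Hadoop actually uses.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical miniature of a client-side connection cache; not Hadoop's Client.
final class ConnectionCache {
    static final class Conn {
        final String remoteId;
        private boolean closed;
        private int pendingCalls;

        Conn(String remoteId) {
            this.remoteId = remoteId;
        }

        // Refuses new calls once the connection has been closed.
        synchronized boolean addCall() {
            if (closed) {
                return false;
            }
            pendingCalls++;
            return true;
        }

        synchronized void close() {
            closed = true;
        }

        synchronized int pendingCalls() {
            return pendingCalls;
        }
    }

    private final ConcurrentMap<String, Conn> connections = new ConcurrentHashMap<>();
    final AtomicInteger created = new AtomicInteger(); // counts connections created

    Conn getConnection(String remoteId) {
        while (true) {
            // Reuse the cached connection for this remote id, or create one atomically.
            Conn conn = connections.computeIfAbsent(remoteId, id -> {
                created.incrementAndGet();
                return new Conn(id);
            });
            if (conn.addCall()) {
                return conn;
            }
            // The cached connection is closed: evict exactly that entry and retry.
            connections.remove(remoteId, conn);
        }
    }

    public static void main(String[] args) {
        ConnectionCache cache = new ConnectionCache();
        Conn first = cache.getConnection("host-a:8020");
        Conn second = cache.getConnection("host-a:8020");
        System.out.println(first == second); // the same connection is reused
    }
}
```

The two-argument remove(key, value) matters here: it evicts the stale entry only if it is still the one mapped, so a fresh connection installed by another thread is not thrown away.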
WritableRpcEngine and ProtobufRpcEngine are implementations of RpcEngine.
public class ProtobufRpcEngine implements RpcEngine {
//Registration in a static initializer
static { // Register the rpcRequest deserializer for WritableRpcEngine
org.apache.hadoop.ipc.Server.registerProtocolEngine(
RPC.RpcKind.RPC_PROTOCOL_BUFFER, RpcProtobufRequest.class,
new Server.ProtoBufRpcInvoker());
}
//Called on the client side: uses a Java dynamic proxy to invoke the concrete method
@Override
@SuppressWarnings("unchecked")
public <T> ProtocolProxy<T> getProxy(Class<T> protocol, long clientVersion,
InetSocketAddress addr, UserGroupInformation ticket, Configuration conf,
SocketFactory factory, int rpcTimeout, RetryPolicy connectionRetryPolicy,
AtomicBoolean fallbackToSimpleAuth) throws IOException {
//Invoker implements RpcInvocationHandler
final Invoker invoker = new Invoker(protocol, addr, ticket, conf, factory,
rpcTimeout, connectionRetryPolicy, fallbackToSimpleAuth);
return new ProtocolProxy<T>(protocol, (T) Proxy.newProxyInstance(
protocol.getClassLoader(), new Class[]{protocol}, invoker), false);
}
private static class Invoker implements RpcInvocationHandler {
@Override
public Message invoke(Object proxy, final Method method, Object[] args)
throws ServiceException {
try {
//Call the Client's call() to execute the request
val = (RpcWritable.Buffer) client.call(RPC.RpcKind.RPC_PROTOCOL_BUFFER,
new RpcProtobufRequest(rpcRequestHeader, theRequest), remoteId,
fallbackToSimpleAuth);
} catch (Throwable e) { ...... }
......
}
}
static class RpcProtobufRequest extends RpcWritable.Buffer {}
//Called on the server side
@Override
public RPC.Server getServer(Class<?> protocol, Object protocolImpl,
String bindAddress, int port, int numHandlers, int numReaders,
int queueSizePerHandler, boolean verbose, Configuration conf,
SecretManager<? extends TokenIdentifier> secretManager,
String portRangeConfig)
throws IOException {
return new Server(protocol, protocolImpl, conf, bindAddress, port,
numHandlers, numReaders, queueSizePerHandler, verbose, secretManager,
portRangeConfig);
}
//A further extension of RPC.Server
public static class Server extends RPC.Server {
static class ProtoBufRpcInvoker implements RpcInvoker {
//Returns the concrete protocol implementation
private static ProtoClassProtoImpl getProtocolImpl(RPC.Server server,
String protoName, long clientVersion) throws RpcServerException {
ProtoNameVer pv = new ProtoNameVer(protoName, clientVersion);
ProtoClassProtoImpl impl =
server.getProtocolImplMap(RPC.RpcKind.RPC_PROTOCOL_BUFFER).get(pv);
if (impl == null) { // no match for Protocol AND Version
VerProtocolImpl highest =
server.getHighestSupportedProtocol(RPC.RpcKind.RPC_PROTOCOL_BUFFER,
protoName);
}
return impl;
}
//The final implementation of call()
public Writable call(RPC.Server server, String protocol, Writable writableRequest, long receiveTime) throws Exception {
RpcProtobufRequest request = (RpcProtobufRequest) writableRequest;
RequestHeaderProto rpcRequest = request.getRequestHeader();
String methodName = rpcRequest.getMethodName();
String protoName = rpcRequest.getDeclaringClassProtocolName();
long clientVersion = rpcRequest.getClientProtocolVersion();
ProtoClassProtoImpl protocolImpl = getProtocolImpl(server, protoName,clientVersion);
//Service interface provided by Protocol Buffers
BlockingService service = (BlockingService) protocolImpl.protocolImpl;
//The method to call, looked up by name
MethodDescriptor methodDescriptor = service.getDescriptorForType().findMethodByName(methodName);
//Prototype of the request message
Message prototype = service.getRequestPrototype(methodDescriptor);
//The request parameter, parsed against the prototype
Message param = request.getValue(prototype);
try {
server.rpcDetailedMetrics.init(protocolImpl.protocolClass);
//Perform the call
result = service.callBlockingMethod(methodDescriptor, null, param);
} catch (ServiceException e) {
exception = (Exception) e.getCause();
throw (Exception) e.getCause();
} catch (Exception e) {
exception = e;
throw e;
}
return RpcWritable.wrap(result);
}
}
}
}
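The server-side steps in call() above (find the protocol implementation by name and version, look up the method by name, invoke it with the request parameter) can be sketched as follows. This is a hedged illustration only: MiniInvoker and its methods are hypothetical, messages are plain strings, and a map of functions stands in for protobuf's BlockingService and MethodDescriptor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical miniature of a server-side invoker; not Hadoop's ProtoBufRpcInvoker.
final class MiniInvoker {
    // Registered services, keyed by "protocolName#version"; each maps method name to handler.
    private final Map<String, Map<String, Function<String, String>>> services = new HashMap<>();

    private static String key(String protocol, long version) {
        return protocol + "#" + version;
    }

    void addProtocol(String protocol, long version,
                     Map<String, Function<String, String>> methods) {
        services.put(key(protocol, version), methods);
    }

    String call(String protocol, long version, String methodName, String param) {
        // Step 1: find the implementation registered for this protocol and version.
        Map<String, Function<String, String>> service = services.get(key(protocol, version));
        if (service == null) {
            throw new IllegalArgumentException(
                "Unknown protocol: " + protocol + " version " + version);
        }
        // Step 2: look up the method by the name carried in the request header.
        Function<String, String> method = service.get(methodName);
        if (method == null) {
            throw new IllegalArgumentException(
                "Unknown method " + methodName + " on protocol " + protocol);
        }
        // Step 3: invoke it with the request parameter and return the result.
        return method.apply(param);
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> methods = new HashMap<>();
        methods.put("submitApplication", request -> "accepted:" + request);
        MiniInvoker invoker = new MiniInvoker();
        invoker.addProtocol("ApplicationClientProtocol", 1L, methods);
        // prints "accepted:app-1"
        System.out.println(
            invoker.call("ApplicationClientProtocol", 1L, "submitApplication", "app-1"));
    }
}
```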
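For each concrete protocol, given the RPC class described above and a protocol engine that can produce both a Server and a proxy, the programming effort on the client side is greatly reduced. The effort on the server side is reduced as well, but the server still needs a layer of entry functions for remote calls. The role of this middle layer is to establish, between machine nodes, an RPC mechanism that supports the specific protocol: it offers a local API identical to the remote one and provides a server and a proxy that implement this API. A call by the client application to a function (method) defined in the API is converted into a request message and handed to the IPC layer below, which sends it to the remote server. There it is converted back into a call to the corresponding function (method) of the same API, and the result is returned to the client as the value the application layer receives. Take ApplicationClientProtocol as an example.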
hadoop-yarn-project\hadoop-yarn\hadoop-yarn-api\src\main\java\org\apache\hadoop\yarn\api\ApplicationClientProtocol.java
public interface ApplicationClientProtocol extends ApplicationBaseProtocol {
......
public SubmitApplicationResponse submitApplication( SubmitApplicationRequest request)
throws YarnException, IOException;
......
}
hadoop-yarn-project\hadoop-yarn\hadoop-yarn-api\src\main\proto\applicationclient_protocol.proto
The corresponding protobuf definition: applicationclient_protocol.proto
service ApplicationClientProtocolService {
rpc getNewApplication (GetNewApplicationRequestProto) returns (GetNewApplicationResponseProto);
rpc getApplicationReport (GetApplicationReportRequestProto) returns (GetApplicationReportResponseProto);
rpc submitApplication (SubmitApplicationRequestProto) returns (SubmitApplicationResponseProto);
......
}
hadoop-yarn-api\target\generated-sources\java\org\apache\hadoop\yarn\proto\ApplicationClientProtocol.java
The generated protobuf file: ApplicationClientProtocol.java
public final class ApplicationClientProtocol {
// Definitions of the two interfaces
public interface Interface { //asynchronous operations
public abstract void submitApplication();
}
public interface BlockingInterface { //synchronous (blocking) operations
public SubmitApplicationResponseProto submitApplication();
}
//Implementation of three methods of the com.google.protobuf.Service interface
public final void callMethod(
com.google.protobuf.Descriptors.MethodDescriptor method,
com.google.protobuf.RpcController controller,
com.google.protobuf.Message request,
com.google.protobuf.RpcCallback<
com.google.protobuf.Message> done) {
switch(method.getIndex()) {
case 0:
submitApplication();
break;
}
//Server-side implementation
public static com.google.protobuf.Service newReflectiveService(final Interface impl){
public void submitApplication(){
impl.submitApplication(controller, request, done);
}
}
public static com.google.protobuf.Service newReflectiveBlockingService(BlockingInterface impl){
public void submitApplication(){
impl.submitApplication(controller, request, done);
}
}
//Creation of the client-side Stub, i.e. the proxy
public static Stub newStub(
com.google.protobuf.RpcChannel channel) {
return new Stub(channel);
}
class Stub extends ApplicationClientProtocolService implements Interface {}
submitApplication (){
channel.callMethod(
}
}
}
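Hadoop uses the message generation/parsing and serialization/deserialization facilities provided by protobuf. The server side largely adopts the BlockingService interface provided by protobuf, whereas the client side uses the dynamic proxy technique provided by Java reflection.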
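The client-side dynamic proxy can be illustrated with java.lang.reflect.Proxy, the same JDK facility the getProxy() listing uses. The sketch below is only an assumption-laden miniature: ProxySketch, GreetingProtocol and the transport function are hypothetical, and the "network" is a local function that receives a textual request instead of a serialized message.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical miniature of a proxy-based RPC client; not Hadoop's ProtobufRpcEngine.
final class ProxySketch {
    // Stand-in for a protocol interface such as ApplicationClientProtocol.
    interface GreetingProtocol {
        String greet(String name);
    }

    // Builds a proxy whose every method call becomes a request sent via `transport`.
    @SuppressWarnings("unchecked")
    static <T> T getProxy(Class<T> protocol, Function<String, Object> transport) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Encode the method name and arguments as the "request message".
            String request = method.getName() + Arrays.toString(args);
            // Hand the request to the transport and return its reply to the caller.
            return transport.apply(request);
        };
        return (T) Proxy.newProxyInstance(
            protocol.getClassLoader(), new Class<?>[] {protocol}, handler);
    }

    public static void main(String[] args) {
        // The transport here just echoes; a real one would go over the network.
        GreetingProtocol client =
            getProxy(GreetingProtocol.class, request -> "reply to " + request);
        System.out.println(client.greet("world")); // prints "reply to greet[world]"
    }
}
```

The caller only sees the protocol interface; no hand-written stub class exists for it, which is why the client-side effort stays small for each new protocol.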