Protobuf Java RPC in Practice

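Contents

  • Preface
    • Hadoop client RPC communication
    • Hadoop server RPC communication
  • I. What is Protobuf?
  • II. Using Protobuf (Local)
    • 1. Define the .proto file
    • 2. Generate the code
    • 3. Write the file
  • III. Basic Protobuf Message operations (Local)
  • IV. Using Protobuf (Stub + Server)
    • 1. Flow chart
    • 2. Generating interface classes from a .proto file that defines a service
    • 3. Implementation and source walkthrough
  • V. Protobuf Service operations (Server)
  • VI. Complete code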


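Preface

I have recently been studying the Hadoop RPC framework. Hadoop implements its internal communication on top of Protobuf, so let's first look at what its RPC communication looks like.

Hadoop client RPC communication

The client uses the adapter pattern: a Proxy supplies the MethodDescriptor that Protobuf communication requires.

Proxy details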

    ClientNamenodeProtocolPB proxy = RPC.getProtocolProxy(
        ClientNamenodeProtocolPB.class, version, address, ugi, conf,
        NetUtils.getDefaultSocketFactory(conf),
        org.apache.hadoop.ipc.Client.getTimeout(conf), defaultPolicy,
        fallbackToSimpleAuth, alignmentContext).getProxy();

How the client sends a request

    public Message invoke(Object proxy, final Method method, Object[] args)
            throws ServiceException {
            ... ...
      val = (RpcWritable.Buffer) client.call(RPC.RpcKind.RPC_PROTOCOL_BUFFER,
              constructRpcRequest(method, theRequest), remoteId,
              fallbackToSimpleAuth, alignmentContext);
            ... ...
    }
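To make the proxy idea concrete, here is a minimal sketch using the JDK's java.lang.reflect.Proxy. It is not Hadoop code: GreeterProtocol, ProxySketch, and LoggingInvoker are hypothetical names, and the "remote call" is replaced by an echo. It only shows how a call on an interface is intercepted and turned into a (method name, arguments) request, which is the role invoke plays above.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical protocol interface; it stands in for ClientNamenodeProtocolPB.
interface GreeterProtocol {
    String greet(String name);
}

final class ProxySketch {
    // Turns every interface call into a (method name, arguments) "request".
    // A real invoker would serialize the request and send it over the network.
    static final class LoggingInvoker implements InvocationHandler {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            String request = method.getName() + Arrays.toString(args);
            // Stand-in for the remote call: echo the request back as the reply.
            return "reply to " + request;
        }
    }

    // Creates a proxy whose method calls are all routed to LoggingInvoker.invoke.
    static GreeterProtocol newProxy() {
        return (GreeterProtocol) Proxy.newProxyInstance(
                GreeterProtocol.class.getClassLoader(),
                new Class<?>[] {GreeterProtocol.class},
                new LoggingInvoker());
    }

    public static void main(String[] args) {
        // prints "reply to greet[hadoop]"
        System.out.println(newProxy().greet("hadoop"));
    }
}
```

The caller only sees the interface; everything about transport lives in the invocation handler, which is why the handler is the natural place to build the RPC request.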

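Hadoop server RPC communication

The server follows the classic reactor pattern and multiplexes its handling of requests. I won't go into that here; the focus is on how the server registers its services and on the part of the Handler's RPC code that processes a request.

Registering a Protocol (that is, a service)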

      this.serviceRpcServer = new RPC.Builder(conf)
          .setProtocol(
              org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB.class)
          .setInstance(clientNNPbService)
          .setBindAddress(bindHost)
          .setPort(serviceRpcAddr.getPort()).setNumHandlers(serviceHandlerCount)
          .setVerbose(false)
          .setSecretManager(namesystem.getDelegationTokenSecretManager())
          .build();
          ... ...
      DFSUtil.addPBProtocol(conf, TraceAdminProtocolPB.class,
          traceAdminService, serviceRpcServer);

How a Handler in the reactor model receives a request: call.run()

        ProtoClassProtoImpl protocolImpl = getProtocolImpl(server, protoName,
            clientVersion);
        BlockingService service = (BlockingService) protocolImpl.protocolImpl;
        MethodDescriptor methodDescriptor = service.getDescriptorForType()
            .findMethodByName(methodName);
        if (methodDescriptor == null) {
          String msg = "Unknown method " + methodName + " called on " + protocol
              + " protocol.";
          LOG.warn(msg);
          throw new RpcNoSuchMethodException(msg);
        }
        Message prototype = service.getRequestPrototype(methodDescriptor);
        Message param = request.getValue(prototype);

        Message result;
        Call currentCall = Server.getCurCall().get();
        try {
          ... ...
          result = service.callBlockingMethod(methodDescriptor, null, param);

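As this shows, Hadoop wraps Protobuf with a good deal of its own code. Besides compact message serialization, Protobuf also provides client and server interfaces. The overall flow can be roughly divided into three steps: the client sends a message, the server registers its services, and the server receives the client's message and forwards it to the matching service.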



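I. What is Protobuf?

Protobuf is Google's framework for serializing data. Compared with XML or JSON, its serialized form takes fewer bytes, is faster to process, and is simpler, which makes it convenient for transmission. Protobuf supports C++, Java, and many other languages, so it works across language environments. See the official site for details.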
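To illustrate why the encoding is compact, the sketch below hand-encodes a single integer field the way the Protobuf wire format does: a one-byte tag followed by a base-128 varint. It is an illustration only, not the library's implementation; VarintSketch and its methods are hypothetical names, and only non-negative values are exercised.

```java
import java.io.ByteArrayOutputStream;

final class VarintSketch {
    // Writes a value as a base-128 varint: 7 payload bits per byte,
    // with the high bit set on every byte except the last.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one varint field: tag = (fieldNumber << 3) | wire type 0.
    static byte[] encodeInt32Field(int fieldNumber, int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (long) fieldNumber << 3);
        writeVarint(out, value);
        return out.toByteArray();
    }

    // Formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Field number 2 with value 150 prints "10 96 01": three bytes in total.
        System.out.println(hex(encodeInt32Field(2, 150)));
    }
}
```

For comparison, the JSON text {"id":150} takes 10 bytes, because it spells out the field name and the digits as characters.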


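II. Using Protobuf (Local)

Note: "Local" means everything runs locally.
Starting from the tutorial sample on the official site, this section shows how to define a custom message structure, write it to a file as a byte stream, and read that byte stream back into the structure, so that a message is passed and retrieved.

1. Define the .proto file

A .proto file defines the data structures you want to transfer and determines which classes are generated later. The official documentation describes how fields are defined and which field types .proto files support.

Sample file: addressbook.proto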

// See README.txt for information and build instructions.

package tutorial;

option java_package = "com.laozhaer.tutorial.protogen";
option java_outer_classname = "AddressBookProtos";
option java_multiple_files = true;

message Person {
  required string name = 1;
  required int32 id = 2;        // Unique ID number for this person.
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

// Our address book file is just one of these.
message AddressBook {
  repeated Person person = 1;
}

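option: Java-specific options that protoc needs when it generates code
java_outer_classname: the name of the main generated class
java_multiple_files: whether to generate multiple class files; the file above, for example, generates separate Person and AddressBook classes
The Person message has a name, an id, an email, and phone numbers (themselves a message type)
The AddressBook message holds the Person entries

[Figure 1: the generated classes]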

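2. Generate the code

The .proto file only defines the data structures; we need generated classes to build and manipulate the data.
The official site provides a code generator for .proto files (protoc) that quickly produces these classes. The Hadoop source I was reading uses protobuf 2.5.0, so the version I chose is fairly old; you can download whichever version you need from the official site.

Run in a PowerShell command line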

./protoc.exe --java_out=${output_directory} addressbook.proto

Set output_directory to the path where the code should go, and protoc writes the generated classes there; the class names are shown in the figure above.

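3. Write the file

Each message can be seen as a prototype, and its builder is the object that constructs it.
Because Person embeds PhoneNumber, building a Person message also means building PhoneNumber messages.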
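The message/builder relationship can be sketched with a hand-written class. This is not protoc output: PersonSketch is a hypothetical stand-in that only mimics the shape of the generated Person and Person.Builder pair (a mutable builder, an immutable message, and a check for required fields in build()). Phone numbers are simplified to plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hand-written stand-in, NOT generated code.
final class PersonSketch {
    private final String name;
    private final int id;
    private final List<String> phones;

    private PersonSketch(Builder b) {
        this.name = b.name;
        this.id = b.id;
        // Copy so later changes to the builder cannot affect the built message.
        this.phones = Collections.unmodifiableList(new ArrayList<>(b.phones));
    }

    static Builder newBuilder() {
        return new Builder();
    }

    String getName() { return name; }
    int getId() { return id; }
    List<String> getPhoneList() { return phones; }

    // Mutable builder: setters return this so calls can be chained.
    static final class Builder {
        private String name;
        private int id;
        private final List<String> phones = new ArrayList<>();

        Builder setName(String name) { this.name = name; return this; }
        Builder setId(int id) { this.id = id; return this; }
        Builder addPhone(String number) { phones.add(number); return this; }

        // Rejects a message whose required field was never set.
        PersonSketch build() {
            if (name == null) {
                throw new IllegalStateException("missing required field: name");
            }
            return new PersonSketch(this);
        }
    }

    public static void main(String[] args) {
        PersonSketch p = PersonSketch.newBuilder()
                .setId(1).setName("Ann").addPhone("555-0100").build();
        System.out.println(p.getName() + " " + p.getId() + " " + p.getPhoneList());
    }
}
```

Keeping the built message immutable is what lets it be shared or serialized safely while the builder stays the only place where state changes.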

  1. Create a Builder for Person and fill in its data
    Person.Builder person = Person.newBuilder();
    stdout.print("Enter person ID: ");
    person.setId(Integer.valueOf(stdin.readLine()));
    stdout.print("Enter name: ");
    person.setName(stdin.readLine());
    
  2. Build the PhoneNumber message and attach it to Person
      Person.PhoneNumber.Builder phoneNumber =
        Person.PhoneNumber.newBuilder().setNumber(number);
    
      stdout.print("Is this a mobile, home, or work phone? ");
      String type = stdin.readLine();
      if (type.equals("mobile")) {
        phoneNumber.setType(Person.PhoneType.MOBILE);
      } else if (type.equals("home")) {
        phoneNumber.setType(Person.PhoneType.HOME);
      } else if (type.equals("work")) {
        phoneNumber.setType(Person.PhoneType.WORK);
      } else {
        stdout.println("Unknown phone type.  Using default.");
      }
    
      person.addPhone(phoneNumber);
    
    Finally, builder.build() returns the Person message
  3. Add the Person message to AddressBook and write it to an output stream (here a file) with writeTo
    AddressBook.Builder addressBook = AddressBook.newBuilder();
    ... ...
    addressBook.addPerson(
      PromptForAddress(new BufferedReader(new InputStreamReader(System.in)),
                       System.out));
    ... ...                   
    FileOutputStream output = new FileOutputStream("hadoop-rpc/target/ADDRESS_BOOK_FILE");
    addressBook.build().writeTo(output);
    
    ADDRESS_BOOK_FILE is then the serialized byte-stream file
    [Figure 2: the ADDRESS_BOOK_FILE output]

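III. Basic Protobuf Message operations (Local)

Message is the parent type of every message.

Method                                  Description
AddressBook#newBuilder()                Creates a builder for an AddressBook message
AddressBook#parseFrom()                 Parses a message from a stream or a byte array
AbstractMessage.Builder#mergeFrom()     Merges input into the message being built (for byte streams that do not map to exactly one complete message)
AbstractMessageLite#writeTo()           Writes the message to an output stream

Note: any byte stream that cannot be parsed directly into a message is parsed in pieces or merged in the Builder.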


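IV. Using Protobuf (Stub + Server)

Note: Stub is the client-side stub and Server is the server side; Protobuf provides code for both.

1. Flow chart

A brief description of the overall flow:
The client creates a Stub and sends requests to the Server through a channel.
The channel is where the request data is packaged. The request is typed as Message; the channel does not need to know its concrete class and only passes it on to the server for execution.
The server determines the Message type and the method to execute, finds the matching pre-registered service, and delegates to a Handler. The Handler processes the data and returns it to the client, which obtains the server's response by deserializing it in the channel.
[Figure 3: flow chart]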

2. Generating interface classes from a .proto file that defines a service

package rpctest;

option java_package = "com.laozhaer.tutorialweb.protogen.msg";
option java_generic_services = true;
option java_generate_equals_and_hash = true;
option java_multiple_files = true;


message HelloRequest {
    required string msg = 1;
}

message HelloResponse {
    required string msg = 2;
}

service HelloService {
    rpc greet(HelloRequest) returns (HelloResponse);
}

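To have the .proto file generate Service classes, add option java_generic_services = true;
Running protoc.exe then generates the Service classes. The client sends a request through greet; the server delegates to a handler, and the handler's greet produces the reply.

[Figure 4: the generated Service classes]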

3. Implementation and source walkthrough

  1. Client implementation:

        HelloService.Stub clientStub = HelloService.newStub(channel);
        clientStub.greet(controller,request,done);
    

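    The client only needs to implement the channel and call greet to interact with the server. The channel is just a tunnel; how the data is serialized and deserialized is left to the user to implement.
    clientStub.greet essentially delegates to the channel.

    Inside clientStub.greet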

        public  void greet(
        com.google.protobuf.RpcController controller,
        com.laozhaer.tutorialweb.protogen.msg.HelloRequest request,
        com.google.protobuf.RpcCallback<com.laozhaer.tutorialweb.protogen.msg.HelloResponse> done) {
      channel.callMethod(
        getDescriptor().getMethods().get(0),
        controller,
        request,
        com.laozhaer.tutorialweb.protogen.msg.HelloResponse.getDefaultInstance(),
        com.google.protobuf.RpcUtil.generalizeCallback(
          done,
          com.laozhaer.tutorialweb.protogen.msg.HelloResponse.class,
          com.laozhaer.tutorialweb.protogen.msg.HelloResponse.getDefaultInstance()));
    }
    }
    
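  2. Channel implementation
    The channel is implemented by opening a socket and serializing and deserializing over it. One thing to note: the server does not know the total length of the incoming byte stream, nor which Handler should process it, so the channel has to build a header and a body for the data it sends. Typically the first 4 bytes (the size of one int) give the length of the next segment (the header), and the header in turn gives the length of the segment after it (the body). The figure below illustrates the packet layout:
    [Figure 5: packet layout: 4-byte header length, header, body]
    Note: the header is itself a message. It tells the server which service handler should respond and which of the handler's methods to run.

    header.proto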

    
    package rpctest;
    
    option java_package = "com.laozhaer.tutorialweb.protogen.header";
    option java_generic_services = true;
    option java_generate_equals_and_hash = true;
    option java_multiple_files = true;
    
    
    message HelloHeader {
        required string serviceName = 1;
        required string methodName = 2;
        required int32 msgSize = 3;
    }
    

    Channel implementation

    public class RpcChannelImpl implements RpcChannel {
    
    /**
     * Builds the header; the concrete Message type does not need to be known here.
     * @param helloRequest the request message
     * @return the header describing the request
     */
    public HelloHeader createHeader(Message helloRequest){
        // fetch msg byte size
        int msgSize = helloRequest.getSerializedSize();
    
        HelloHeader.Builder builder = HelloHeader.newBuilder();
    
        return builder.setMsgSize(msgSize).setMethodName("greet").setServiceName("HelloService").build();
    
    }
    
    /**
     * The request and response types are already determined here.
     * @param method
     * @param controller
     * @param request
     * @param responsePrototype
     * @param done
     */
    @Override
    public void callMethod(Descriptors.MethodDescriptor method, RpcController controller, Message request, Message responsePrototype, RpcCallback<Message> done) {
    
        HelloHeader requestHeader = createHeader(request);
        int headerSize = requestHeader.getSerializedSize();
        // token: always 4 bytes, holding the header size as an int
        byte[] tokenBytes = ByteUtils.int2Byte(headerSize);
        // header
        byte[] headerBytes = requestHeader.toByteArray();
        System.out.println("headerSize:"+headerSize);
        // request
        byte[] sendRequestBytes = request.toByteArray();
        int sendRequestSize = request.getSerializedSize();
        System.out.println("requestSize:"+sendRequestSize);
        System.out.println("byte length"+sendRequestBytes.length);
    
        ByteBuffer buffer=ByteBuffer.allocate(4+headerSize+sendRequestSize);
        buffer.put(tokenBytes).put(headerBytes).put(sendRequestBytes);
        
    
        // try-with-resources closes the socket and its streams
        try (Socket client = new Socket("localhost", 8080);
             DataInputStream inputStream = new DataInputStream(client.getInputStream());
             OutputStream outputStream = new DataOutputStream(client.getOutputStream())) {
            outputStream.write(buffer.array());
            outputStream.flush();
            // assumes the whole response arrives in one read of at most 1024 bytes
            byte[] message = new byte[1024];
            int len = inputStream.read(message);
            if (len < 0) {
                throw new IOException("connection closed before a response arrived");
            }
            byte[] readInform = Arrays.copyOf(message, len);
            Message response = responsePrototype.getParserForType().parseFrom(readInform);
            done.run(response);
        } catch (IOException e) {
            e.printStackTrace();
        }
    
        System.out.println(method.getFullName());
    
    
    }
    }
    

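    Note that in the channel both the outgoing and the returned message have fixed types; only outgoing data needs the 4-byte prefix plus header as identification.
    The response can be parsed into a message directly from the byte stream, because only the server has to decide and delegate; the client side is a point-to-point exchange.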
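The request framing described in this step can be sketched with java.nio.ByteBuffer alone. FramingSketch is a hypothetical name, and the sketch assumes a big-endian length prefix (ByteBuffer's default); the article's ByteUtils helper is not shown, so its byte order is an assumption here. The header and body are treated as opaque bytes, and everything after the header is taken as the body, whereas the article reads the body length from HelloHeader.msgSize.

```java
import java.nio.ByteBuffer;

final class FramingSketch {
    // Frame layout: [4-byte big-endian header length][header bytes][body bytes].
    static byte[] encode(byte[] header, byte[] body) {
        return ByteBuffer.allocate(4 + header.length + body.length)
                .putInt(header.length)
                .put(header)
                .put(body)
                .array();
    }

    // Splits a frame back into {header, body}.
    // Assumes the buffer holds exactly one complete frame.
    static byte[][] decode(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        int headerSize = buf.getInt();          // first 4 bytes: header length
        byte[] header = new byte[headerSize];
        buf.get(header);                        // next headerSize bytes: header
        byte[] body = new byte[buf.remaining()];
        buf.get(body);                          // the rest: body
        return new byte[][] {header, body};
    }

    public static void main(String[] args) {
        byte[] frame = encode(new byte[] {1, 2, 3}, new byte[] {9, 8});
        byte[][] parts = decode(frame);
        System.out.println(frame.length + " bytes, header=" + parts[0].length
                + ", body=" + parts[1].length);
    }
}
```

On a real socket a single read may return fewer bytes than were sent, so a reader would typically use DataInputStream.readInt and readFully to collect exactly the announced number of bytes before parsing.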

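  3. Adapting the Handler on the server
    Before serving requests, the server registers all service handlers in advance. Here there is only the service.greet operation, so one registration is enough.
    For the Message passed in by the client Stub, the server has to find the matching handler to run it. This is where the service and method names carried in the header help the server delegate.

    Delegation code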

    public byte[] handleRequestData(byte[] data,Socket socket) throws InvalidProtocolBufferException {
        byte[] tokenBytesAfterSend = new byte[4];
        ByteBuffer wrap = ByteBuffer.wrap(data);
        //fetch token bytes array
        wrap.get(tokenBytesAfterSend, 0, 4);
        int headerSize = ByteUtils.byte2Int(tokenBytesAfterSend);
    
        //fetch serviceName & methodName from header
        byte[] headerBytes = new byte[headerSize];
        wrap.get(headerBytes, 0, headerSize);
        HelloHeader helloHeaderProto = HelloHeader.parseFrom(headerBytes);
        String serviceName = helloHeaderProto.getServiceName();
        String methodName = helloHeaderProto.getMethodName();
        System.out.println("serviceName is: "+serviceName);
        System.out.println("methodName is: "+methodName);
        //fetch request message
        int requestSize = helloHeaderProto.getMsgSize();
        byte[] requestBytes = new byte[requestSize];
        wrap.get(requestBytes,0,requestSize);
    
        //Here recognize service
        Service service = serviceManager.get(serviceName);
        if (service!=null){
            Descriptors.MethodDescriptor methodDescriptor = service.getDescriptorForType().findMethodByName(methodName);
            //fetch protoType
            Message prototype = service.getRequestPrototype(methodDescriptor);
            //fetch request
            Message requestMsg = prototype.newBuilderForType().mergeFrom(requestBytes).build();
            //build responseBuilder
            final Message.Builder responseBuilder =
                    service.getResponsePrototype(methodDescriptor).newBuilderForType();
    
            service.callMethod(methodDescriptor,null,requestMsg,new RpcCallback<Message>() {
                @Override
                public void run(Message message) {
                    if (message != null) {
                        responseBuilder.mergeFrom(message);
                    }
                }
            });
            return responseBuilder.build().toByteArray();
    
    
        }
        return null;
    };
    
    

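    In short: the server receives the client's packet and parses its first four bytes to get the length of the next segment, from which it obtains the header Message. The header gives the size of the body Message and also the information used to pick the methodDescriptor (so the service knows that the handler's greet should answer the client's greet). Finally, service.callMethod passes the body Message to the matching Handler method. Here the handler simply sends a text reply to the client in the callback and omits any real server-side work.

    Handler code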

    public class RpcHandlerImpl implements HelloService.Interface {
    @Override
    public void greet(RpcController controller, HelloRequest request, RpcCallback<HelloResponse> done) {
    
        done.run(HelloResponse.newBuilder().setMsg("you success!").build());
        System.out.println("doSomething");
        }
    }
    

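    Note: the Server never knows the concrete message type and should not downcast; it only deals with Message. Work on the concrete type belongs in the handler. The response handling here follows HBase's RPC: the response is received in a callback, which avoids a forced cast of the response Message.

    HBase source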

    final Message.Builder responseBuilder =
      service.getResponsePrototype(methodDesc).newBuilderForType();
    service.callMethod(methodDesc, controller, request, new RpcCallback<Message>() {
      @Override
      public void run(Message message) {
        if (message != null) {
          responseBuilder.mergeFrom(message);
        }
      }
    });
    
    if (coprocessorHost != null) {
      coprocessorHost.postEndpointInvocation(service, methodName, request, responseBuilder);
    }
    IOException exception =
      org.apache.hadoop.hbase.ipc.CoprocessorRpcUtils.getControllerException(controller);
    if (exception != null) {
      throw exception;
    }
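The server-side delegation in step 3 boils down to a lookup by service name and then by method name. The sketch below shows that lookup without Protobuf: DispatchSketch is a hypothetical name, handlers are plain functions from request bytes to response bytes, and the two-level map stands in for the article's serviceManager plus findMethodByName.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class DispatchSketch {
    // serviceName -> (methodName -> handler mapping request bytes to response bytes)
    private final Map<String, Map<String, Function<byte[], byte[]>>> services = new HashMap<>();

    // Pre-registers a handler, as the server does before it starts serving.
    void register(String serviceName, String methodName, Function<byte[], byte[]> handler) {
        services.computeIfAbsent(serviceName, k -> new HashMap<>()).put(methodName, handler);
    }

    // Returns null when the service or the method is unknown.
    byte[] dispatch(String serviceName, String methodName, byte[] requestBody) {
        Map<String, Function<byte[], byte[]>> methods = services.get(serviceName);
        if (methods == null) {
            return null;
        }
        Function<byte[], byte[]> handler = methods.get(methodName);
        return handler == null ? null : handler.apply(requestBody);
    }

    public static void main(String[] args) {
        DispatchSketch server = new DispatchSketch();
        server.register("HelloService", "greet",
                body -> "you success!".getBytes(StandardCharsets.UTF_8));
        byte[] reply = server.dispatch("HelloService", "greet", new byte[0]);
        System.out.println(new String(reply, StandardCharsets.UTF_8));
    }
}
```

The dispatcher never inspects the request bytes; only the registered handler interprets them, which mirrors the rule that the server works with Message and leaves concrete types to the handler.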
    

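V. Protobuf Service operations (Server)

Method                                  Description
HelloService#newReflectiveService       Wraps a Handler as a service
Service#getDescriptorForType            Gets the service's descriptor
ServiceDescriptor#findMethodByName      Gets the descriptor of one of the service's methods

Besides invoking the method, callMethod also takes a Controller and a Callback (on both the client and the server).
The Controller reports the call status, and the Callback performs the follow-up work for that status (the client closes the connection; the server responds to the client).

Abstract definition of the Controller (not implemented here)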

public interface RpcController {
  // -----------------------------------------------------------------
  // These calls may be made from the client side only.  Their results
  // are undefined on the server side (may throw RuntimeExceptions).

  /**
   * Resets the RpcController to its initial state so that it may be reused in
   * a new call.  This can be called from the client side only.  It must not
   * be called while an RPC is in progress.
   */
  void reset();

  /**
   * After a call has finished, returns true if the call failed.  The possible
   * reasons for failure depend on the RPC implementation.  {@code failed()}
   * most only be called on the client side, and must not be called before a
   * call has finished.
   */
  boolean failed();

  /**
   * If {@code failed()} is {@code true}, returns a human-readable description
   * of the error.
   */
  String errorText();

  /**
   * Advises the RPC system that the caller desires that the RPC call be
   * canceled.  The RPC system may cancel it immediately, may wait awhile and
   * then cancel it, or may not even cancel the call at all.  If the call is
   * canceled, the "done" callback will still be called and the RpcController
   * will indicate that the call failed at that time.
   */
  void startCancel();

  // -----------------------------------------------------------------
  // These calls may be made from the server side only.  Their results
  // are undefined on the client side (may throw RuntimeExceptions).

  /**
   * Causes {@code failed()} to return true on the client side.  {@code reason}
   * will be incorporated into the message returned by {@code errorText()}.
   * If you find you need to return machine-readable information about
   * failures, you should incorporate it into your response protocol buffer
   * and should NOT call {@code setFailed()}.
   */
  void setFailed(String reason);

  /**
   * If {@code true}, indicates that the client canceled the RPC, so the server
   * may as well give up on replying to it.  This method must be called on the
   * server side only.  The server should still call the final "done" callback.
   */
  boolean isCanceled();

  /**
   * Asks that the given callback be called when the RPC is canceled.  The
   * parameter passed to the callback will always be {@code null}.  The
   * callback will always be called exactly once.  If the RPC completes without
   * being canceled, the callback will be called after completion.  If the RPC
   * has already been canceled when NotifyOnCancel() is called, the callback
   * will be called immediately.
   *
   * <p>{@code notifyOnCancel()} must be called no more than once per request.
   * It must be called on the server side only.
   */
  void notifyOnCancel(RpcCallback<Object> callback);
}

callMethod invokes the Handler, and the Handler does the work. A sketch:

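    @Override
    public void greet(RpcController controller, HelloRequest request, RpcCallback<HelloResponse> done) {
        HelloResponse response = null;
        try {
            // ... do the actual work and build the response ...
            response = HelloResponse.newBuilder().setMsg("you success!").build();
        } catch (RuntimeException e) {
            // report the failure through the controller; setFailed takes a reason string
            controller.setFailed(String.valueOf(e.getMessage()));
        }
        // the "done" callback is always invoked; it receives null on failure
        done.run(response);
        System.out.println("doSomething");
    }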
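The interplay between Controller and Callback can be tried out without Protobuf. The sketch below uses hypothetical stand-ins (CallbackSketch, Callback, Controller) rather than RpcController and RpcCallback, and it lets the same object answer failed() on the server side for brevity, which the real interface reserves for the client. The caller captures the callback's argument in a holder, the same idea as the responseBuilder in the HBase snippet.

```java
final class CallbackSketch {
    // Stand-in for RpcCallback<T>.
    interface Callback<T> {
        void run(T value);
    }

    // Minimal stand-in for the status-reporting part of RpcController.
    static final class Controller {
        private String error;

        void setFailed(String reason) { error = reason; }
        boolean failed() { return error != null; }
        String errorText() { return error; }
    }

    // Handler: on failure, mark the controller and still invoke the callback (with null).
    static void greet(Controller controller, String request, Callback<String> done) {
        if (request == null || request.isEmpty()) {
            controller.setFailed("empty request");
            done.run(null);
            return;
        }
        done.run("hello, " + request);
    }

    // Caller: captures whatever the handler passes to the callback.
    static String call(Controller controller, String request) {
        String[] holder = new String[1];
        greet(controller, request, value -> holder[0] = value);
        return holder[0];
    }

    public static void main(String[] args) {
        Controller ok = new Controller();
        System.out.println(call(ok, "bob") + " failed=" + ok.failed());

        Controller bad = new Controller();
        System.out.println(call(bad, "") + " failed=" + bad.failed()
                + " reason=" + bad.errorText());
    }
}
```

Because the result travels through the callback and the status through the controller, the caller never needs to cast the handler's return value.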

VI. Complete code

Resource link
