Fixing a TensorFlow SavedModelBuilder bug

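Until now I exported and froze my trained models with tf.train.Saver(). This time, to work with TensorFlow Serving, I switched to tf.saved_model.builder.SavedModelBuilder(). After a day of trial and error I got the builder's save and load working. Exporting on my own machine and loading the model back for inference worked fine, but once the model was deployed under Serving I ran into the following error (pasting it as a quote came out too messy, so it is pasted as a code block):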

Traceback (most recent call last):
  File "/home/max/code/tf-serving/serving/tensorflow_serving/example/inception_client_gold.py", line 64, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/max/code/tf-serving/serving/tensorflow_serving/example/inception_client_gold.py", line 59, in main
    result = stub.Predict(request, 10.0)  # 10 secs timeout
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 309, in __call__
    self._request_serializer, self._response_deserializer)
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 195, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="NodeDef mentions attr 'dilations' not in Op output:T; attr=T:type,allowed=[DT_HALF, DT_FLOAT]; attr=strides:list(int); attr=use_cudnn_on_gpu:bool,default=true; attr=padding:string,allowed=["SAME", "VALID"]; attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]>; NodeDef: MobileNet/conv_1/Conv2D = Conv2D[T=DT_FLOAT, _output_shapes=[[1,112,112,32]], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_images_0_0, MobileNet/conv_1/weights/read). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
     [[Node: MobileNet/conv_1/Conv2D = Conv2D[T=DT_FLOAT, _output_shapes=[[1,112,112,32]], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_images_0_0, MobileNet/conv_1/weights/read)]]")

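The cause is an incompatibility between different TensorFlow builds (different versions, with or without GPU support): the exported graph carries a 'dilations' attr on its Conv2D nodes, and the Conv2D op in the Serving binary does not know that attr. If upgrading TensorFlow everywhere to the latest version still does not fix it, pass strip_default_attrs=True when exporting.
The argument goes in SavedModelBuilder.add_meta_graph_and_variables() or SavedModelBuilder.add_meta_graph(), like this: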

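import tensorflow as tf  # TensorFlow 1.x API

# saved_model_dir, sess, input_tensor_info and output_tensor_info are assumed
# to be defined earlier in the export script.
builder = tf.saved_model.builder.SavedModelBuilder(saved_model_dir)
builder.add_meta_graph_and_variables(sess, ["serve"], signature_def_map={
    "model": tf.saved_model.signature_def_utils.build_signature_def(
        inputs={"images": input_tensor_info},
        outputs={"scores": output_tensor_info},
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)
}, strip_default_attrs=True)  # drop default-valued attrs from the NodeDefs
builder.save()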
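
To see why stripping default-valued attrs gets rid of this error, here is a rough conceptual sketch in plain Python. It is not TensorFlow's actual implementation: plain dicts stand in for a NodeDef's attrs, and the names EXPORTER_CONV2D_DEFAULTS, SERVING_CONV2D_ATTRS, strip_default_attrs and unknown_attrs are made up for illustration. The attr names and values are taken from the error message above; the defaults listed for the exporting side are an assumption.

```python
# Conceptual sketch only: dicts stand in for NodeDef attrs and OpDef defaults.

# Assumed defaults that the exporting (newer) TensorFlow has for Conv2D.
EXPORTER_CONV2D_DEFAULTS = {
    "use_cudnn_on_gpu": True,
    "data_format": "NHWC",
    "dilations": [1, 1, 1, 1],
}

# Attrs that the serving (older) binary lists for Conv2D in the error message.
SERVING_CONV2D_ATTRS = {"T", "strides", "use_cudnn_on_gpu", "padding", "data_format"}


def strip_default_attrs(node_attrs, op_defaults):
    """Return a copy of node_attrs without entries equal to the op's default."""
    return {
        name: value
        for name, value in node_attrs.items()
        if name not in op_defaults or op_defaults[name] != value
    }


def unknown_attrs(node_attrs, known_attrs):
    """Attrs on the node that the loading binary does not know about."""
    return sorted(set(node_attrs) - set(known_attrs))


# The Conv2D node from the traceback, as written by the exporter.
node = {
    "T": "DT_FLOAT",
    "strides": [1, 2, 2, 1],
    "padding": "SAME",
    "use_cudnn_on_gpu": True,
    "data_format": "NHWC",
    "dilations": [1, 1, 1, 1],
}

# Without stripping, the older binary trips over 'dilations'.
print(unknown_attrs(node, SERVING_CONV2D_ATTRS))

# After stripping, only non-default attrs remain and nothing is unknown.
stripped = strip_default_attrs(node, EXPORTER_CONV2D_DEFAULTS)
print(stripped)
print(unknown_attrs(stripped, SERVING_CONV2D_ATTRS))
```

Note that this only helps while the new attr keeps its default value. A Conv2D with a non-default dilation would still carry the attr and still fail on the older binary.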

References: https://github.com/tensorflow/tensorflow/issues/14884
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/README.md
