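I plan to write a few articles recording my reading of the tensorflow source code. This one covers the preparation for reading the tensorflow code, and the reading of the tensorflow-serving code.
Before reading the code, it is worth setting up tools that make reading more efficient. CLion is recommended here.
CLion can be used to read the code even on Windows. Much of the code will not compile there, but that does not get in the way of reading it.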
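First, pick a version to read; this article uses 1.15. Create a repository named tf_read on GitHub, then download the tensorflow/tensorflow and tensorflow/serving code into that directory (downloading and extracting the corresponding tarballs is enough). The tf_read directory currently looks like this:
To read the code efficiently in CLion, some work has to be done in advance: generating the protobuf outputs, moving the unit-test files out of the way, excluding "unrelated" files, and adding build targets.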
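When CLion analyzes the code it looks up the included header files. If the headers generated from the proto files do not exist, code parsing suffers and code navigation becomes less accurate.
So the proto files have to be compiled by hand. The output should go into a dedicated directory, and that directory should be removed from the full-text search path, so that searching for a proto member does not return a pile of generated code.
Create a pb_out directory in the root directory to hold the proto compilation output. In the serving-1.15.0 directory, create a symlink pointing to tensorflow-1.15.0/tensorflow (for example, ln -s ../tensorflow-1.15.0/tensorflow tensorflow), because some of serving's protos depend on tensorflow's protos. Then run the following command in both serving-1.15.0 and tensorflow-1.15.0 to compile the protos by hand: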
protoc --cpp_out=../pb_out `find . -name '*.proto'`
Then, in the serving-1.15.0 directory only, run the gRPC compilation command:
protoc --grpc_out=. --plugin=protoc-gen-grpc=`which grpc_cpp_plugin` `find tensorflow_serving/apis/ -name '*.proto'`
Once the commands above have finished, select the pb_out directory and choose right-click > Mark Directory as > Excluded. Then add include_directories(pb_out) to CMakeLists.txt.
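As a sanity check, something like the following sketch can list the proto files that did not get a generated header. The function name missing_pb_headers is made up for this example, and it assumes the output layout produced by the protoc command above, where each x.proto yields an x.pb.h at the same relative path under the output directory.

```shell
# Hypothetical helper: print every .proto under SRC_DIR that has no
# matching .pb.h under OUT_DIR. Prints nothing when all headers exist.
# OUT_DIR must be absolute or relative to the directory you call this from.
missing_pb_headers() {
    src_dir=$1
    out_dir=$2
    # List protos relative to SRC_DIR so the paths line up with OUT_DIR.
    (cd "${src_dir}" && find . -name '*.proto') | while IFS= read -r proto
    do
        header="${out_dir}/${proto%.proto}.pb.h"
        if [ ! -f "${header}" ]; then
            echo "${proto}"
        fi
    done
}
```

Run from the tf_read root, missing_pb_headers serving-1.15.0 pb_out should print nothing if every serving proto was compiled.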
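Because tensorflow and serving are built with bazel, unit-test files live in the same directories as the source files, which gets in the way of CLion's full-text search and "Find Usages". It is best to move the "_test.cc" and "_benchmark.cc" files into a dedicated directory.
Create a test_dir directory in the root directory, then write a script that moves these two kinds of files into it: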
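# Move unit-test and benchmark sources into test_dir, keeping their relative paths.
# test_dir itself is pruned so the script can be re-run safely.
find . -path ./test_dir -prune -o \( -name '*_test.cc' -o -name '*_benchmark.cc' \) -print |
while IFS= read -r filename
do
    origin_dir=$(dirname "${filename}")
    target_dir="test_dir/${origin_dir}"
    echo "${filename} -> ${target_dir}"
    mkdir -p "${target_dir}"
    mv "${filename}" "${target_dir}/"
done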
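After running the script, the unit-test and benchmark code has been moved under test_dir. Then right-click test_dir > Mark Directory as > Excluded, so that the tests no longer interfere with code navigation.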
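To confirm that nothing was left behind, a small check along these lines can help. This is only a sketch: remaining_tests is a name invented for this example, and it assumes test_dir sits directly under the root you pass in.

```shell
# Hypothetical helper: list *_test.cc and *_benchmark.cc files that are
# still outside ROOT/test_dir. Empty output means the move is complete.
remaining_tests() {
    root=${1:-.}
    find "${root}" -path "${root}/test_dir" -prune -o \
        \( -name '*_test.cc' -o -name '*_benchmark.cc' \) -print
}
```

Calling remaining_tests with no argument from the tf_read root checks the whole tree.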
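To keep unrelated code out of full-text search results, it is best to exclude the py, java, go, lite and similar directories outright (right-click > Mark Directory as > Excluded).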
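If it is not obvious which directories those are, a rough count of files per directory can point to them. The sketch below is one way to do it; count_by_dir is a made-up name, and grouping by the first two path levels is an arbitrary choice for this example.

```shell
# Hypothetical helper: count files with extension EXT under ROOT,
# grouped by the first two directory levels, largest group first.
count_by_dir() {
    root=$1
    ext=$2
    (cd "${root}" && find . -type f -name "*.${ext}") |
        cut -d/ -f2-3 | sort | uniq -c | sort -rn
}
```

For example, count_by_dir tensorflow-1.15.0 py prints one line per directory with its number of .py files, which gives a shortlist of directories worth excluding.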
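tensorflow is built with bazel. CLion does have a bazel plugin, but bazel crashes often and the experience is not great, so cmake is used here instead.
Add the following to CMakeLists.txt to define the build targets (they are not meant to actually build anything; they only let CLion index the code for reading):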
set(CMAKE_CXX_STANDARD 17)
include_directories(tensorflow-1.15.0)
include_directories(serving-1.15.0)
aux_source_directory(serving-1.15.0/tensorflow_serving/apis SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/batching SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/config SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/core SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/model_servers SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/resources SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/servables/hashmap SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/servables/tensorflow SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/sources/storage_path SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/util SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/util/net_http/client SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/util/net_http/compression SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/util/net_http/internal SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/util/net_http/server/internal SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/util/net_http/server/public SERVER_SRC)
aux_source_directory(serving-1.15.0/tensorflow_serving/mytools SERVER_SRC)
add_executable(tf-server ${SERVER_SRC})
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/client TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/framework TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/gradients TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/ops TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/profiler TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/saved_model TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/tools TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/cc/training TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/api_def TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/common_runtime TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/common_runtime/data TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/common_runtime/eager TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/common_runtime/gpu TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/common_runtime/sycl TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/debug TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/distributed_runtime TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/distributed_runtime/eager TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/distributed_runtime/rpc TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/framework TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/graph TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler/clusters TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler/costs TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler/graph_analyzer TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler/inputs TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler/optimizers TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler/utils TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/grappler/verifiers TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/batching_util TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/boosted_trees TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/boosted_trees/quantiles TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/data TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/fuzzing TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/hexagon TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/neon TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/rnn TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/kernels/tensor_forest TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/lib TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/lib/core TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/nccl TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/ops TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/ops/compat TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/platform TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/platform/cloud TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/platform/default TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/platform/hadoop TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/platform/posix TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/platform/profile_utils TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/platform/s3 TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/profiler TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/profiler/internal TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/profiler/lib TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/profiler/rpc TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/protobuf TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/summary TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/tpu TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/user_ops TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/util TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/util/ctc TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/util/proto TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/util/rpc TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/util/sparse TF_SRC)
aux_source_directory(tensorflow-1.15.0/tensorflow/core/util/tensor_bundle TF_SRC)
add_library(tf ${TF_SRC})
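Some third-party libraries are not part of the source tree and have to be installed by hand. Mac users can install them with brew and add include_directories(/usr/local/include) to CMakeLists.txt (on Apple Silicon, Homebrew's include directory is /opt/homebrew/include instead). Windows users can install them with conan, using the following conanfile.txt: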
[build_requires]
[requires]
protobuf/3.9.1
gflags/2.2.2
glog/0.5.0
abseil/20211102.0
eigen/3.4.0
[generators]
cmake_find_package
cmake_paths
[options]
Then add the following to CMakeLists.txt:
include(${CMAKE_BINARY_DIR}/conan_paths.cmake)
find_package(absl REQUIRED)
find_package(Protobuf REQUIRED)
find_package(Eigen3 REQUIRED)
include_directories(${absl_INCLUDE_DIR})
include_directories(${protobuf_INCLUDE_DIR})
get_filename_component(Eigen3_INCLUDE_DIR_P ${Eigen3_INCLUDE_DIR} DIRECTORY)
include_directories(${Eigen3_INCLUDE_DIR_P})
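That completes the preparation for reading tensorflow in CLion.
tensorflow-serving can be divided into the following layers:
tensorflow_serving/model_servers: the service layer, which provides the gRPC interface.
tensorflow_serving/core: manages the versions of a model.
tensorflow_serving/servables/tensorflow: the servable, that is, one version of a model.
The execution flow: the service layer receives a request, looks up the servable for the model name in aspired_versions_manager, calls Run on the session held by that servable to perform the inference, and finally returns the result.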
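The service layer provides three sets of gRPC interfaces:
Besides the gRPC interfaces, the service layer also provides an HTTP interface, so online inference can be done by simply POSTing JSON.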
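As a rough illustration of the HTTP interface, the sketch below posts a JSON body to the REST predict endpoint with curl. The host, the port 8501, the model name my_model and the input values are all assumptions made for this example: the port is whatever was passed to --rest_api_port when the server was started, and the shape of "instances" depends on the model's signature. The two function names are made up as well.

```shell
# Build the REST predict URL: http://HOST:PORT/v1/models/MODEL:predict
# (host, port and model name are placeholders supplied by the caller).
predict_url() {
    printf 'http://%s:%s/v1/models/%s:predict' "$1" "$2" "$3"
}

# POST a JSON body (4th argument) to the predict endpoint.
# Needs curl and a running model server with the REST port enabled.
rest_predict() {
    curl -s -X POST -d "$4" "$(predict_url "$1" "$2" "$3")"
}
```

With a server running locally, a call might look like rest_predict localhost 8501 my_model '{"instances": [1.0, 2.0, 5.0]}'.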
There is also a prometheus endpoint for monitoring.
Taking prediction as an example, the call flow is as follows:
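This layer, the resource management layer (tensorflow_serving/core), is where serving manages model data. Serving supports two setups: single-model and multi-model.
Single-model means the whole service holds the data of just one model; multi-model means the service holds the data of several models.
Model loading in serving also comes in two modes: static loading and dynamic loading.
Static loading loads the model data when the service starts and does not reload it when the model files are updated. The code is in tensorflow_serving/core/static_manager.h, but it does not seem to be used.
Dynamic loading loads models while the service is running, and automatically loads the new model when the model data is updated.
The model loading policies include:
The number of load and unload threads can be configured to control how fast models are loaded and unloaded.
The flow for obtaining a servable: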
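Here HandlesMap is an unordered_multimap whose key is a ServableRequest and whose value is a LoaderHarness.
A ServableRequest does not have to carry a version; thanks to a custom hash function, a lookup with such a request always resolves to the latest version.
The components under ServerCore:
TODO: analyze the logic of the resource management layer in detail.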
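A servable corresponds to one version of one model. Several classes are involved here:
Their relationship is as follows: the manager stores SharedPtrHandle objects, and the matching SharedPtrHandle is looked up with a ServableRequest. At use time, a ServableHandle wraps the retrieved SharedPtrHandle and takes out the SavedModelBundle maintained by the Loader inside it.
Below that is tensorflow's own code.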