Installing TensorFlow and Related Development Environments on Ubuntu Linux

1. Basic support
    [required] sudo apt-get install build-essential
    [required] sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
    [optional] sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev libgtkglext1-dev libgtk-3-dev
    [optional] sudo apt install libcanberra-gtk-module libcanberra-gtk3-module


2. Support OpenMP
    sudo apt install libomp-dev

3. Support Wayland
    sudo nano /etc/gdm3/custom.conf
    >> WaylandEnable=true
    sudo systemctl restart gdm3

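4. Support OCR
    >> Tesseract is an open-source OCR engine. Hewlett-Packard originally developed it as the OCR engine for its flatbed scanners and open-sourced it in 2005; Google later took over maintenance.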
    sudo add-apt-repository ppa:alex-p/tesseract-ocr
    sudo apt-get update 
    sudo apt-get install tesseract-ocr
    >> Language data: Tesseract can recognize more than 60 languages. Before use, download the language data for each language you need from https://github.com/tesseract-ocr/tessdata
    (the tessdata repository was last updated in 2021; its files are generally about 5 years old)
    >> After downloading, place the .traineddata files in the tessdata directory. The default path is /usr/share/tesseract-ocr/4.00/tessdata
    wget https://gitcode.net/mirrors/tesseract-ocr/tessdata/-/archive/4.1.0/tessdata-4.1.0.tar.gz
    tar xf tessdata-4.1.0.tar.gz
    sudo cp -a tessdata-4.1.0/*.traineddata /usr/share/tesseract-ocr/4.00/tessdata/
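
To confirm that the copy above worked, it can help to list which languages now have a .traineddata file. The following is a minimal sketch using only the Python standard library; the helper name installed_languages is hypothetical, and the default directory is simply the path quoted in these notes.

```python
from pathlib import Path

# Default tessdata location quoted in the notes; adjust if your install differs.
DEFAULT_TESSDATA = Path("/usr/share/tesseract-ocr/4.00/tessdata")


def installed_languages(tessdata_dir=DEFAULT_TESSDATA):
    """Return the sorted language codes that have a .traineddata file.

    Hypothetical helper: it only inspects file names and does not call Tesseract.
    """
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.traineddata"))
```

Calling installed_languages() with no argument inspects the default path; an empty list means the directory is missing or holds no language data.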

5. Install Intel® OpenCL
    ( https://github.com/intel/compute-runtime/releases )
    mkdir neo && cd neo
    # Download all *.deb packages
    wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.14062.11/intel-igc-core_1.0.14062.11_amd64.deb
    wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.14062.11/intel-igc-opencl_1.0.14062.11_amd64.deb
    wget https://github.com/intel/compute-runtime/releases/download/23.22.26516.18/intel-level-zero-gpu-dbgsym_1.3.26516.18_amd64.ddeb
    wget https://github.com/intel/compute-runtime/releases/download/23.22.26516.18/intel-level-zero-gpu_1.3.26516.18_amd64.deb
    wget https://github.com/intel/compute-runtime/releases/download/23.22.26516.18/intel-opencl-icd-dbgsym_23.22.26516.18_amd64.ddeb
    wget https://github.com/intel/compute-runtime/releases/download/23.22.26516.18/intel-opencl-icd_23.22.26516.18_amd64.deb
    wget https://github.com/intel/compute-runtime/releases/download/23.22.26516.18/libigdgmm12_22.3.0_amd64.deb 
    # Verify sha256 sums for packages
    wget https://github.com/intel/compute-runtime/releases/download/23.22.26516.18/ww22.sum
    sha256sum -c ww22.sum
    # Install all packages as root
    sudo dpkg -i *.deb
    # In case of installation problems, please install required dependencies, for example:
    sudo apt install ocl-icd-libopencl1


6. Install Intel® oneAPI MKL (Math Kernel Library)
    6.1 Online Installer
    ( https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/apt.html )
        # download the key to system keyring
        wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
        # add signed entry to apt sources and configure the APT client to use Intel repository:
        echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
        sudo apt update
        sudo apt install intel-basekit 
        # Intel® oneAPI HPC Toolkit
        sudo apt install intel-hpckit
        # Intel® oneAPI IoT Toolkit
        sudo apt install intel-iotkit
        # Intel® oneAPI DL Framework Developer Toolkit
        sudo apt install intel-dlfdkit 
        # Intel® AI Analytics Toolkit
        sudo apt install intel-aikit
        # Intel® oneAPI Rendering Toolkit
        sudo apt install intel-renderkit

    6.2 Offline Installer
    ( https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html )
        +------------------------------------------------------------------------
        | Recommended for host machines with poor or no internet connection
        | Size     913.25 MB
        | Version     2023.2.0
        | Date     July 13, 2023
        | SHA384     f5cc20cdd92ab961693c7649fb0b046937ae8aae92eb1464090a187816e7bad3ccd6ef5bf90924226d5f4d1314fe57ab
        +------------------------------------------------------------------------
        wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/adb8a02c-4ee7-4882-97d6-a524150da358/l_onemkl_p_2023.2.0.49497_offline.sh
        sudo sh ./l_onemkl_p_2023.2.0.49497_offline.sh
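
The box above publishes a SHA384 value but the notes do not show how to check it. A minimal sketch of that check follows, assuming the published value belongs to the file you actually downloaded (the notes do not say which installer it refers to). The function names are hypothetical; hashlib.sha384 is part of the Python standard library.

```python
import hashlib


def sha384_of(path, chunk_size=1 << 20):
    """Compute the SHA-384 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha384()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's digest with a published value, ignoring case and spaces."""
    return sha384_of(path) == published_hex.strip().lower()
```

If matches_published returns False, do not run the installer with sudo; download it again or check that the value matches the right file.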
 
7. Build OpenCV with opencv_contrib
    # Install minimal prerequisites (Ubuntu 18.04 as reference)
    sudo apt update && sudo apt install -y cmake g++ wget unzip
    # Download and unpack sources
    wget -O opencv.zip https://github.com/opencv/opencv/archive/4.x.zip
    wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/4.x.zip
    unzip opencv.zip
    unzip opencv_contrib.zip
    # Create build directory and switch into it
    mkdir -p build && cd build
    # Configure (default build type)
    cmake -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib-4.x/modules ../opencv-4.x
    # or configure a Debug build instead
    cmake -DCMAKE_BUILD_TYPE=Debug -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib-4.x/modules ../opencv-4.x
    # Build
    cmake --build . -j 8
    or 
    make -j8

8. Install TensorFlow (https://tensorflow.google.cn/install/source?hl=zh-cn)
    pip install tensorflow -i https://pypi.tuna.tsinghua.edu.cn/simple
    or
    pip install tensorflow==2.13 -i https://pypi.tuna.tsinghua.edu.cn/simple
    In order to run facenet, install tensorflow 2.7.3
    pip install tensorflow==2.7.3 -i https://pypi.tuna.tsinghua.edu.cn/simple
    
    
    TensorFlow is installed in ~/.local/lib/python3.8/site-packages/tensorflow
    A C++ project should therefore use ~/.local/lib/python3.8/site-packages/tensorflow/include as its include directory
    and ~/.local/lib/python3.8/site-packages/tensorflow as its library path.
    
    cd ~/.local/lib/python3.8/site-packages/tensorflow
    ln -s libtensorflow_cc.so.2        libtensorflow_cc.so
    ln -s libtensorflow_framework.so.2 libtensorflow_framework.so

    cd ~/.local/lib/python3.8/site-packages/numpy.libs
    ln -s libopenblas64_p-r0-15028c96.3.21.so  libopenblas.so
    ln -s libquadmath-96973f99.so.0.0.0        libquadmath.so
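
The python3.8 paths above are specific to one machine. As a hedged sketch, the package directory can be looked up instead of hard-coded; importlib.util.find_spec locates a top-level package without importing it. The helper names package_dir and tensorflow_paths are hypothetical, and the include/lib layout is the one described in these notes, not something this code verifies.

```python
import importlib.util
from pathlib import Path


def package_dir(name):
    """Return the directory of an installed top-level package, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def tensorflow_paths():
    """Return the include and lib directories as laid out in the notes, or None."""
    root = package_dir("tensorflow")
    if root is None:
        return None
    return {"include": root / "include", "lib": root}
```

When TensorFlow itself can be imported, tf.sysconfig.get_include() and tf.sysconfig.get_lib() report the same kind of information.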

9. Compile the TensorFlow C++ library (https://blog.csdn.net/MOU_IT/article/details/87976152)
    9.1 install protobuf
        wget https://github.com/protocolbuffers/protobuf/releases/download/v3.7.1/protobuf-cpp-3.7.1.tar.gz
        tar -xzvf protobuf-cpp-3.7.1.tar.gz
        cd protobuf-3.7.1
        sudo apt-get install automake libtool
        ./autogen.sh
        ./configure
        make
        sudo make install
        sudo ldconfig
        # sudo make uninstall   # uninstall, for when the wrong version was installed
        protoc --version  # show the protobuf version
        
    9.2 Install bazel 
        +--------------------------------+--------------------------+
        |    tensorflow 2.15.0 rc        |    Bazel 6.1.0           |
        |    tensorflow 2.13.0           |    Bazel 5.3.0           |
        +--------------------------------+--------------------------+
        Bazel is an open-source build tool from Google. It is widely used inside Google, including for the TensorFlow project.
        # prepare tools
        sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python
        # download bazel 5.3.0
#        wget https://github.com/bazelbuild/bazel/releases/download/6.1.0/bazel-6.1.0-installer-linux-x86_64.sh
        wget https://github.com/bazelbuild/bazel/releases/download/5.3.0/bazel-5.3.0-installer-linux-x86_64.sh
        sudo chmod +x bazel-5.3.0-installer-linux-x86_64.sh
        # install bazel to $HOME/.bazel/bin
        ./bazel-5.3.0-installer-linux-x86_64.sh --user
        export PATH="$PATH:$HOME/bin"
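
The version table above has to be kept in sync by hand. Once the source tree from step 9.3 is cloned, the required Bazel version can usually be read from the tree itself. The sketch below assumes the checkout has a .bazelversion file at its root, which is an assumption about the repository layout and not something stated in these notes; the function name is hypothetical.

```python
from pathlib import Path


def pinned_bazel_version(source_dir):
    """Return the first non-comment line of <source_dir>/.bazelversion, or None.

    Assumes the file, if present, holds a version string such as 5.3.0,
    possibly followed by comment lines starting with '#'.
    """
    version_file = Path(source_dir) / ".bazelversion"
    if not version_file.is_file():
        return None
    for line in version_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    return None
```

If the function returns None, fall back to the table above.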
        
    9.3 Download TensorFlow
        git clone --recursive https://github.com/tensorflow/tensorflow.git
        cd tensorflow
        # switch to the branch you want to build
        git checkout r2.13  # r1.9, r1.10, etc.
        
    9.4 Build the TensorFlow libraries (libtensorflow_cc.so & libtensorflow_framework.so)
        ./configure
        bazel build --config=opt                       //tensorflow:libtensorflow_cc.so
        bazel build --config=opt --config=cuda         //tensorflow:libtensorflow_cc.so # CUDA build; only with an NVIDIA GPU
        bazel build -c opt --copt=-msse3 --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma //tensorflow/tools/pip_package:build_pip_package

        
        bazel build //tensorflow/tools/pip_package:build_pip_package
        bazel-bin/tensorflow/tools/pip_package/build_pip_package package/20230912
        pip uninstall  tensorflow
        pip install package/20230912/tensorflow-*.whl
        
    9.5 Install Eigen (https://eigen.tuxfamily.org/index.php?title=Main_Page)
    
        ## git clone https://github.com/eigenteam/eigen-git-mirror  ##  3.3.99 causes a compile error
        ## alternative to the tarball: git clone https://gitlab.com/libeigen/eigen.git
        # Install
        wget  https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.bz2
        tar xf eigen-3.4.0.tar.bz2
        cd eigen-3.4.0
        mkdir build && cd build
        cmake .. && sudo make install
        # After installation the headers are in /usr/local/include/eigen3/
        # Many programs write      #include <Eigen/Dense>
        #          rather than     #include <eigen3/Eigen/Dense>
        #          so add a symlink:
        sudo ln -s /usr/local/include/eigen3/Eigen /usr/local/include/Eigen
    9.6 Install ml_dtypes (https://pypi.org/project/ml-dtypes/) (https://github.com/jax-ml/ml_dtypes)
        ml_dtypes is a stand-alone implementation of several NumPy dtype extensions used in machine learning libraries.
        sudo apt install python3-pip
        sudo pip install ml_dtypes
        
        /usr/local/lib/python3.8/dist-packages/third_party/eigen/Eigen
        +--------------------------------------------------solve compile error ----------------------------------------------+
        | [ 50%] Building CXX object CMakeFiles/tf_test.dir/src/hello.cpp.o 
        | In file included from /home/rd/tensorflow/tensorflow/core/platform/float8.h:19,
        |          from /home/rd/tensorflow/tensorflow/core/platform/types.h:20,
        |          from /home/rd/tensorflow/tensorflow/core/platform/env_time.h:20,
        |          from /home/rd/tensorflow/tensorflow/core/platform/env.h:26,
        |          from /home/rd/tensorflow-test-prog/src/hello.cpp:1:
        |        /home/rd/tensorflow/tensorflow/tsl/platform/float8.h:19:10: fatal error: include/float8.h: No such file or directory
        |    19 | #include "include/float8.h"  // from @ml_dtypes 
        |       |          ^~~~~~~~~~~~~~~~~~
        | compilation terminated.
        +--------------------------------------------------------------------------------------------------------------------+
        
    9.7 Install absl
        wget https://github.com/abseil/abseil-cpp/archive/refs/tags/20230125.3.tar.gz
        tar  xf 20230125.3.tar.gz
        cd abseil-cpp-20230125.3
        mkdir build
        cd build
        cmake .. -DCMAKE_INSTALL_PREFIX=/usr/local
        make -j8
        sudo make install
        +--------------------------------------------------solve compile error ----------------------------------------------+
        | In file included from /home/rd/tensorflow/tensorflow/core/platform/cord.h:19,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/tstring.h:19,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/types.h:22,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/env_time.h:20,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/env.h:26,
        |                  from /home/rd/tensorflow-test-prog/src/hello.cpp:1:
        | /home/rd/tensorflow/tensorflow/tsl/platform/cord.h:21:10: fatal error: absl/strings/cord.h: No such file or directory
        |    21 | #include "absl/strings/cord.h"  // IWYU pragma: export
        |       |          ^~~~~~~~~~~~~~~~~~~~~
        +--------------------------------------------------------------------------------------------------------------------+
        
    9.8 Install protoc & protobuf 
        sudo apt-get install autoconf automake libtool curl make g++ unzip
        wget https://github.com/protocolbuffers/protobuf/releases/download/v24.2/protobuf-24.2.tar.gz
        tar xf protobuf-24.2.tar.gz
        cd protobuf-24.2
        cp -a ../abseil-cpp-20230125.3  third_party/abseil-cpp  # assumes abseil (step 9.7) was unpacked next to protobuf-24.2
        mkdir build && cd build
        cmake -Dprotobuf_BUILD_TESTS=OFF ..
        make -j8
        sudo make install
        sudo ldconfig # refresh shared library cache.
         -- or --
        wget https://github.com/protocolbuffers/protobuf/releases/download/v24.2/protoc-24.2-linux-x86_64.zip
        unzip protoc-24.2-linux-x86_64.zip
        sudo cp bin/protoc /usr/local/bin/
        sudo cp -a include/google /usr/local/include/
        +--------------------------------------------------solve compile error ----------------------------------------------+
        |     In file included from /home/rd/tensorflow/tensorflow/tsl/platform/status.h:39,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/status.h:23,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/errors.h:27,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/env.h:27,
        |                  from /home/rd/tensorflow-test-prog/src/hello.cpp:1:
        | /home/rd/tensorflow/tensorflow/tsl/protobuf/error_codes.pb.h:11:10: fatal error: google/protobuf/port_def.inc: No such file or directory
        |    11 | #include "google/protobuf/port_def.inc"
        |       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        | compilation terminated.
        | make[2]: *** [CMakeFiles/tf_test.dir/build.make:63:CMakeFiles/tf_test.dir/src/hello.cpp.o] Error 1
        | make[1]: *** [CMakeFiles/Makefile2:76:CMakeFiles/tf_test.dir/all] Error 2
        | make: *** [Makefile:84:all] Error 2
        +--------------------------------------------------------------------------------------------------------------------+
        
    9.9 Generate error_codes.pb.h, error_codes.pb.cc, and the other protobuf sources
        cd ~/tensorflow
        ls tensorflow/tsl/protobuf
            bfc_memory_map.proto  coordination_config.proto   distributed_runtime_payloads.proto  error_codes.proto  rpc_options.proto  test_log.proto
            BUILD                 coordination_service.proto  dnn.proto                           histogram.proto    status.proto
        protoc  --cpp_out=.  tensorflow/tsl/protobuf/*.proto
        ls tensorflow/tsl/protobuf/
            bfc_memory_map.pb.cc       coordination_config.proto           distributed_runtime_payloads.proto  error_codes.proto  rpc_options.proto  test_log.proto
            bfc_memory_map.pb.h        coordination_service.pb.cc          dnn.pb.cc                           histogram.pb.cc    status.pb.cc
            bfc_memory_map.proto       coordination_service.pb.h           dnn.pb.h                            histogram.pb.h     status.pb.h
            BUILD                      coordination_service.proto          dnn.proto                           histogram.proto    status.proto
            coordination_config.pb.cc  distributed_runtime_payloads.pb.cc  error_codes.pb.cc                   rpc_options.pb.cc  test_log.pb.cc
            coordination_config.pb.h   distributed_runtime_payloads.pb.h   error_codes.pb.h                    rpc_options.pb.h   test_log.pb.h
            
        protoc  --cpp_out=.  tensorflow/*/*/*.proto
        protoc  --cpp_out=.  tensorflow/*/*/*/*.proto
        +--------------------------------------------------solve compile error ----------------------------------------------+
        |             [ 50%] Building CXX object CMakeFiles/tf_test.dir/src/hello.cpp.o
        | In file included from /home/rd/tensorflow/tensorflow/core/platform/status.h:23,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/errors.h:27,
        |                  from /home/rd/tensorflow/tensorflow/core/platform/env.h:27,
        |                  from /home/rd/tensorflow-test-prog/src/hello.cpp:1:
        | /home/rd/tensorflow/tensorflow/tsl/platform/status.h:39:10: fatal error: tensorflow/tsl/protobuf/error_codes.pb.h: No such file or directory
        |    39 | #include "tensorflow/tsl/protobuf/error_codes.pb.h"
        |       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        | compilation terminated.
        | make[2]: *** [CMakeFiles/tf_test.dir/build.make:63:CMakeFiles/tf_test.dir/src/hello.cpp.o] Error 1
        | make[1]: *** [CMakeFiles/Makefile2:76:CMakeFiles/tf_test.dir/all] Error 2
        | make: *** [Makefile:84:all] Error 2
        +--------------------------------------------------------------------------------------------------------------------+
        | [ 50%] Building CXX object CMakeFiles/tf_test.dir/src/hello.cpp.o
        | In file included from /home/rd/tensorflow-test-prog/src/hello.cpp:2:
        | /home/rd/tensorflow/tensorflow/core/public/session.h:24:10: fatal error: tensorflow/core/framework/device_attributes.pb.h: No such file or directory
        |    24 | #include "tensorflow/core/framework/device_attributes.pb.h"
        |       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        | compilation terminated.
        | make[2]: *** [CMakeFiles/tf_test.dir/build.make:63:CMakeFiles/tf_test.dir/src/hello.cpp.o] Error 1
        | make[1]: *** [CMakeFiles/Makefile2:76:CMakeFiles/tf_test.dir/all] Error 2
        | make: *** [Makefile:84:all] Error 2
        +--------------------------------------------------------------------------------------------------------------------+
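
The fixed-depth globs in this step (tensorflow/*/*/*.proto and one level deeper) miss .proto files at any other depth. A minimal sketch of a recursive alternative follows. It only builds the protoc argument list and leaves running it to the caller; the function name and defaults are hypothetical. Some .proto files in the tree may still fail to compile because of imports this sketch does not resolve.

```python
from pathlib import Path


def protoc_command(source_root, subdir="tensorflow", out_dir="."):
    """Build a protoc command covering every .proto file under source_root/subdir.

    Paths are relative to source_root, so the command is meant to be run
    with source_root as the working directory, as in the notes.
    """
    root = Path(source_root)
    protos = sorted(
        p.relative_to(root).as_posix() for p in (root / subdir).rglob("*.proto")
    )
    if not protos:
        return []
    return ["protoc", f"--cpp_out={out_dir}", *protos]
```

A possible invocation is subprocess.run(protoc_command(root), cwd=root, check=True), with root set to the TensorFlow checkout (~/tensorflow in these notes).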
        
    9.10 Create symbolic links to libtensorflow_framework.so.2.15.0
        cd ~/tensorflow/bazel-bin/tensorflow
        ln -s libtensorflow_framework.so.2.15.0 libtensorflow_framework.so
        ln -s libtensorflow_framework.so.2.15.0 libtensorflow_framework.so.2
            lrwxrwxrwx  1 rd rd        33 Sep   6 16:43 libtensorflow_framework.so -> libtensorflow_framework.so.2.15.0
            lrwxrwxrwx  1 rd rd        33 Sep   6 16:43 libtensorflow_framework.so.2 -> libtensorflow_framework.so.2.15.0
        +--------------------------------------------------solve compile error ----------------------------------------------+
        |     -- Build files have been written to: /home/rd/tensorflow-test-prog
        | [ 50%] Building CXX object CMakeFiles/tf_test.dir/src/hello.cpp.o
        | [100%] Linking CXX executable tf_test
        | /usr/bin/ld: cannot find -ltensorflow_framework
        | collect2: error: ld returned 1 exit status
        | make[2]: *** [CMakeFiles/tf_test.dir/build.make:84:tf_test] Error 1
        | make[1]: *** [CMakeFiles/Makefile2:76:CMakeFiles/tf_test.dir/all] Error 2
        | make: *** [Makefile:84:all] Error 2
        +--------------------------------------------------------------------------------------------------------------------+
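
The ln -s commands in this step fail if a link already exists, which is awkward after a rebuild. The sketch below creates the same relative links but skips names that are already present. The helper name is hypothetical, and the file names are passed in as arguments so nothing about the build output is assumed.

```python
import os
from pathlib import Path


def link_unversioned(lib_dir, versioned_name, link_names):
    """Create relative symlinks in lib_dir pointing at versioned_name.

    Existing files or links are left untouched. Returns the names created.
    """
    lib_dir = Path(lib_dir)
    if not (lib_dir / versioned_name).exists():
        raise FileNotFoundError(lib_dir / versioned_name)
    created = []
    for name in link_names:
        link = lib_dir / name
        if link.is_symlink() or link.exists():
            continue  # leave existing entries alone
        os.symlink(versioned_name, link)  # relative target, like `ln -s`
        created.append(name)
    return created
```

For the case in these notes the call would pass ~/tensorflow/bazel-bin/tensorflow (expanded with os.path.expanduser), "libtensorflow_framework.so.2.15.0", and the two link names libtensorflow_framework.so and libtensorflow_framework.so.2.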

10. Compile FaceRecognition_MTCNN_FaceNet
        git clone https://github.com/Chanstk/FaceRecognition_MTCNN_FaceNet.git
        cd FaceRecognition_MTCNN_FaceNet
        mkdir build && cd build
        cmake ..  && make
        
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  : error adding symbols: DSO missing from command line
        cause    : a shared library linked by main depends on another shared library that CMakeLists.txt does not link explicitly (a missing -lother)
        solution : CXX_FLAGS += -Wl,--copy-dt-needed-entries
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  : /usr/bin/ld: warning: libmkl_intel_lp64.so.2, needed by /usr/local/lib/libopencv_core.so.4.8.0, not found (try using -rpath or -rpath-link)
        solution : CMakeLists.txt : link_directories( /opt/intel/oneapi/mkl/2023.2.0/lib/intel64 )
         +--------------------------------------------------------------------------------------------------------------------+
         problem  : 2023-09-08 14:13:58.777362: E /home/rd/NN/FaceRecognition_MTCNN_FaceNet-master/src/main.cpp:107] Read proto
         solution : vim ../src/main.cpp 
                    - string graph_path = "./model/20170323-142841.pb";
                    + string graph_path = "./model/mtcnn_frozen_model.pb";
         +--------------------------------------------------------------------------------------------------------------------+
        problem  : AVX2 and FMA are not enabled
            2023-09-08 14:16:17.991925: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
            2023-09-08 14:16:18.070278: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
            To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
            2023-09-08 14:16:18.218639: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:375] MLIR V1 optimization pass is not enabled
            terminate called after throwing an instance of 'cv::Exception'
              what():  OpenCV(4.8.0) /home/rd/opencv/opencv-4.8.0/modules/core/src/arithm.cpp:652: error: (-215:Assertion failed) type2 == CV_64F && (sz2.height == 1 || sz2.height == 4) in function 'arithm_op'
        solution : mkdir model; cp ../../facenet-compare-cpp/facenet-compare-cpp-master/models/mtcnn_frozen_model.pb  model/
         +--------------------------------------------------------------------------------------------------------------------+
         (https://www.intel.com/content/www/us/en/developer/articles/guide/optimization-for-tensorflow-installation-guide.html)
         wget https://repo.anaconda.com/archive/Anaconda3-2023.07-2-Linux-x86_64.sh
        sudo chmod +x Anaconda3-2023.07-2-Linux-x86_64.sh 
        sudo ./Anaconda3-2023.07-2-Linux-x86_64.sh 

         sudo conda install intel-tensorflow -c intel
         
         [ Fail ] bazel build  --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 --copt=-march=corei7-avx --copt=-mtune=core-avx-i --copt=-O3 --copt=-Wformat --copt=-Wformat-security --copt=-fstack-protector --copt=-fPIC --copt=-fpic --linkopt=-znoexecstack --linkopt=-zrelro --linkopt=-znow --linkopt=-fstack-protector  //tensorflow/tools/pip_package:build_pip_package
         
         [ OK ] bazel clean
                bazel build -c opt --copt=-msse3 --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma //tensorflow/tools/pip_package:build_pip_package


11. Test facenet
        git clone https://github.com/davidsandberg/facenet.git

    11.1 Align the dataset

        for N in {1..4}; do \
        python3 src/align/align_dataset_mtcnn.py \
        ~/datasets/lfw/raw \
        ~/datasets/lfw/lfw_mtcnnpy_160 \
        --image_size 160 \
        --margin 32 \
        --random_order \
        --gpu_memory_fraction 0.25 \
        & done

        (https://github.com/davidsandberg/facenet/wiki/Validate-on-lfw)
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    AttributeError: module 'tensorflow' has no attribute 'GPUOptions'
                    AttributeError: module 'tensorflow' has no attribute 'Session'
                    AttributeError: module 'tensorflow' has no attribute 'variable_scope'
        cause    :    TensorFlow 1.x and 2.x are not API-compatible.
        Solution :    [-]sed -i "s/tf.GPUOptions/tf.compat.v1.GPUOptions/g"    */*.py  */*/*.py  */*/*/*.py
                    [-]sed -i "s/tf.Session/tf.compat.v1.Session/g"          */*.py  */*/*.py  */*/*/*.py
                    [-]sed -i "s/tf.ConfigProto/tf.compat.v1.ConfigProto/g"  */*.py  */*/*.py  */*/*/*.py
                    
                    [+]sed -i "s/import tensorflow as tf/import tensorflow.compat.v1 as tf/g"  */*.py */*/*.py */*/*/*.py
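
The sed globs above only reach files three directories deep. As a hedged alternative, the same import rewrite can be applied to every .py file under a directory. The sketch below performs the identical text substitution and nothing more; the function name is hypothetical, and it assumes the files are UTF-8 encoded.

```python
from pathlib import Path

OLD_IMPORT = "import tensorflow as tf"
NEW_IMPORT = "import tensorflow.compat.v1 as tf"


def switch_to_compat_v1(root):
    """Rewrite the TensorFlow import in every .py file under root.

    Returns the list of files that were changed. Running it twice is safe
    because the rewritten line no longer contains OLD_IMPORT.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        if OLD_IMPORT not in text:
            continue
        path.write_text(text.replace(OLD_IMPORT, NEW_IMPORT), encoding="utf-8")
        changed.append(path)
    return changed
```

Run it on a clean git checkout so that git diff shows exactly which files were touched.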

        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    ValueError: Object arrays cannot be loaded when allow_pickle=False
        cause    :    NumPy 1.16.3 and later default np.load to allow_pickle=False.
        Solution :    1.  pip install numpy==1.16.2
                    2.  vim src/align/detect_face.py +85
                        - data_dict = np.load(data_path, encoding='latin1'                   ).item() #pylint: disable=no-member
                        + data_dict = np.load(data_path, encoding='latin1', allow_pickle=True).item() #pylint: disable=no-member

        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    AttributeError: 'int' object has no attribute 'value'
        cause    :    TensorFlow 1.x and 2.x are not API-compatible.
        Solution :    vim src/align/detect_face.py +194
                    - feed_in, dim = (inp, input_shape[-1].value)
                    + feed_in, dim = (inp, input_shape[-1])

        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    AttributeError: scipy.misc is deprecated and has no attribute imread
                    AttributeError: scipy.misc is deprecated and has no attribute imresize.
        cause    :    The SciPy documentation states that imread is deprecated in SciPy 1.0.0 and will be removed in 1.2.0; use imageio.imread instead.
        Solution :    sudo pip3 install imageio
                    sed -i "s/misc.im/imageio.im/g"  */*.py  */*/*.py
                    sed -i "s/from scipy import misc/import imageio.v2 as imageio/g"  */*.py  */*/*.py   */*/*/*.py
                    vim src/align/align_dataset_mtcnn.py
                    + import imageio
                    - img = misc.imread(image_path)
                    + img = imageio.imread(image_path)
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    AttributeError: module 'imageio.v2' has no attribute 'imresize'    
        Solution :    sed -i "s/imageio.imresize(cropped,/Image.fromarray(cropped).resize(/g"  */*.py  */*/*.py
                    sed -i "s/imageio.imresize(img,/Image.fromarray(img).resize(/g"  */*.py  */*/*.py   */*/*/*.py
                    sed -i "s/import imageio.v2 as imageio/import imageio.v2 as imageio\nfrom PIL import Image/g"  */*.py  */*/*.py   */*/*/*.py
                    sed -i "s/interp='bilinear'/Image.BILINEAR/g"  */*.py  */*/*.py
        +--------------------------------------------------solve compile error ----------------------------------------------+
#        sudo apt install libjpeg-dev  libtiff-dev
#        sudo pip install imageio iio  # image I/O
        
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    div (from tensorflow.python.ops.math_ops) is deprecated and will be removed
        Solution :    https://docs.w3cub.com/tensorflow~1.15/div.html
                    1.  vim src/align/detect_face.py +214
                        - tf.div
                        + tf.compat.v1.div
                
    11.2 Validate on the LFW dataset
            python src/validate_on_lfw.py \
            ~/datasets/lfw/lfw_mtcnnpy_160 \
            ~/models/facenet/20180402-114759 \
            --distance_metric 1 \
            --use_flipped_images \
            --subtract_mean \
            --use_fixed_image_standardization        
        +--------------------------------------------------solve compile error ----------------------------------------------+
        ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions.
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    WARNING:tensorflow:From /home/rd/NN/facenet/src/facenet.py:112: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
                    Instructions for updating:
                    tf.py_func is deprecated in TF V2. Instead, there are two
                        options available in V2.
                        - tf.py_function takes a python function which manipulates tf eager
                        tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
                        an ndarray (just call tensor.numpy()) but having access to eager tensors
                        means `tf.py_function`s can use accelerators such as GPUs as well as
                        being differentiable using a gradient tape.
                        - tf.numpy_function maintains the semantics of the deprecated tf.py_func
                        (it is not differentiable, and manipulates numpy arrays). It drops the
                        stateful argument making all functions stateful.
        Solution :     vim src/facenet.py +112
        

        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    WARNING:tensorflow:From /home/rd/NN/facenet/src/facenet.py:131: batch_join (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
                    Instructions for updating:
                    Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.interleave(...).batch(batch_size)` (or `padded_batch(...)` if `dynamic_pad=True`).
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    WARNING:tensorflow:From /home/rd/.local/lib/python3.8/site-packages/tensorflow/python/training/input.py:738: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
                    Instructions for updating:
                    To construct input pipelines, use the `tf.data` module.
                    WARNING:tensorflow:From /home/rd/.local/lib/python3.8/site-packages/tensorflow/python/training/input.py:738: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
                    Instructions for updating:
                    To construct input pipelines, use the `tf.data` module.
                    
        sudo pip install tf-slim
        sed -i "s/import tensorflow.contrib.slim as slim/import tf_slim as slim/g" */*.py  */*/*.py
        
    11.3 Compare images
            python3  src/compare.py ~/NN/facenet-pre-trained-model/20180408-102900/20180408-102900.pb \
                                    ~/datasets/lfw/Tony_Liu/Tony_Liu_0001.jpg \
                                    ~/datasets/lfw/lfw_mtcnnpy_160/Tony_Blair/Tony_Blair_0001.png \
                                    ~/opencv/study/video/Face-Samples/stark-face/stark-6.png \
                                    ~/opencv/study/video/Face-Samples/stark-face/stark-2.png \
                                    ~/opencv/study/video/Face-Samples/tony-face/tony-2.png \
                                    ~/opencv/study/video/Face-Samples/tony-face/tony-5.png

        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    raise ValueError("Expect x to not have duplicates")
                    ValueError: Expect x to not have duplicates
            ERROR: Could not find a version that satisfies the requirement numpy<1.23.0,>=1.16.5 (from scipy==1.7.1) (from versions: none)
            ERROR: No matching distribution found for numpy<1.23.0,>=1.16.5 (from scipy==1.7.1)

        cause    : a problem in scipy's interpolate.interp1d() function
        requirement : numpy<1.23.0,>=1.16.5 (from scipy==1.7.1)
        solution : downgrade scipy. I had scipy 1.10.1 and switched to 1.7.1, which works.
                    pip uninstall numpy scipy
                    pip install scipy==1.7.1  
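
The pip error above quotes the NumPy range that scipy 1.7.1 accepts (numpy<1.23.0,>=1.16.5). A minimal sketch for checking a candidate NumPy version against that range before reinstalling is shown below. The function names are hypothetical, and the parser only handles plain dotted numeric versions such as 1.22.4.

```python
def parse_version(text):
    """Turn '1.22.4' into (1, 22, 4); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def numpy_ok_for_scipy_171(numpy_version):
    """Check the range quoted in the pip error: numpy>=1.16.5,<1.23.0."""
    low = parse_version("1.16.5")
    high = parse_version("1.23.0")
    return low <= parse_version(numpy_version) < high
```

The currently installed version can be read with importlib.metadata.version("numpy") and passed to numpy_ok_for_scipy_171.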
                    
        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    ValueError: Node 'gradients/InceptionResnetV1/Bottleneck/BatchNorm/cond/FusedBatchNorm_1_grad/FusedBatchNormGrad' has an _output_shapes attribute inconsistent with the GraphDef for output #3: Dimension 0 in both shapes must be equal, but are 0 and 512. Shapes are [0] and [512]
        cause    :    refer to https://github.com/davidsandberg/facenet/issues/1227 / https://github.com/openvinotoolkit/openvino/pull/11078
        Solution :    For those who undergo this problem. I would suggest following actions:
                    Add the directive "import tensorflow.compat.v1 as tf" to the corresponding .py files.
                    Use the arguments to specify the model file and pair.txt with absolute full path, as following,

                python3 FaceNet/src/validate_on_lfw.py ../Inventory/Aligned /Users/xxxx/Projects/Inventory/Models/20180402-114759.pb --distance_metric 1 --use_flipped_images --subtract_mean --use_fixed_image_standardization --lfw_pairs /Users/xxxx/Projects//FaceNet/data/pairs.txt

        +--------------------------------------------------solve compile error ----------------------------------------------+
        problem  :    2023-09-13 17:14:46.844967: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
                    Traceback (most recent call last):
                      File "src/compare.py", line 131, in <module>
                        main(parse_arguments(sys.argv[1:]))
                      File "src/compare.py", line 42, in main
                        images = load_and_align_data(args.image_files, args.image_size, args.margin, args.gpu_memory_fraction)
                      File "src/compare.py", line 111, in load_and_align_data
                        prewhitened = facenet.prewhiten(aligned)
                      File "/home/rd/NN/facenet/src/facenet.py", line 220, in prewhiten
                        y = np.multiply(np.subtract(x, mean), 1/std_adj)
                    ValueError: operands could not be broadcast together with shapes (160,160,3) (2,) 
        solution :    vim src/facenet.py +220
                    std_adj = np.maximum(std, 1.0/np.sqrt(np.prod(x.size)))
