因为只有Jetpack3.0包含cuda8.0和cudnn5.1,而Jetpack3.0只能运行在ubuntu14.04上(亲测16.04有问题会导致失败)
保证宿主机和tx2在同一局域网下,按照官方文档和Jetpak的提示执行即可,很简单。
注意事项:
(1)如果在host上安装cuda toolkit出现错误(类似held package等)的时候,换一下ubuntu的更新源(推荐163源)就可以了。
(2)期间如果出现host机提示determining IP …. (卡很久)之类的提示,可以关掉,然后重新来,flash system image那个选择no action,之后就会出现手动配置板子的IP和用户密码的选项。
我编译好的whl文件 —— 百度网盘 提取码:n1sz
可以尝试直接使用我编译好的whl文件(直接跳到 安装whl文件 这一步)
不保证可用,保险起见如果有时间的话最好还是执行下面的步骤。
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer -y
sudo apt-get install zip unzip autoconf automake libtool curl zlib1g-dev maven -y
sudo apt install python3-numpy python3-dev python3-pip python3-wheel
bazel_version=0.5.1
wget https://github.com/bazelbuild/bazel/releases/download/$bazel_version/bazel-$bazel_version-dist.zip
unzip bazel-$bazel_version-dist.zip -d bazel-dist
sudo chmod -R ug+rwx bazel-dist
cd bazel-dist
#编译这一步很耗时,约20~30分钟
./compile.sh
sudo cp output/bazel /usr/local/bin
cd ~
git clone --recursive https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout v1.2.0
#注意是eigen_archive节点
native.new_http_archive(
name = "eigen_archive",
urls = [
"http://mirror.bazel.build/bitbucket.org/eigen/eigen/get/d781c1de9834.tar.gz",
"https://bitbucket.org/eigen/eigen/get/d781c1de9834.tar.gz",
],
sha256 = "a34b208da6ec18fa8da963369e166e4a368612c14d956dd2f9d7072904675d9b",
strip_prefix = "eigen-eigen-d781c1de9834",
build_file = str(Label("//third_party:eigen.BUILD")),
)
export PYTHON_BIN_PATH=$(which python3)
# No Google Cloud Platform support
export TF_NEED_GCP=0
# No Hadoop file system support
export TF_NEED_HDFS=0
# Use CUDA
export TF_NEED_CUDA=1
# Setup gcc ; just use the default
export GCC_HOST_COMPILER_PATH=$(which gcc)
# TF CUDA Version
export TF_CUDA_VERSION=8.0
# CUDA path
export CUDA_TOOLKIT_PATH=/usr/local/cuda
# cuDNN
export TF_CUDNN_VERSION=5.1.10
export CUDNN_INSTALL_PATH=/usr/lib/aarch64-linux-gnu
# CUDA compute capability
export TF_CUDA_COMPUTE_CAPABILITIES=6.2
export CC_OPT_FLAGS=-march=native
export TF_NEED_JEMALLOC=1
export TF_NEED_OPENCL=0
export TF_ENABLE_XLA=1
#这一步有提示的话,直接enter就行,不用理
./configure
#这一步巨巨巨耗时!约3小时
bazel build -c opt --verbose_failures --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
mv /tmp/tensorflow_pkg/tensorflow-1.2.0*-linux_aarch64.whl ~
#有可能提示版本问题,参考错误提示修改
#这步也很耗时,需要耐心等待
pip3 install ~/tensorflow-1.2.0-cp35-cp35m-linux_aarch64.whl
安装这一步会在“Running setup.py bdist_wheel for numpy …”卡住很久,需要一点耐心
#!/usr/bin/env python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
运行:
python3 testtf.py
结果:
nvidia@tegra-ubuntu:~$ python3 testtf.py
2017-12-26 02:30:00.977979: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:879] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2017-12-26 02:30:00.978096: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GP10B
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 3.97GiB
2017-12-26 02:30:00.978144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-12-26 02:30:00.978174: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-12-26 02:30:00.978204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GP10B, pci bus id: 0000:00:00.0)
2017-12-26 02:30:00.978237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:642] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0. Your kernel may not have been built with NUMA support.
2017-12-26 02:30:02.406679: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-12-26 02:30:02.406746: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
2017-12-26 02:30:02.407489: I tensorflow/compiler/xla/service/service.cc:198] XLA service 0x29c9470 executing computations on platform Host. Devices:
2017-12-26 02:30:02.407540: I tensorflow/compiler/xla/service/service.cc:206] StreamExecutor device (0): ,
2017-12-26 02:30:02.408398: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-12-26 02:30:02.408446: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
2017-12-26 02:30:02.409154: I tensorflow/compiler/xla/service/service.cc:198] XLA service 0x2a193b0 executing computations on platform CUDA. Devices:
2017-12-26 02:30:02.409199: I tensorflow/compiler/xla/service/service.cc:206] StreamExecutor device (0): GP10B, Compute Capability 6.2
b'Hello, TensorFlow!'
参考文献:https://docs.opencv.org/3.2.0/d6/d15/tutorial_building_tegra_cuda.html
wget https://github.com/opencv/opencv/archive/3.2.0.zip
unzip 3.2.0.zip
cd opencv-3.2.0/
sudo apt-get install \
cmake \
libglew-dev \
libtiff5-dev \
zlib1g-dev \
libjpeg-dev \
libpng12-dev \
libjasper-dev \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libpostproc-dev \
libswscale-dev \
libeigen3-dev \
libtbb-dev \
libgtk2.0-dev \
pkg-config
mkdir build
cd build
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/usr \
-DBUILD_PNG=OFF \
-DBUILD_TIFF=OFF \
-DBUILD_TBB=OFF \
-DBUILD_JPEG=OFF \
-DBUILD_JASPER=OFF \
-DBUILD_ZLIB=OFF \
-DBUILD_EXAMPLES=ON \
-DBUILD_opencv_java=OFF \
-DBUILD_opencv_python2=ON \
-DBUILD_opencv_python3=ON \
-DBUILD_PYTHON_SUPPORT=ON \
-DENABLE_PRECOMPILED_HEADERS=OFF \
-DWITH_OPENCL=OFF \
-DWITH_OPENMP=OFF \
-DWITH_FFMPEG=ON \
-DWITH_GSTREAMER=OFF \
-DWITH_GSTREAMER_0_10=OFF \
-DWITH_CUDA=ON \
-DWITH_GTK=ON \
-DWITH_VTK=OFF \
-DWITH_TBB=ON \
-DWITH_1394=OFF \
-DWITH_OPENEXR=OFF \
-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-8.0 \
-DCUDA_ARCH_BIN=6.2 \
-DCUDA_ARCH_PTX="" \
-DINSTALL_C_EXAMPLES=ON \
-DINSTALL_TESTS=OFF \
-DOPENCV_TEST_DATA_PATH=../opencv_extra/testdata \
..
#编译需要一个小时左右
make -j4
sudo make install
作者:yanjie
2017.12.26