Fixing the error ......lib/include/THC/THCGeneral.h:12:18: fatal error: cuda.h: No such file or directory

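Contents

  • 1. Problem description
    • 1.1 Environment: set up exactly as required
    • 1.2 How the build failed
      • 1.2.1 Changes needed before running make.sh
      • 1.2.2 Error output
  • 2. Solution
    • 2.1 My solution
    • 2.2 Summary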

1. Problem description

While building the open-source code for the paper Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector (https://github.com/fanq15/FSOD-code), I ran into this error:
......lib/include/THC/THCGeneral.h:12:18: fatal error: cuda.h: No such file or directory

1.1 Environment: set up exactly as required

pytorch0.4.1
torchvision>=0.2.0
cython
matplotlib
numpy
scipy
opencv
pyyaml 3.12
packaging
pandas
pycocotools
CUDA=9.0

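  1. I work on a workstation shared by my lab, so I could not install CUDA 9.0 under /usr/local/ (that could interfere with others who rely on a specific CUDA version). I installed it under my /home directory instead:
    Usual CUDA install location: /usr/local/cuda-9.0/
    My CUDA install location: /home/hongze/cudas/cuda-9.0/
  2. After installing several CUDA versions, always check which nvcc version is active: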
(base) hongze@lab-PowerEdge-T630 ~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
  3. If the version is not the one you need, change it in .bashrc. For example, I have two CUDA versions installed:
(base) hongze@lab-PowerEdge-T630 ~/cudas$ ls
cuda-10.1  cuda9  include  lib64  NVIDIA_CUDA-10.1_Samples  NVIDIA_CUDA-9.0_Samples  src
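  4. I choose the active CUDA version by editing these three lines in .bashrc. (I do not use a symlink; the lines point directly at the real directory. If you use a symlink, repoint the symlink instead: see this blog post.)

# The three key lines of my .bashrc before the change:
# Change cuda-9.0 to cuda-10.0, source .bashrc, close and reopen the terminal, then check the nvcc version again; it now reports 10.0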

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/hongze/cudas/cuda-9.0/lib64
export PATH=$PATH:/home/hongze/cudas/cuda-9.0/bin
export CUDA_HOME=/home/hongze/cudas/cuda-9.0  # CUDA_HOME is a single directory, not a colon-separated list
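
The switch described above can be sketched as a small shell function. This is only an illustration: the name use_cuda and its argument are hypothetical and not part of my original setup. It assumes the toolkit directory contains bin/ and lib64/, prepends (rather than appends) so the chosen toolkit takes precedence over any system-wide one, and sets CUDA_HOME to a single directory.

```shell
# use_cuda DIR: point the current shell at the CUDA toolkit installed in DIR.
# Hypothetical helper; DIR is assumed to contain bin/ and lib64/.
use_cuda() {
    cuda_root=$1
    if [ ! -d "$cuda_root" ]; then
        echo "use_cuda: no such directory: $cuda_root" >&2
        return 1
    fi
    # CUDA_HOME holds exactly one toolkit root.
    export CUDA_HOME="$cuda_root"
    # Prepend so this toolkit's nvcc and libraries are found first.
    export PATH="$cuda_root/bin${PATH:+:$PATH}"
    export LD_LIBRARY_PATH="$cuda_root/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example: use_cuda /home/hongze/cudas/cuda-9.0 && nvcc -V
```

Calling the function again with another directory puts that toolkit in front, so running nvcc -V afterwards is still the quickest way to confirm which version is active.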

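1.2 How the build failed

1.2.1 Changes needed before running make.sh

  • Before running make.sh, check that CUDA_ARCH matches your GPU; see this blog post or NVIDIA's official documentation for details.
  • If your CUDA symlink is broken, or you do not use a symlink (as in my case), you need to change CUDA_PATH.
# Contents of make.sh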
#!/usr/bin/env bash

#CUDA_PATH=/usr/local/cuda/
CUDA_PATH=/home/hongze/cudas/cuda9  # I changed CUDA_PATH to my install location

export CXXFLAGS="-std=c++11"
export CFLAGS="-std=c99"

python3 setup.py build_ext --inplace
rm -rf build

# Choose cuda arch as you need
CUDA_ARCH="-gencode arch=compute_30,code=sm_30 \
           -gencode arch=compute_35,code=sm_35 \
           -gencode arch=compute_50,code=sm_50 \
           -gencode arch=compute_52,code=sm_52 \
           -gencode arch=compute_60,code=sm_60 \
           -gencode arch=compute_61,code=sm_61 \
           -gencode arch=compute_70,code=sm_70 "  # my GPU is a TITAN Xp

# compile NMS
cd model/nms/src
echo "Compiling nms kernels by nvcc..."
nvcc -c -o nms_cuda_kernel.cu.o nms_cuda_kernel.cu \
	 -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH

cd ../
python3 build.py

# compile roi_pooling
cd ../../
cd model/roi_pooling/src
echo "Compiling roi pooling kernels by nvcc..."
nvcc -c -o roi_pooling.cu.o roi_pooling_kernel.cu \
	 -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
cd ../
python3 build.py

# # compile roi_align
# cd ../../
# cd model/roi_align/src
# echo "Compiling roi align kernels by nvcc..."
# nvcc -c -o roi_align_kernel.cu.o roi_align_kernel.cu \
# 	 -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
# cd ../
# python3 build.py

# compile roi_crop
cd ../../
cd model/roi_crop/src
echo "Compiling roi crop kernels by nvcc..."
nvcc -c -o roi_crop_cuda_kernel.cu.o roi_crop_cuda_kernel.cu \
	 -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
cd ../
python3 build.py

# compile roi_align (based on Caffe2's implementation)
cd ../../
cd modeling/roi_xfrom/roi_align/src
echo "Compiling roi align kernels by nvcc..."
nvcc -c -o roi_align_kernel.cu.o roi_align_kernel.cu \
	 -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
cd ../
python3 build.py
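
The CUDA_ARCH list in make.sh targets every architecture from sm_30 to sm_70. If you only build for one card, the list can be shortened. The sketch below uses a hypothetical helper, gencode_flags (not part of the repository), to build the -gencode flags from compute capabilities written without the dot. The TITAN Xp is a Pascal card with compute capability 6.1, so 61 is used here; look up the value for your own GPU.

```shell
# gencode_flags CAP...: print one -gencode flag per compute capability.
# Capabilities are written without the dot, e.g. 61 for compute capability 6.1.
# Hypothetical helper, shown only to illustrate the flag format.
gencode_flags() {
    flags=""
    for cap in "$@"; do
        flags="$flags -gencode arch=compute_${cap},code=sm_${cap}"
    done
    # Strip the leading space before printing.
    echo "${flags# }"
}

# TITAN Xp: compute capability 6.1.
CUDA_ARCH=$(gencode_flags 61)
echo "$CUDA_ARCH"
```

The resulting string can replace the multi-line CUDA_ARCH assignment; the nvcc commands in make.sh already expand $CUDA_ARCH unquoted, so the flags are split into separate arguments as before.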

1.2.2 Error output

The message fatal error: cuda.h: No such file or directory appears four times, once for each of the four parts being compiled: NMS, roi_pooling, roi_crop, and roi_align.

# Error output from running sh make.sh
(FSOD) hongze@lab-PowerEdge-T630 ~/Documents/FSOD-code/lib$ sh make.sh
running build_ext
building 'utils.cython_bbox' extension
creating build
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/utils
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/numpy/core/include -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c utils/cython_bbox.c -o build/temp.linux-x86_64-3.6/utils/cython_bbox.o -Wno-cpp
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/utils
c99 build/temp.linux-x86_64-3.6/utils/cython_bbox.o -L/home/hongze/anaconda3/envs/FSOD/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/utils/cython_bbox.cpython-36m-x86_64-linux-gnu.so
building 'utils.cython_nms' extension
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/numpy/core/include -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c utils/cython_nms.c -o build/temp.linux-x86_64-3.6/utils/cython_nms.o -Wno-cpp
gcc -pthread -shared -L/home/hongze/anaconda3/envs/FSOD/lib -Wl,-rpath=/home/hongze/anaconda3/envs/FSOD/lib,--no-as-needed -std=c99 build/temp.linux-x86_64-3.6/utils/cython_nms.o -L/home/hongze/anaconda3/envs/FSOD/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/utils/cython_nms.cpython-36m-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.6/utils/cython_bbox.cpython-36m-x86_64-linux-gnu.so -> utils
copying build/lib.linux-x86_64-3.6/utils/cython_nms.cpython-36m-x86_64-linux-gnu.so -> utils
Compiling nms kernels by nvcc...
Including CUDA code.
/home/hongze/Documents/FSOD-code/lib/model/nms
['/home/hongze/Documents/FSOD-code/lib/model/nms/src/nms_cuda_kernel.cu.o']
generating /tmp/tmp5qy32gtx/_nms.c
setting the current directory to '/tmp/tmp5qy32gtx'
running build_ext
building '_nms' extension
creating home
creating home/hongze
creating home/hongze/Documents
creating home/hongze/Documents/FSOD-code
creating home/hongze/Documents/FSOD-code/lib
creating home/hongze/Documents/FSOD-code/lib/model
creating home/hongze/Documents/FSOD-code/lib/model/nms
creating home/hongze/Documents/FSOD-code/lib/model/nms/src
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -DWITH_CUDA -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c _nms.c -o ./_nms.o -std=c99
In file included from /home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THC.h:4:0,
                 from _nms.c:570:
/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THCGeneral.h:12:18: fatal error: cuda.h: No such file or directory
compilation terminated.
Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
    extra_postargs)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
    spawn(cmd, dry_run=self.dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 36, in spawn
    _spawn_posix(cmd, search_path, dry_run=dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
    % (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 51, in _build
    dist.run_command('build_ext')
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 574, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
    raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "build.py", line 37, in <module>
    ffi.build()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 189, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 111, in _build_extension
    outfile = ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/api.py", line 727, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/recompiler.py", line 1555, in recompile
    compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 22, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 58, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1
Compiling roi pooling kernels by nvcc...
Including CUDA code.
/home/hongze/Documents/FSOD-code/lib/model/roi_pooling
generating /tmp/tmpuhxsnvnf/_roi_pooling.c
setting the current directory to '/tmp/tmpuhxsnvnf'
running build_ext
building '_roi_pooling' extension
creating home
creating home/hongze
creating home/hongze/Documents
creating home/hongze/Documents/FSOD-code
creating home/hongze/Documents/FSOD-code/lib
creating home/hongze/Documents/FSOD-code/lib/model
creating home/hongze/Documents/FSOD-code/lib/model/roi_pooling
creating home/hongze/Documents/FSOD-code/lib/model/roi_pooling/src
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -DWITH_CUDA -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c _roi_pooling.c -o ./_roi_pooling.o -std=c99
In file included from /home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THC.h:4:0,
                 from _roi_pooling.c:570:
/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THCGeneral.h:12:18: fatal error: cuda.h: No such file or directory
compilation terminated.
Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
    extra_postargs)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
    spawn(cmd, dry_run=self.dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 36, in spawn
    _spawn_posix(cmd, search_path, dry_run=dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
    % (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 51, in _build
    dist.run_command('build_ext')
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 574, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
    raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "build.py", line 35, in <module>
    ffi.build()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 189, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 111, in _build_extension
    outfile = ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/api.py", line 727, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/recompiler.py", line 1555, in recompile
    compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 22, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 58, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1
Compiling roi crop kernels by nvcc...
Including CUDA code.
/home/hongze/Documents/FSOD-code/lib/model/roi_crop
generating /tmp/tmp3j3suxnv/_roi_crop.c
setting the current directory to '/tmp/tmp3j3suxnv'
running build_ext
building '_roi_crop' extension
creating home
creating home/hongze
creating home/hongze/Documents
creating home/hongze/Documents/FSOD-code
creating home/hongze/Documents/FSOD-code/lib
creating home/hongze/Documents/FSOD-code/lib/model
creating home/hongze/Documents/FSOD-code/lib/model/roi_crop
creating home/hongze/Documents/FSOD-code/lib/model/roi_crop/src
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -DWITH_CUDA -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c _roi_crop.c -o ./_roi_crop.o -std=c99
In file included from /home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THC.h:4:0,
                 from _roi_crop.c:570:
/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THCGeneral.h:12:18: fatal error: cuda.h: No such file or directory
compilation terminated.
Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
    extra_postargs)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
    spawn(cmd, dry_run=self.dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 36, in spawn
    _spawn_posix(cmd, search_path, dry_run=dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
    % (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 51, in _build
    dist.run_command('build_ext')
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 574, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
    raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "build.py", line 36, in <module>
    ffi.build()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 189, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 111, in _build_extension
    outfile = ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/api.py", line 727, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/recompiler.py", line 1555, in recompile
    compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 22, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 58, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1
Compiling roi align kernels by nvcc...
Including CUDA code.
/home/hongze/Documents/FSOD-code/lib/modeling/roi_xfrom/roi_align
generating /tmp/tmpfskoao8a/_roi_align.c
setting the current directory to '/tmp/tmpfskoao8a'
running build_ext
building '_roi_align' extension
creating home
creating home/hongze
creating home/hongze/Documents
creating home/hongze/Documents/FSOD-code
creating home/hongze/Documents/FSOD-code/lib
creating home/hongze/Documents/FSOD-code/lib/modeling
creating home/hongze/Documents/FSOD-code/lib/modeling/roi_xfrom
creating home/hongze/Documents/FSOD-code/lib/modeling/roi_xfrom/roi_align
creating home/hongze/Documents/FSOD-code/lib/modeling/roi_xfrom/roi_align/src
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -DWITH_CUDA -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c _roi_align.c -o ./_roi_align.o -std=c99
In file included from /home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THC.h:4:0,
                 from _roi_align.c:570:
/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THCGeneral.h:12:18: fatal error: cuda.h: No such file or directory
compilation terminated.
Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
    extra_postargs)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
    spawn(cmd, dry_run=self.dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 36, in spawn
    _spawn_posix(cmd, search_path, dry_run=dry_run)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
    % (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 51, in _build
    dist.run_command('build_ext')
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/ccompiler.py", line 574, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
    raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "build.py", line 36, in <module>
    ffi.build()
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 189, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 111, in _build_extension
    outfile = ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/api.py", line 727, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/recompiler.py", line 1555, in recompile
    compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 22, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/cffi/ffiplatform.py", line 58, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1

2. Solution

Two things need to change: the contents of make.sh and the command used to run it.

2.1 My solution

I followed this answer (see the original source):

Hi all, if you encounter this error on compling fatal error: cuda.h: No such file or directory
Try this:
CPATH=/path/to/you/cuda/include ./make.sh

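So I changed the build command from sh make.sh to CPATH=/home/hongze/cudas/cuda9/include/ ./make.sh, and the build succeeded. GCC searches the directories listed in CPATH for headers as if they had been passed with -I, which is how the compiler now finds cuda.h; the failing gcc commands in the log above have no -I option for the CUDA include directory.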
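
Before running the build, it can help to confirm that the directory you pass in CPATH really contains cuda.h. The check below is a minimal sketch; the function name has_cuda_header is hypothetical, and the path in the example comment is the one from my machine.

```shell
# has_cuda_header DIR: succeed if DIR contains cuda.h, fail otherwise.
# Hypothetical helper for checking a directory before using it as CPATH.
has_cuda_header() {
    if [ -f "$1/cuda.h" ]; then
        echo "found $1/cuda.h"
    else
        echo "cuda.h not found in $1" >&2
        return 1
    fi
}

# Example (path from my machine):
#   has_cuda_header /home/hongze/cudas/cuda9/include \
#       && CPATH=/home/hongze/cudas/cuda9/include ./make.sh
```

Running ./make.sh directly assumes the script is executable (chmod +x make.sh); setting CPATH in front of sh make.sh should pass the variable to the build in the same way.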

# Output of the successful build
(FSOD) hongze@lab-PowerEdge-T630 ~/Documents/FSOD-code/lib$ CPATH=/home/hongze/cudas/cuda9/include/ ./make.sh
running build_ext
building 'utils.cython_bbox' extension
creating build
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/utils
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/numpy/core/include -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c utils/cython_bbox.c -o build/temp.linux-x86_64-3.6/utils/cython_bbox.o -Wno-cpp
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/utils
gcc -pthread -shared -L/home/hongze/anaconda3/envs/FSOD/lib -Wl,-rpath=/home/hongze/anaconda3/envs/FSOD/lib,--no-as-needed build/temp.linux-x86_64-3.6/utils/cython_bbox.o -L/home/hongze/anaconda3/envs/FSOD/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/utils/cython_bbox.cpython-36m-x86_64-linux-gnu.so
building 'utils.cython_nms' extension
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/numpy/core/include -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c utils/cython_nms.c -o build/temp.linux-x86_64-3.6/utils/cython_nms.o -Wno-cpp
utils/cython_nms.c: In function ‘__pyx_pf_5utils_10cython_nms_2soft_nms’:
utils/cython_nms.c:3399:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       __pyx_t_10 = ((__pyx_v_pos < __pyx_v_N) != 0);
                                  ^
utils/cython_nms.c:3910:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       __pyx_t_10 = ((__pyx_v_pos < __pyx_v_N) != 0);
                                  ^
gcc -pthread -shared -L/home/hongze/anaconda3/envs/FSOD/lib -Wl,-rpath=/home/hongze/anaconda3/envs/FSOD/lib,--no-as-needed build/temp.linux-x86_64-3.6/utils/cython_nms.o -L/home/hongze/anaconda3/envs/FSOD/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/utils/cython_nms.cpython-36m-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.6/utils/cython_bbox.cpython-36m-x86_64-linux-gnu.so -> utils
copying build/lib.linux-x86_64-3.6/utils/cython_nms.cpython-36m-x86_64-linux-gnu.so -> utils
running build_ext
building 'utils.cython_bbox' extension
creating build
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/utils
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/numpy/core/include -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c utils/cython_bbox.c -o build/temp.linux-x86_64-3.6/utils/cython_bbox.o -Wno-cpp
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/utils
gcc -pthread -shared -L/home/hongze/anaconda3/envs/FSOD/lib -Wl,-rpath=/home/hongze/anaconda3/envs/FSOD/lib,--no-as-needed -std=c99 build/temp.linux-x86_64-3.6/utils/cython_bbox.o -L/home/hongze/anaconda3/envs/FSOD/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/utils/cython_bbox.cpython-36m-x86_64-linux-gnu.so
building 'utils.cython_nms' extension
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/numpy/core/include -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c utils/cython_nms.c -o build/temp.linux-x86_64-3.6/utils/cython_nms.o -Wno-cpp
gcc -pthread -shared -L/home/hongze/anaconda3/envs/FSOD/lib -Wl,-rpath=/home/hongze/anaconda3/envs/FSOD/lib,--no-as-needed -std=c99 build/temp.linux-x86_64-3.6/utils/cython_nms.o -L/home/hongze/anaconda3/envs/FSOD/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/utils/cython_nms.cpython-36m-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.6/utils/cython_bbox.cpython-36m-x86_64-linux-gnu.so -> utils
copying build/lib.linux-x86_64-3.6/utils/cython_nms.cpython-36m-x86_64-linux-gnu.so -> utils
Compiling nms kernels by nvcc...
Including CUDA code.
/home/hongze/Documents/FSOD-code/lib/model/nms
['/home/hongze/Documents/FSOD-code/lib/model/nms/src/nms_cuda_kernel.cu.o']
generating /tmp/tmpw5rm8jqh/_nms.c
setting the current directory to '/tmp/tmpw5rm8jqh'
running build_ext
building '_nms' extension
creating home
creating home/hongze
creating home/hongze/Documents
creating home/hongze/Documents/FSOD-code
creating home/hongze/Documents/FSOD-code/lib
creating home/hongze/Documents/FSOD-code/lib/model
creating home/hongze/Documents/FSOD-code/lib/model/nms
creating home/hongze/Documents/FSOD-code/lib/model/nms/src
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -DWITH_CUDA -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c _nms.c -o ./_nms.o -std=c99
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -fPIC -DWITH_CUDA -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hongze/anaconda3/envs/FSOD/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/home/hongze/anaconda3/envs/FSOD/include/python3.6m -c /home/hongze/Documents/FSOD-code/lib/model/nms/src/nms_cuda.c -o ./home/hongze/Documents/FSOD-code/lib/model/nms/src/nms_cuda.o -std=c99
gcc -pthread -shared -L/home/hongze/anaconda3/envs/FSOD/lib -Wl,-rpath=/home/hongze/anaconda3/envs/FSOD/lib,--no-as-needed -std=c99 ./_nms.o ./home/hongze/Documents/FSOD-code/lib/model/nms/src/nms_cuda.o /home/hongze/Documents/FSOD-code/lib/model/nms/src/nms_cuda_kernel.cu.o -L/home/hongze/anaconda3/envs/FSOD/lib -lpython3.6m -o ./_nms.so

☆ After fixing the missing cuda.h, I hit a second problem: nms_cuda.c: No such file or directory

gcc: error: /home/hongze/Documents/FSOD-code/lib/model/nms/src/nms_cuda.c: No such file or directory

The fix is to add the file to FSOD-code/lib/model/nms/src/ yourself (click to view the contents of nms_cuda.c):

This aims to fix issue #17. The compling error is actually caused by missing the file nms_cuda.c. I borrow the one in Detectron.pytorch.

  • Force the use of nvcc from a specific location: https://github.com/jwyang/faster-rcnn.pytorch/issues/55
  • Add these three lines at the top of make.sh:
export PATH=/home/hongze/cudas/cuda-9.0/bin${PATH:+:${PATH}}
export CPATH=/home/hongze/cudas/cuda-9.0/include${CPATH:+:${CPATH}}
export LD_LIBRARY_PATH=/home/hongze/cudas/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
  • Create symlinks to cuda.h yourself:

For me, I don’t have root access. I can’t make such symlinks to the system. So I just symlinked to the problem dir and it works now. Not recommending this, just for those in a similar situation to me with whom no environmental variable setting has helped.

ln -s ~/cuda/include/* ~/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/
  • Someone reinstalled PyTorch 0.4.0, which fixed the problem:

Hi @huinsysu, do you solve the issue, I met the same error as yours. And I reinstall the pytorch0.4.0, then solve the issue.

  • Someone set the CUDA_ARCH value explicitly in make.sh:

Hope this is not too late. Actually all the solutions mentioned above does not work for me. I solved this by modifying the file ‘~/lib/make.sh’. In the file, please comment the CUDA_ARCH line and changed the ‘$CUDA_ARCH’ in each compilation into ‘-arch=arch’
Taking the nms package as an example, the previous command (line 20)is:
nvcc -c -o nms_cuda_kernel.cu.o nms_cuda_kernel.cu
-D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
Now, we should change it as:
nvcc -c -o nms_cuda_kernel.cu.o nms_cuda_kernel.cu
-D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52
Then conduct similar operations for other packages.
It should be noted that ‘arch’ is determined by your GPU type.-arch=sm_52’ works for TitanX. You could look at the Compilation Section in this Page or the documents of the NVCC for information of your GPU.
I am still a bit confused about the problem. I guess it results from the CUDA version.

  • For more solutions, see:
    the pointers from the paper's author, which cover four issue pages (click to view):

You can reference here to find solutions: detectron-tw, stan-dev470, stan-dev622, stan-dev561

2.2 Summary

I know that this fix works, but I do not yet fully understand why.
