NameError: name 'CUDA_RUNTIME_LIB' is not defined


WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
Traceback (most recent call last):
  File "finetune.py", line 6, in <module>
    import bitsandbytes as bnb
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/__init__.py", line 7, in <module>
    from .autograd._functions import (
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/autograd/__init__.py", line 1, in <module>
    from ._functions import undo_layout, get_inverse_transform_indices
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 9, in <module>
    import bitsandbytes.functional as F
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/functional.py", line 17, in <module>
    from .cextension import COMPILED_WITH_CUDA, lib
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cextension.py", line 13, in <module>
    setup.run_cuda_setup()
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py", line 101, in run_cuda_setup
    binary_name, cudart_path, cuda, cc, cuda_version_string = evaluate_cuda_setup()
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py", line 382, in evaluate_cuda_setup
    cudart_path = determine_cuda_runtime_lib_path()
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py", line 247, in determine_cuda_runtime_lib_path
    CUDASetup.get_instance().add_log_entry(f'{candidate_env_vars["CONDA_PREFIX"]} did not contain '
NameError: name 'CUDA_RUNTIME_LIB' is not defined
Traceback (most recent call last):
  File "finetune.py", line 6, in <module>
    import bitsandbytes as bnb
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/__init__.py", line 7, in <module>
    from .autograd._functions import (
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/autograd/__init__.py", line 1, in <module>
    from ._functions import undo_layout, get_inverse_transform_indices
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 9, in <module>
    import bitsandbytes.functional as F
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/functional.py", line 17, in <module>
    from .cextension import COMPILED_WITH_CUDA, lib
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cextension.py", line 13, in <module>
    setup.run_cuda_setup()
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py", line 101, in run_cuda_setup
    binary_name, cudart_path, cuda, cc, cuda_version_string = evaluate_cuda_setup()
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py", line 382, in evaluate_cuda_setup
    cudart_path = determine_cuda_runtime_lib_path()
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py", line 247, in determine_cuda_runtime_lib_path
    CUDASetup.get_instance().add_log_entry(f'{candidate_env_vars["CONDA_PREFIX"]} did not contain '
NameError: name 'CUDA_RUNTIME_LIB' is not defined
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 372341) of binary: /home/gaosong/anaconda3/envs/vicuna8/bin/python
Traceback (most recent call last):
  File "/home/gaosong/anaconda3/envs/vicuna8/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/run.py", line 762, in main
    run(args)
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/run.py", line 753, in run
    elastic_launch(
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
finetune.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2023-06-08_15:31:06
  host      : server
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 372342)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-06-08_15:31:06
  host      : server
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 372341)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

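This error is raised while bitsandbytes searches for the CUDA installation directory. The traceback ends in the log call that reports that $CONDA_PREFIX did not contain the CUDA runtime library.

If the library can be found directly, the error does not occur.

echo $CONDA_PREFIX shows where the environment directory is. Go to its lib directory:

cd $CONDA_PREFIX/lib

Check whether the library exists there:

ls libcudart.so.11.0

In my case ls reported that the file does not exist. I used the name libcudart.so.11.0 because that is the version present in another environment on this machine, and that environment works.

sudo find / -name 'libcudart.so.11.0'

Once the file is found, copy it into the $CONDA_PREFIX/lib directory.

On my machine the command was:

cp /work1/home/gaosong/anaconda3/envs/gpt/lib/libcudart.so.11.0 $CONDA_PREFIX/lib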
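
The same check can be scripted. The following is only a minimal sketch, assuming the library sits directly in the lib directory of the active conda environment; find_cudart is a hypothetical helper name and is not part of bitsandbytes.

```python
import os
from pathlib import Path


def find_cudart(lib_dir):
    """Return the sorted names of libcudart shared objects found directly in lib_dir.

    Hypothetical helper for illustration; it only lists files and loads nothing.
    """
    lib_dir = Path(lib_dir)
    if not lib_dir.is_dir():
        return []
    return sorted(p.name for p in lib_dir.glob("libcudart.so*"))


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix is None:
        print("CONDA_PREFIX is not set; activate the conda environment first")
    else:
        found = find_cudart(Path(prefix) / "lib")
        if found:
            print("found:", ", ".join(found))
        else:
            print(f"no libcudart.so* in {prefix}/lib")
```

An empty result corresponds to the situation described above, where ls libcudart.so.11.0 fails inside $CONDA_PREFIX/lib.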

After that, a different error appeared:

CUDA SETUP: CUDA runtime path found: /home/gaosong/anaconda3/envs/vicuna8/lib/libcudart.so.11.0

CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
libcusparse.so.11: cannot open shared object file: No such file or directory
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone git@github.com:TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=117 make cuda11x 
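
In the log above, libcudart is now found, but loading the bitsandbytes binary fails on libcusparse.so.11. A quick way to see which shared libraries the dynamic loader can resolve is to try opening them with ctypes. This is a rough sketch rather than anything bitsandbytes does itself; can_load is a hypothetical helper, and the library names are simply the ones that appear in the log.

```python
import ctypes


def can_load(lib_name):
    """Try to open a shared library by name; return (ok, error_message).

    Hypothetical helper: ctypes.CDLL raises OSError when the loader
    cannot find or open the library.
    """
    try:
        ctypes.CDLL(lib_name)
    except OSError as exc:
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    # Names taken from the log output; adjust them to your CUDA version.
    for name in ("libcudart.so.11.0", "libcusparse.so.11"):
        ok, err = can_load(name)
        print(f"{name}: {'ok' if ok else 'cannot load: ' + err}")
```

Run it inside the same environment as the training script, since the result depends on that environment's library search path.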

Next I tried this version of the library instead:

cd $CONDA_PREFIX/lib
rm -rf libcudart.so.11.0 
cp /usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudart.so.12 ./
mv libcudart.so.12 libcudart.so.12.0

# After upgrading to 0.38.0, this check raised an error, so I commented it out

# if USE_8bit is True:

#     assert bnb.__version__ >= '0.37.2', "Please downgrade bitsandbytes's version, for example: pip install bitsandbytes==0.37.2"
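
Whatever the reason the check failed here, comparing version strings with >= is lexicographic, so for example '0.100.0' >= '0.37.2' is False. If you would rather keep a version check than comment it out, one option is to compare the numeric parts. The sketch below uses hypothetical helper names (version_tuple, at_least) and only handles simple dotted versions.

```python
import re


def version_tuple(version):
    """Turn '0.38.0' or '0.38.0.post1' into a tuple of its leading numeric parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'post1'
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(installed, minimum):
    """Return True if installed >= minimum, comparing components as integers."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '0.37' equals '0.37.0'
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    print(at_least("0.38.0", "0.37.2"))
    print(at_least("0.100.0", "0.37.2"), "0.100.0" >= "0.37.2")
```

In the script this could replace the string comparison, for example assert at_least(bnb.__version__, '0.37.2').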

       
