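While testing Chinese-Vicuna, running the following command fails with an error:
bash scripts/finetune.sh
Solution:
Upgrade bitsandbytes to version 0.38.0.
Cause: the CUDA version on my machine is 12.1 (the log reports "Detected CUDA version 121"), while bitsandbytes 0.37.0 only ships prebuilt libraries up to CUDA 12.0.
After changing the version, reinstall the dependencies: pip install -r requirements.txt
Then run the script again:
bash scripts/finetune.sh
The error is resolved.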
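To confirm the cause before or after the upgrade, it can help to list which prebuilt CUDA libraries the installed bitsandbytes package contains. The following is a minimal sketch, not part of Chinese-Vicuna or bitsandbytes: the helper names `shipped_cuda_libs` and `has_lib_for` are hypothetical, and the sketch assumes the prebuilt `libbitsandbytes_cuda*.so` files sit directly in the package directory, following the file name shown in the log (`libbitsandbytes_cuda121.so`).

```python
import importlib.util
from pathlib import Path


def shipped_cuda_libs(package_dir):
    """Return the sorted names of prebuilt CUDA libraries found in package_dir."""
    # Assumption: the libraries are stored directly inside the package directory.
    return sorted(p.name for p in Path(package_dir).glob("libbitsandbytes_cuda*.so"))


def has_lib_for(package_dir, cuda_version):
    """Check for one CUDA build; cuda_version uses the log's form, e.g. "121" for CUDA 12.1."""
    return f"libbitsandbytes_cuda{cuda_version}.so" in shipped_cuda_libs(package_dir)


if __name__ == "__main__":
    # find_spec locates the package on disk without importing it,
    # which matters here because `import bitsandbytes` itself raises.
    spec = importlib.util.find_spec("bitsandbytes")
    if spec is None or not spec.submodule_search_locations:
        print("bitsandbytes is not installed in this environment")
    else:
        pkg_dir = list(spec.submodule_search_locations)[0]
        print(shipped_cuda_libs(pkg_dir))
        print("CUDA 12.1 binary present:", has_lib_for(pkg_dir, "121"))
```

If the last line prints False, the installed release has no binary for CUDA 12.1, which matches the "Required library version not found" message in the log.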
The error output is as follows:
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/gaosong/anaconda3/envs/vicuna8 did not contain libcudart.so as expected! Searching further paths...
warn(msg)
/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/tmp/torchelastic_xzjl0mxr/none__l403kgq/attempt_0/1/error.json')}
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 121
CUDA SETUP: Required library version not found: libbitsandbytes_cuda121.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:
1. CUDA driver not installed
2. CUDA not installed
3. You have multiple conflicting CUDA libraries
4. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.
================================================================================
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone git@github.com:TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121
python setup.py install
/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/gaosong/anaconda3/envs/vicuna8 did not contain libcudart.so as expected! Searching further paths...
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 121
CUDA SETUP: Required library version not found: libbitsandbytes_cuda121.so. Maybe you need to compile it from source?
/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/tmp/torchelastic_xzjl0mxr/none__l403kgq/attempt_0/0/error.json')}
warn(msg)
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:
1. CUDA driver not installed
2. CUDA not installed
3. You have multiple conflicting CUDA libraries
4. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.
================================================================================
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone git@github.com:TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121
python setup.py install
CUDA SETUP: Setup Failed!
Traceback (most recent call last):
File "finetune.py", line 6, in <module>
import bitsandbytes as bnb
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/__init__.py", line 7, in <module>
from .autograd._functions import (
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/autograd/__init__.py", line 1, in <module>
from ._functions import undo_layout, get_inverse_transform_indices
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 9, in <module>
import bitsandbytes.functional as F
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/functional.py", line 17, in <module>
from .cextension import COMPILED_WITH_CUDA, lib
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/bitsandbytes/cextension.py", line 22, in <module>
raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Inspect the CUDA SETUP outputs above to fix your environment!
If you cannot find any issues and suspect a bug, please open an issue with detals about your environment:
https://github.com/TimDettmers/bitsandbytes/issues
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 372142) of binary: /home/gaosong/anaconda3/envs/vicuna8/bin/python
Traceback (most recent call last):
File "/home/gaosong/anaconda3/envs/vicuna8/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/run.py", line 762, in main
run(args)
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/gaosong/anaconda3/envs/vicuna8/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
finetune.py FAILED
------------------------------------------------------------
Failures:
[1]:
time : 2023-06-08_15:12:16
host : server
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 372143)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-06-08_15:12:16
host : server
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 372142)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
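
The two lines in this output that identify the problem are "Detected CUDA version 121" and "Required library version not found: libbitsandbytes_cuda121.so". As a rough sketch, assuming the training output has been saved as text, they can be pulled out of a long log like this. The function name `summarize` is hypothetical, and the patterns are written only against the messages quoted above; other bitsandbytes releases may word them differently.

```python
import re

# Patterns based on the two messages that appear in the log above.
DETECTED = re.compile(r"CUDA SETUP: Detected CUDA version (\d+)")
MISSING = re.compile(r"Required library version not found: (libbitsandbytes_cuda\w+\.so)")


def summarize(log_text):
    """Return (detected_cuda_version, missing_library); either value may be None."""
    detected = DETECTED.search(log_text)
    missing = MISSING.search(log_text)
    return (
        detected.group(1) if detected else None,
        missing.group(1) if missing else None,
    )


if __name__ == "__main__":
    sample = (
        "CUDA SETUP: Detected CUDA version 121\n"
        "CUDA SETUP: Required library version not found: "
        "libbitsandbytes_cuda121.so. Maybe you need to compile it from source?\n"
    )
    print(summarize(sample))
```

When both values are present and the library name carries the detected version, the installed bitsandbytes release has no prebuilt binary for that CUDA version, which is the situation the upgrade to 0.38.0 addresses.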