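The official SBERT code is available on GitHub (UKPLab/sentence-transformers).
Before installing sentence-transformers, make sure the following requirements are met: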
We recommend Python 3.6 or higher, PyTorch 1.6.0 or higher and transformers v4.6.0 or higher. The code does not work with Python 2.7.
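The minimum versions quoted above can be checked with a few lines of Python. The following is a minimal sketch using only the standard library. The helper names `parse_version` and `meets_minimum` are hypothetical, and the parser only understands plain dotted numeric versions such as `1.7.0`; any suffix (for example `+cu110` or `rc1`) is ignored, so it is not a full PEP 440 comparison.

```python
import re
import sys

# Minimum versions quoted in the requirements above.
MINIMUMS = {"python": "3.6", "torch": "1.6.0", "transformers": "4.6.0"}


def parse_version(text):
    """Turn '1.7.0' or '1.7.0+cu110' into a tuple of ints such as (1, 7, 0)."""
    match = re.match(r"\d+(\.\d+)*", text)
    if match is None:
        raise ValueError("not a dotted numeric version: %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that (3, 6) compares equal to (3, 6, 0).
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


python_version = "%d.%d.%d" % sys.version_info[:3]
print("Python OK:", meets_minimum(python_version, MINIMUMS["python"]))
print("torch 1.7.0 OK:", meets_minimum("1.7.0", MINIMUMS["torch"]))
print("transformers 4.7.0 OK:", meets_minimum("4.7.0", MINIMUMS["transformers"]))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "4.10.0" would sort before "4.6.0".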
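1. Installing PyTorch
See the official PyTorch website for installation instructions.
Installation commands for previous PyTorch versions are also listed on the PyTorch website.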
pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
The output of the command looks roughly like this:
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch===1.7.0
Using cached torch-1.7.0-cp36-none-macosx_10_9_x86_64.whl (108.0 MB)
Collecting torchvision===0.8.1
Using cached torchvision-0.8.1-cp36-cp36m-macosx_10_9_x86_64.whl (1.0 MB)
Collecting torchaudio===0.7.0
Downloading torchaudio-0.7.0-cp36-cp36m-macosx_10_9_x86_64.whl (1.4 MB)
|████████████████████████████████| 1.4 MB 32 kB/s
Collecting dataclasses
Using cached dataclasses-0.8-py3-none-any.whl (19 kB)
Requirement already satisfied: typing-extensions in /anaconda/lib/python3.6/site-packages (from torch===1.7.0) (3.7.4.3)
Requirement already satisfied: numpy in /anaconda/lib/python3.6/site-packages (from torch===1.7.0) (1.19.5)
Requirement already satisfied: future in /anaconda/lib/python3.6/site-packages (from torch===1.7.0) (0.17.1)
Requirement already satisfied: pillow>=4.1.1 in /anaconda/lib/python3.6/site-packages (from torchvision===0.8.1) (4.1.1)
Requirement already satisfied: olefile in /anaconda/lib/python3.6/site-packages (from pillow>=4.1.1->torchvision===0.8.1) (0.44)
Installing collected packages: dataclasses, torch, torchvision, torchaudio
Attempting uninstall: torch
Found existing installation: torch 1.3.1
Uninstalling torch-1.3.1:
Successfully uninstalled torch-1.3.1
Successfully installed dataclasses-0.8 torch-1.7.0 torchaudio-0.7.0 torchvision-0.8.1
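Since the log shows an older torch (1.3.1) being replaced, it may be worth confirming that Python now imports the new version. A minimal sketch, assuming it is run with the same interpreter that pip installed into; `torch_sanity_check` is a hypothetical helper name, and it returns None when torch cannot be imported.

```python
def torch_sanity_check():
    """Return (version, sum of a 2x3 tensor of ones), or None without torch."""
    try:
        import torch
    except ImportError:
        return None
    # A tiny tensor operation confirms the library actually works,
    # not just that the package metadata is present.
    ones = torch.ones(2, 3)
    return torch.__version__, float(ones.sum())


result = torch_sanity_check()
if result is None:
    print("torch is not importable from this interpreter")
else:
    print("torch version:", result[0], "| sum of ones:", result[1])
```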
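2. Installing transformers
See the official transformers website for installation instructions, usage, and previous versions.
Running the following command directly: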
pip install transformers
fails with the following error:
Building wheels for collected packages: tokenizers
Building wheel for tokenizers (pyproject.toml) ... error
ERROR: Command errored out with exit status 1:
command: /anaconda/bin/python /anaconda/lib/python3.6/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /var/folders/4r/71h2vznj3ssgkgvt2_38rylm0000gn/T/tmp5fwt5auo
cwd: /private/var/folders/4r/71h2vznj3ssgkgvt2_38rylm0000gn/T/pip-install-63wkdddk/tokenizers_5ce428d4037b4551b1a7deb0d23b22ed
Complete output (51 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-10.7-x86_64-3.6
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers
copying py_src/tokenizers/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/models
copying py_src/tokenizers/models/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/models
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/decoders
copying py_src/tokenizers/decoders/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/decoders
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/normalizers
copying py_src/tokenizers/normalizers/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/normalizers
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/pre_tokenizers
copying py_src/tokenizers/pre_tokenizers/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/pre_tokenizers
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/processors
copying py_src/tokenizers/processors/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/processors
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/trainers
copying py_src/tokenizers/trainers/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/trainers
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
copying py_src/tokenizers/implementations/byte_level_bpe.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
copying py_src/tokenizers/implementations/sentencepiece_unigram.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
copying py_src/tokenizers/implementations/sentencepiece_bpe.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
copying py_src/tokenizers/implementations/base_tokenizer.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
copying py_src/tokenizers/implementations/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
copying py_src/tokenizers/implementations/char_level_bpe.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
copying py_src/tokenizers/implementations/bert_wordpiece.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/implementations
creating build/lib.macosx-10.7-x86_64-3.6/tokenizers/tools
copying py_src/tokenizers/tools/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/tools
copying py_src/tokenizers/tools/visualizer.py -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/tools
copying py_src/tokenizers/__init__.pyi -> build/lib.macosx-10.7-x86_64-3.6/tokenizers
copying py_src/tokenizers/models/__init__.pyi -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/models
copying py_src/tokenizers/decoders/__init__.pyi -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/decoders
copying py_src/tokenizers/normalizers/__init__.pyi -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/normalizers
copying py_src/tokenizers/pre_tokenizers/__init__.pyi -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/pre_tokenizers
copying py_src/tokenizers/processors/__init__.pyi -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/processors
copying py_src/tokenizers/trainers/__init__.pyi -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/trainers
copying py_src/tokenizers/tools/visualizer-styles.css -> build/lib.macosx-10.7-x86_64-3.6/tokenizers/tools
running build_ext
running build_rust
error: can't find Rust compiler
If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
To update pip, run:
pip install --upgrade pip
and then retry package installation.
If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
----------------------------------------
ERROR: Failed building wheel for tokenizers
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
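The error message points at two things: the Rust compiler is missing from the PATH, and an outdated pip may be unable to use a prebuilt wheel. Both can be inspected before retrying. This is a minimal diagnostic sketch using only the standard library; `diagnose_build_environment` is a hypothetical helper name, and it only reports what it finds, without changing anything.

```python
import shutil
import subprocess
import sys


def diagnose_build_environment():
    """Collect the two facts the error message asks about."""
    # Path to the Rust compiler, or None if it is not on PATH.
    rustc_path = shutil.which("rustc")
    # Ask the pip that belongs to the current interpreter for its version.
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    pip_info = result.stdout.strip() if result.returncode == 0 else None
    return {"rustc": rustc_path, "pip": pip_info}


info = diagnose_build_environment()
print("rustc on PATH:", info["rustc"] or "not found")
print("pip:", info["pip"] or "not available")
```

Calling pip through `sys.executable -m pip` checks the pip tied to the running interpreter, which may differ from the first `pip` on the PATH.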
In the end, running
pip install transformers==4.7.0
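installed transformers successfully, i.e. by pinning a specific version. With this release pip resolves tokenizers to 0.10.3 and downloads a prebuilt wheel for it, so no Rust compiler is needed.
The output of the command looks roughly like this: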
Collecting transformers==4.7.0
Downloading transformers-4.7.0-py3-none-any.whl (2.5 MB)
|████████████████████████████████| 2.5 MB 28 kB/s
Requirement already satisfied: pyyaml in /anaconda/lib/python3.6/site-packages (from transformers==4.7.0) (3.12)
Requirement already satisfied: importlib-metadata in /anaconda/lib/python3.6/site-packages (from transformers==4.7.0) (3.10.0)
Collecting sacremoses
Using cached sacremoses-0.0.47-py2.py3-none-any.whl (895 kB)
Requirement already satisfied: packaging in /anaconda/lib/python3.6/site-packages (from transformers==4.7.0) (16.8)
Collecting tqdm>=4.27
Using cached tqdm-4.62.3-py2.py3-none-any.whl (76 kB)
Requirement already satisfied: dataclasses in /anaconda/lib/python3.6/site-packages (from transformers==4.7.0) (0.8)
Collecting tokenizers<0.11,>=0.10.1
Downloading tokenizers-0.10.3-cp36-cp36m-macosx_10_11_x86_64.whl (2.2 MB)
|████████████████████████████████| 2.2 MB 15 kB/s
Requirement already satisfied: numpy>=1.17 in /anaconda/lib/python3.6/site-packages (from transformers==4.7.0) (1.19.5)
Collecting huggingface-hub==0.0.8
Downloading huggingface_hub-0.0.8-py3-none-any.whl (34 kB)
Collecting filelock
Using cached filelock-3.4.1-py3-none-any.whl (9.9 kB)
Collecting regex!=2019.12.17
Using cached regex-2022.1.18-cp36-cp36m-macosx_10_9_x86_64.whl (289 kB)
Requirement already satisfied: requests in /anaconda/lib/python3.6/site-packages (from transformers==4.7.0) (2.25.1)
Requirement already satisfied: typing-extensions>=3.6.4 in /anaconda/lib/python3.6/site-packages (from importlib-metadata->transformers==4.7.0) (3.7.4.3)
Requirement already satisfied: zipp>=0.5 in /anaconda/lib/python3.6/site-packages (from importlib-metadata->transformers==4.7.0) (3.4.1)
Requirement already satisfied: idna<3,>=2.5 in /anaconda/lib/python3.6/site-packages (from requests->transformers==4.7.0) (2.5)
Requirement already satisfied: certifi>=2017.4.17 in /anaconda/lib/python3.6/site-packages (from requests->transformers==4.7.0) (2020.12.5)
Requirement already satisfied: chardet<5,>=3.0.2 in /anaconda/lib/python3.6/site-packages (from requests->transformers==4.7.0) (3.0.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /anaconda/lib/python3.6/site-packages (from requests->transformers==4.7.0) (1.26.4)
Requirement already satisfied: click in /anaconda/lib/python3.6/site-packages (from sacremoses->transformers==4.7.0) (6.7)
Requirement already satisfied: joblib in /anaconda/lib/python3.6/site-packages (from sacremoses->transformers==4.7.0) (0.11)
Requirement already satisfied: six in /anaconda/lib/python3.6/site-packages (from sacremoses->transformers==4.7.0) (1.12.0)
Installing collected packages: tqdm, regex, filelock, tokenizers, sacremoses, huggingface-hub, transformers
Attempting uninstall: tqdm
Found existing installation: tqdm 4.19.9
Uninstalling tqdm-4.19.9:
Successfully uninstalled tqdm-4.19.9
Successfully installed filelock-3.4.1 huggingface-hub-0.0.8 regex-2022.1.18 sacremoses-0.0.47 tokenizers-0.10.3 tqdm-4.62.3 transformers-4.7.0
3. Installing sentence-transformers
pip install sentence-transformers
The output of the command looks roughly like this:
Collecting sentence-transformers
Using cached sentence_transformers-2.2.0-py3-none-any.whl
Requirement already satisfied: tqdm in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (4.62.3)
Requirement already satisfied: scipy in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (1.2.1)
Requirement already satisfied: torchvision in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (0.8.1)
Requirement already satisfied: numpy in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (1.19.5)
Requirement already satisfied: transformers<5.0.0,>=4.6.0 in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (4.7.0)
Requirement already satisfied: huggingface-hub in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (0.0.8)
Requirement already satisfied: nltk in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (3.2.3)
Requirement already satisfied: torch>=1.6.0 in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (1.7.0)
Collecting sentencepiece
Using cached sentencepiece-0.1.96-cp36-cp36m-macosx_10_6_x86_64.whl (1.2 MB)
Requirement already satisfied: scikit-learn in /anaconda/lib/python3.6/site-packages (from sentence-transformers) (0.19.1)
Requirement already satisfied: future in /anaconda/lib/python3.6/site-packages (from torch>=1.6.0->sentence-transformers) (0.17.1)
Requirement already satisfied: dataclasses in /anaconda/lib/python3.6/site-packages (from torch>=1.6.0->sentence-transformers) (0.8)
Requirement already satisfied: typing-extensions in /anaconda/lib/python3.6/site-packages (from torch>=1.6.0->sentence-transformers) (3.7.4.3)
Requirement already satisfied: regex!=2019.12.17 in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (2022.1.18)
Requirement already satisfied: sacremoses in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (0.0.47)
Requirement already satisfied: importlib-metadata in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (3.10.0)
Requirement already satisfied: tokenizers<0.11,>=0.10.1 in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (0.10.3)
Requirement already satisfied: pyyaml in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (3.12)
Requirement already satisfied: packaging in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (16.8)
Requirement already satisfied: filelock in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (3.4.1)
Requirement already satisfied: requests in /anaconda/lib/python3.6/site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (2.25.1)
Requirement already satisfied: six in /anaconda/lib/python3.6/site-packages (from nltk->sentence-transformers) (1.12.0)
Requirement already satisfied: pillow>=4.1.1 in /anaconda/lib/python3.6/site-packages (from torchvision->sentence-transformers) (4.1.1)
Requirement already satisfied: olefile in /anaconda/lib/python3.6/site-packages (from pillow>=4.1.1->torchvision->sentence-transformers) (0.44)
Requirement already satisfied: zipp>=0.5 in /anaconda/lib/python3.6/site-packages (from importlib-metadata->transformers<5.0.0,>=4.6.0->sentence-transformers) (3.4.1)
Requirement already satisfied: idna<3,>=2.5 in /anaconda/lib/python3.6/site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (2.5)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /anaconda/lib/python3.6/site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (1.26.4)
Requirement already satisfied: certifi>=2017.4.17 in /anaconda/lib/python3.6/site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (2020.12.5)
Requirement already satisfied: chardet<5,>=3.0.2 in /anaconda/lib/python3.6/site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (3.0.3)
Requirement already satisfied: click in /anaconda/lib/python3.6/site-packages (from sacremoses->transformers<5.0.0,>=4.6.0->sentence-transformers) (6.7)
Requirement already satisfied: joblib in /anaconda/lib/python3.6/site-packages (from sacremoses->transformers<5.0.0,>=4.6.0->sentence-transformers) (0.11)
Installing collected packages: sentencepiece, sentence-transformers
Successfully installed sentence-transformers-2.2.0 sentencepiece-0.1.96
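With all three packages in place, a final check can list the versions that Python actually imports. The sketch below is an assumption-laden convenience, not part of the official instructions: `installed_versions` is a hypothetical helper, and it relies on each package exposing a `__version__` attribute.

```python
import importlib


def installed_versions(modules=("torch", "transformers", "sentence_transformers")):
    """Map each module name to its version string, or None if import fails."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing or broken install is recorded instead of aborting.
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in installed_versions().items():
    print(name, "->", version or "not importable")
```

Note that the import name is `sentence_transformers` (with an underscore), while the pip package name is `sentence-transformers`.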