How to use and download the various pretrained models on Hugging Face

- Loading a model

There are a huge number of open-source pretrained models on Hugging Face; the official site is https://huggingface.co, where you can search for whatever model you want.
To use them, you only need to install transformers:

pip install transformers

Loading a model is just as simple; three lines and you're done:

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")  # substitute any model ID from the Hub
model = AutoModel.from_pretrained("bert-base-chinese")

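Once loaded, the tokenizer and model work together in the usual way. Here is a minimal sketch of encoding a sentence and running a forward pass (it assumes PyTorch is installed; the sample sentence is just an illustration):

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModel.from_pretrained("bert-base-chinese")

# encode a sentence into input IDs and an attention mask, as PyTorch tensors
inputs = tokenizer("你好,世界!", return_tensors="pt")
# run a forward pass to get the hidden states
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
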
But! Inside mainland China this approach very often fails with HTTP errors. A convenient workaround is to download the pretrained models to your local machine first. Here is how:

- Install Git LFS

These are the installation steps for Linux (Debian/Ubuntu); for other systems, a quick web search will turn up instructions:

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install

- Download a pretrained model locally

For example:

git lfs install
git clone https://huggingface.co/distilbert-base-uncased # this is the model's URL on the official site
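
One gotcha: if Git LFS is not set up correctly, the clone will contain small text pointer files instead of the real weights. A quick sanity check in Python, assuming the repository was cloned into ./distilbert-base-uncased:

import os

repo = "./distilbert-base-uncased"  # path to the local clone
for name in sorted(os.listdir(repo)):
    path = os.path.join(repo, name)
    if os.path.isfile(path):
        # real weight files are tens to hundreds of MB; LFS pointer files are only a few hundred bytes
        print(f"{name}: {os.path.getsize(path) / 1e6:.1f} MB")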

- Test after downloading

Once the clone finishes, run the same three loading lines from the directory that contains the cloned folder, and the model is picked up by name from local disk instead of the network.
Very convenient!

>>> from transformers import DistilBertModel
>>> model = DistilBertModel.from_pretrained("distilbert-base-uncased")
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_projector.bias', 'vocab_projector.weight', 'vocab_transform.weight', 'vocab_layer_norm.weight', 'vocab_layer_norm.bias', 'vocab_transform.bias']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
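
If you run Python somewhere other than the directory containing the clone, you can pass the path to the cloned folder explicitly instead. A minimal sketch, assuming the clone lives at ./distilbert-base-uncased:

from transformers import AutoTokenizer, AutoModel

# point from_pretrained at the local directory instead of a Hub model ID
local_path = "./distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModel.from_pretrained(local_path)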
