Saving and caching transformers models on Win10

Table of Contents

  • Caching models with the transformers package
  • Renaming cached files for offline use
  • Downloading a model
  • Loading from the cache
  • Fine-tuning models
  • Sequence classification

Caching models with the transformers package

from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", cache_dir='D:/xx/transformermodel')  # the model files are downloaded into this folder
model = TFAutoModel.from_pretrained("bert-base-uncased", cache_dir='D:/xx/transformermodel')

inputs = tokenizer("Hello world!", return_tensors="tf")
outputs = model(**inputs)
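If you want the files stored under their standard names rather than the cache's internal naming, save_pretrained writes config.json, tf_model.h5, and the tokenizer's vocab.txt directly, so from_pretrained can later load the directory offline. A minimal sketch continuing from the code above; the target directory is just an example path:

# Export under standard file names (the directory is an example)
tokenizer.save_pretrained('D:/xx/transformermodel/bert-base-uncased')
model.save_pretrained('D:/xx/transformermodel/bert-base-uncased')

# Later, reload entirely from disk, no network needed
tokenizer = AutoTokenizer.from_pretrained('D:/xx/transformermodel/bert-base-uncased')
model = TFAutoModel.from_pretrained('D:/xx/transformermodel/bert-base-uncased')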

Renaming cached files for offline use

(Screenshots: the cache directory's files before and after renaming.)
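What the screenshots illustrate (an assumption, based on how older transformers versions organize the cache): each downloaded file is stored under a hash-like name with a companion .json recording its source URL. Renaming the data files to their standard names (config.json, vocab.txt, tf_model.h5) turns the cache folder into a plain model directory that from_pretrained can load without a network connection:

from transformers import AutoTokenizer, TFAutoModel

# Assumes the cached files were renamed to config.json / vocab.txt / tf_model.h5
tokenizer = AutoTokenizer.from_pretrained('D:/xx/transformermodel')
model = TFAutoModel.from_pretrained('D:/xx/transformermodel')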

Downloading a model

from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-cased", cache_dir='./transformermodel/BertTokenizer')
sequence = "A Titan RTX has 24GB of VRAM"
tokenized_sequence = tokenizer.tokenize(sequence)  # split into WordPiece tokens
print(tokenized_sequence)
# encode
inputs = tokenizer(sequence)
encoded_sequence = inputs["input_ids"]  # inputs also holds token_type_ids and attention_mask
print(encoded_sequence)
# decode
decoded_sequence = tokenizer.decode(encoded_sequence)
print(decoded_sequence)
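The returned inputs is a dict with three parallel fields, and decode() restores the text with BERT's special tokens wrapped around it. A quick sketch continuing from the code above:

# Inspect everything the tokenizer produced
for key, value in inputs.items():
    print(key, value)  # input_ids, token_type_ids, attention_mask

# decode() re-adds the special tokens, so the printed result is:
# [CLS] A Titan RTX has 24GB of VRAM [SEP]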

Loading from the cache

from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('./transformermodel/BertTokenizer')
sequence = "A Titan RTX has 24GB of VRAM"
tokenized_sequence = tokenizer.tokenize(sequence)  # split into WordPiece tokens
print(tokenized_sequence)
# encode
inputs = tokenizer(sequence)
encoded_sequence = inputs["input_ids"]  # inputs also holds token_type_ids and attention_mask
print(encoded_sequence)
# decode
decoded_sequence = tokenizer.decode(encoded_sequence)
print(decoded_sequence)
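To make sure loading really stays offline, newer transformers releases also accept local_files_only=True (an assumption for older installs; check your version), which raises an error instead of reaching for the network:

# local_files_only forces a disk-only load (available in newer releases)
tokenizer = BertTokenizer.from_pretrained('./transformermodel/BertTokenizer',
                                          local_files_only=True)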

Fine-tuning models

The run_xx.py scripts under the examples directory are the fine-tuning scripts.

Sequence classification

Fine-tuning scripts for sequence classification: run_glue.py, run_tf_glue.py, run_tf_text_classification.py, or run_xnli.py. A code-level alternative is sketched below.
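For reference, a minimal in-code sketch of sequence-classification fine-tuning with Keras (not the official scripts; the texts and labels are toy data made up for illustration):

import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = TFBertForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

# Toy data, purely for illustration
texts = ["I love this movie", "This film was terrible"]
labels = [1, 0]
encodings = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")

# The model outputs logits, hence from_logits=True
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(dict(encodings), tf.constant(labels), epochs=1, batch_size=2)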
