Garbled output and IndexError: piece id is out of range when running the CLI after fine-tuning ChatGLM2 with LLaMA-Factory's self_cognition dataset

Fine-tuning command:

CUDA_VISIBLE_DEVICES=0 python /aaabbb/LLaMA-Factory/src/train_bash.py \
    --stage sft \
    --model_name_or_path /aaabbb/LLaMA-Factory/models/chatglm2-6b \
    --do_train \
    --dataset self_cognition \
    --template chatglm2 \
    --finetuning_type lora \
    --lora_target query_key_value \
    --output_dir output/chatglm2_sft_lora_self/ \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 10 \
    --learning_rate 5e-5 \
    --num_train_epochs 10 \
    --plot_loss \
    --fp16

After fine-tuning, a checkpoint-50 directory is produced. Then run the following cli_demo command:

python src/cli_demo.py \
    --model_name_or_path /aaabbb/LLaMA-Factory/models/chatglm2-6b \
    --template chatglm2 \
    --finetuning_type lora \
    --checkpoint_dir output/chatglm2_sft_lora_self/checkpoint-50/ \
    --fp16

At this point, entering any prompt produces incoherent, garbled output like the following, and then an error is raised:

User: 你是谁?
Assistant: 许多人教育教学 Derby问题导向辛亥革命捗xtonodus玖冇 conting在今年析osto国画゚瞟结核otech灑Exception in thread Thread-4 (generate):
Traceback (most recent call last):
  File "/xx_llama_factory_py310/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/xx_llama_factory_py310/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/transformers/generation/utils.py", line 1652, in generate
    return self.sample(
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/transformers/generation/utils.py", line 2781, in sample
    streamer.put(next_tokens.cpu())
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/transformers/generation/streamers.py", line 97, in put
    text = self.tokenizer.decode(self.token_cache, **self.decode_kwargs)
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3738, in decode
    return self._decode(
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 1001, in _decode
    filtered_tokens = self.convert_ids_to_tokens(token_ids, skip_special_tokens=skip_special_tokens)
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 982, in convert_ids_to_tokens
    tokens.append(self._convert_id_to_token(index))
  File "/xxxcache/huggingface/modules/transformers_modules/chatglm2-6b/tokenization_chatglm.py", line 125, in _convert_id_to_token
    return self.tokenizer.convert_id_to_token(index)
  File "/xxxcache/huggingface/modules/transformers_modules/chatglm2-6b/tokenization_chatglm.py", line 60, in convert_id_to_token
    return self.sp_model.IdToPiece(index)
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/sentencepiece/__init__.py", line 1045, in _batched_func
    return _func(self, arg)
  File "/xx_llama_factory_py310/lib/python3.10/site-packages/sentencepiece/__init__.py", line 1038, in _func
    raise IndexError('piece id is out of range.')
IndexError: piece id is out of range.
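The traceback shows what goes wrong at decode time: the model emits a token id larger than the SentencePiece vocabulary, so `sp_model.IdToPiece(index)` has nothing to look up. A minimal sketch of that bounds check, using a tiny hypothetical vocabulary (the real ChatGLM2 tokenizer has a vocabulary of roughly 65k pieces):

```python
# Hypothetical 4-entry vocabulary standing in for the SentencePiece model.
VOCAB = ["<unk>", "<bos>", "<eos>", "hello"]

def id_to_piece(index: int) -> str:
    """Mimics SentencePiece's IdToPiece: valid ids map to pieces,
    out-of-range ids raise the exact error seen in the traceback."""
    if index < 0 or index >= len(VOCAB):
        raise IndexError("piece id is out of range.")
    return VOCAB[index]

print(id_to_piece(3))        # a valid id decodes normally
try:
    id_to_piece(70000)       # an id beyond the vocab, as produced here
except IndexError as e:
    print(e)                 # piece id is out of range.
```

So the decode error is a symptom, not the disease: something upstream (the model's logits) made generation pick an id outside the vocabulary.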

In addition, when training with the command above, the loss stayed at 0 for the entire run and never changed.

Cause:

  • A loss that is constantly 0 indicates a numeric overflow under fp16; make sure the model files are up to date
  • Also verify that the downloaded .bin weight files are complete and not corrupted
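To see why fp16 is implicated: float16 tops out at 65504, so any larger intermediate value overflows to inf, and inf propagating through softmax/loss produces NaN, which the trainer can end up logging as 0 and which yields meaningless logits at generation time. A quick NumPy illustration of the overflow chain (illustrative only, not LLaMA-Factory code):

```python
import numpy as np

# float16 can represent at most 65504; anything larger overflows to inf.
print(np.finfo(np.float16).max)        # 65504.0
x = np.float16(70000.0)
print(np.isinf(x))                     # True

# Once an inf enters the logits, softmax normalization yields NaN
# (inf - inf is NaN), so the loss and sampled token ids are garbage.
logits = np.array([1.0, x], dtype=np.float16)
probs = np.exp(logits - logits.max())  # contains inf - inf -> nan
print(np.isnan(probs).any())           # True
```

With corrupted or stale weight files the same overflow happens immediately, which matches both symptoms: a frozen loss of 0 during training and out-of-range token ids at inference.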

Fix:

  • Use the latest model files and make sure the download is complete (on some network connections it is hard to reliably download files that large); with intact weights the problem disappears
  • Follow this guide to download the model correctly:
    • https://blog.csdn.net/ybdesire/article/details/134204332
