Notes on using macanv/BERT-BiLSTM-CRF-NER

Git repository: https://github.com/macanv/BERT-BiLSTM-CRF-NER

1. Environment setup:

activate tf1.15

pip install tensorflow==1.15

pip install keras==2.3.1

Importing the packages to check their versions raised an error:

TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Downgrade protobuf as the message suggests:

pip install protobuf==3.19.0
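To confirm whether the environment still needs the downgrade, the installed protobuf version can be compared against the 3.20.x line named in the error message. The following is a minimal sketch; `parse_version` and `protobuf_too_new` are hypothetical helper names introduced here for illustration, not part of any library.

```python
def parse_version(text):
    """Turn a version string such as '3.19.0' into a tuple of ints, e.g. (3, 19, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def protobuf_too_new(installed, newest_ok=(3, 20)):
    """Return True when major.minor is above the 3.20.x line named in the error."""
    return parse_version(installed)[:2] > newest_ok


if __name__ == "__main__":
    try:
        import google.protobuf as protobuf
    except ImportError:
        print("protobuf is not installed in this environment")
    else:
        version = protobuf.__version__
        if protobuf_too_new(version):
            print("protobuf", version, "is newer than 3.20.x; downgrade it")
        else:
            print("protobuf", version, "should be fine")
```

Run it inside the tf1.15 environment; it only reads the installed version and changes nothing.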

2. Install the project

pip install bert-base==0.0.9 -i https://pypi.python.org/simple

Check that the command is available:

bert-base-ner-train -help

3. Test the model with the data recommended by the author

GitHub - kyzhouhzau/BERT-NER: Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

bert-base-ner-train  -data_dir C:\学习\工具下载\BERT-BiLSTM-CRF-NER-master\dataset\input\ -output_dir C:\学习\工具下载\BERT-BiLSTM-CRF-NER-master\dataset\output -init_checkpoint C:\学习\工具下载\cased_L-12_H-768_A-12\bert_model.ckpt -bert_config_file C:\学习\工具下载\cased_L-12_H-768_A-12\bert_config.json  -vocab_file C:\学习\工具下载\cased_L-12_H-768_A-12\vocab.txt -batch_size 4
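Because the command repeats the same two base directories several times, it can be easier to assemble it in Python. This is only a sketch: `build_train_command` is a hypothetical helper, the `dataset\input` and `dataset\output` layout is copied from the command above, and the flags are exactly the ones already used there.

```python
from pathlib import PureWindowsPath


def build_train_command(repo_dir, bert_dir, batch_size):
    """Assemble the bert-base-ner-train argument list from two base directories."""
    repo = PureWindowsPath(repo_dir)
    bert = PureWindowsPath(bert_dir)
    return [
        "bert-base-ner-train",
        "-data_dir", str(repo / "dataset" / "input"),
        "-output_dir", str(repo / "dataset" / "output"),
        "-init_checkpoint", str(bert / "bert_model.ckpt"),
        "-bert_config_file", str(bert / "bert_config.json"),
        "-vocab_file", str(bert / "vocab.txt"),
        "-batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    cmd = build_train_command(
        r"C:\学习\工具下载\BERT-BiLSTM-CRF-NER-master",
        r"C:\学习\工具下载\cased_L-12_H-768_A-12",
        4,
    )
    print(" ".join(cmd))
    # To launch training from Python instead of printing the command:
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list also avoids quoting problems if a path ever contains spaces.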

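The run reported that the training data was too small. The recommended dataset is structured differently from what the tool expects, so it could not be read correctly and has to be preprocessed first.

After preprocessing, rerunning produced an error: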
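These notes do not record the exact preprocessing, so the following is only a sketch under two assumptions: that the recommended files use the four-column CoNLL-2003 layout (token, POS tag, chunk tag, NER tag, with `-DOCSTART-` marker lines), and that the tool wants one `token label` pair per line with a blank line between sentences. `conll_to_two_column` is a hypothetical helper name.

```python
def conll_to_two_column(lines):
    """Keep only the first (token) and last (NER tag) column of each data line.

    Assumes CoNLL-2003 style input; -DOCSTART- markers are dropped and
    consecutive blank lines are collapsed into one sentence separator.
    """
    out = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: sentence boundary (skip if nothing emitted yet
            # or if the previous emitted line is already a boundary).
            if out and out[-1] != "":
                out.append("")
            continue
        if line.startswith("-DOCSTART-"):
            continue
        fields = line.split()
        out.append(fields[0] + " " + fields[-1])
    # Drop a trailing separator so the file does not end with an empty sentence.
    while out and out[-1] == "":
        out.pop()
    return out


if __name__ == "__main__":
    sample = [
        "-DOCSTART- -X- -X- O",
        "",
        "EU NNP B-NP B-ORG",
        "rejects VBZ B-VP O",
        "",
        "Peter NNP B-NP B-PER",
    ]
    print("\n".join(conll_to_two_column(sample)))
```

If the real files differ from these assumptions, for example a different column order, only the line that picks `fields[0]` and `fields[-1]` needs to change.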

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\tf1.15\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\ProgramData\Anaconda3\envs\tf1.15\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\envs\tf1.15\Scripts\bert-base-ner-train.exe\__main__.py", line 7, in <module>
  File "C:\ProgramData\Anaconda3\envs\tf1.15\lib\site-packages\bert_base\runs\__init__.py", line 38, in train_ner
    train(args=args)
  File "C:\ProgramData\Anaconda3\envs\tf1.15\lib\site-packages\bert_base\train\bert_lstm_ner.py", line 618, in train
    early_stopping_hook = tf.contrib.estimator.stop_if_no_decrease_hook(
AttributeError: module 'tensorflow.contrib.estimator' has no attribute 'stop_if_no_decrease_hook'

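This is caused by a TensorFlow version difference: in TF 1.15 the hook is no longer under tf.contrib.estimator. In bert_lstm_ner.py (line 618 in the traceback), change the call to: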

tf.estimator.experimental.stop_if_no_decrease_hook
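Instead of hard-coding one location, the hook could be looked up at runtime so the same code runs on either TensorFlow layout. The sketch below is an illustration, not the project's own code: `first_available` is a hypothetical helper, and the demo uses a stand-in object rather than the real `tensorflow` module so that it runs anywhere.

```python
from types import SimpleNamespace


def first_available(root, dotted_paths):
    """Return the first attribute path that exists under root.

    Raises AttributeError if none of the candidate paths can be resolved.
    """
    for path in dotted_paths:
        obj = root
        try:
            for name in path.split("."):
                obj = getattr(obj, name)
        except AttributeError:
            continue  # this location does not exist, try the next one
        return obj
    raise AttributeError("none of these exist: " + ", ".join(dotted_paths))


HOOK_LOCATIONS = [
    "estimator.experimental.stop_if_no_decrease_hook",  # location used in the fix above
    "contrib.estimator.stop_if_no_decrease_hook",       # location the project code expects
]

if __name__ == "__main__":
    # Stand-in that mimics a layout where only the newer location exists.
    fake_tf = SimpleNamespace(
        estimator=SimpleNamespace(
            experimental=SimpleNamespace(stop_if_no_decrease_hook="hook factory")
        )
    )
    print(first_available(fake_tf, HOOK_LOCATIONS))
```

With the real library this would be called as `first_available(tf, HOOK_LOCATIONS)` after `import tensorflow as tf`; whether both locations take identical arguments in every release is an assumption worth checking against the installed version.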

The next run failed with:

NotImplementedError: Cannot convert a symbolic Tensor (bert/encoder/strided_slice:0) to a numpy array.

Downgrading NumPy fixed it:

pip install numpy==1.16.5

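My GPU is too weak and kept hitting OOM (out of memory) errors, so I switched to running on the CPU:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide all GPUs; set this before TensorFlow initializes the GPU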
