Fine-tuning BERT with my own entities/labels



I want to fine-tune a BERT model with my own labels, such as [COLOR, MATERIAL], instead of the usual "Name" and "Organization" entities.

I am following this Colab: https://colab.research.google.com/drive/14rYdqGAXJhwVzslXT4XIwNFBwkmBWdVV

I prepared train.txt, eval.txt, and test.txt as shown below:

-DOCSTART- -X- -X- O
blue B-COLOR
motorcicle B-CATEGORY
steel B-MATERIAL
etc.
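Before training, it can help to verify that every tag in the data files belongs to the label set the script expects. A minimal sanity-check sketch; the `EXPECTED_TAGS` set here is an assumption inferred from the tags shown above:

```python
# Hypothetical sanity check for CoNLL-style data files: report any tag
# that is not in the label set you plan to register with the script.
EXPECTED_TAGS = {"O", "B-COLOR", "I-COLOR", "B-CATEGORY", "I-CATEGORY",
                 "B-MATERIAL", "I-MATERIAL"}  # assumption: your custom tag set

def unknown_tags(path):
    """Return the set of tags in `path` that are not in EXPECTED_TAGS."""
    unknown = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip blank lines (sentence separators) and document markers.
            if not line or line.startswith("-DOCSTART-"):
                continue
            tag = line.split()[-1]  # the tag is the last column
            if tag not in EXPECTED_TAGS:
                unknown.add(tag)
    return unknown
```

Any tag this reports is one that would later trigger a `KeyError` during feature conversion.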

But when I execute this command

!python run_ner.py --data_dir=data/ --bert_model=bert-base-multilingual-cased --task_name=ner --output_dir=out_ner --max_seq_length=128 --do_train --num_train_epochs 5 --do_eval --warmup_proportion=0.1

I get this error:

06/08/2020 13:30:27 - INFO - pytorch_transformers.modeling_utils -   loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin from cache at /root/.cache/torch/pytorch_transformers/5b5b80054cd2c95a946a8e0ce0b93f56326dff9fbda6a6c3e02de3c91c918342.7131dcb754361639a7d5526985f880879c9bfd144b65a0bf50590bddb7de9059
06/08/2020 13:30:33 - INFO - pytorch_transformers.modeling_utils -   Weights of Ner not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
06/08/2020 13:30:33 - INFO - pytorch_transformers.modeling_utils -   Weights from pretrained model not used in Ner: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']

File "run_ner.py", line 594, in <module>
    main()
File "run_ner.py", line 464, in main
    train_examples, label_list, args.max_seq_length, tokenizer)
File "run_ner.py", line 210, in convert_examples_to_features
    label_ids.append(label_map[labels[i]])
KeyError: 'B-COLOR'

Did I create the train.txt file incorrectly?

Add those labels to the get_labels() method in the run_ner.py file and it will work.
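A sketch of what that change could look like, assuming get_labels() in run_ner.py returns a flat list of tag strings (the exact default list and whether "[CLS]"/"[SEP]" entries are required depend on the version of the script you are using):

```python
# Sketch: replace the default CoNLL label list in run_ner.py with your
# own tags. The B-/I- pairs and the "[CLS]"/"[SEP]" entries are
# assumptions based on how similar BERT-NER scripts are structured.
def get_labels():
    return ["O",
            "B-COLOR", "I-COLOR",
            "B-CATEGORY", "I-CATEGORY",
            "B-MATERIAL", "I-MATERIAL",
            "[CLS]", "[SEP]"]
```

The KeyError above happens because label_map is built from this list, so any tag that appears in train.txt but not here has no index.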

Did you change the label names in this file?

Latest update