I have an annotated dataset (TRAIN_DATA) that I am using to build my own NER model:
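    import random

    import spacy
    from spacy.training import Example
    from spacy.util import compounding, minibatch

    # TRAIN_DATA, model, nIter and dropout are defined elsewhere in the script.
    nlp = spacy.blank("en")
    if "ner" not in nlp.pipe_names:
        nlp.add_pipe("ner", last=True)

    # Convert the (text, annotations) pairs into Example objects.
    examples_train = []
    for text, annotations in TRAIN_DATA:
        examples_train.append(Example.from_dict(nlp.make_doc(text), annotations))

    # Train only the NER component; every other pipe is disabled.
    pipe_exceptions = ["ner"]
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]
    with nlp.disable_pipes(*other_pipes):
        if model is None:
            optimizer_default = nlp.initialize()
        else:
            nlp.create_optimizer()
        for itn in range(nIter):
            random.shuffle(examples_train)
            losses_train = {}
            batches = minibatch(examples_train, size=compounding(4.0, 32.0, 1.001))
            for batch in batches:
                try:
                    if model is None:
                        nlp.update(
                            batch,
                            drop=dropout,
                            losses=losses_train,
                            sgd=optimizer_default,
                        )
                    else:
                        nlp.update(
                            batch,
                            drop=dropout,
                            losses=losses_train,
                        )
                except Exception:
                    # The handler is not shown in the post; re-raise so errors surface.
                    raise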
This code works fine when creating a blank model, but when I try to update the existing en_core_web_trf model I get a ValueError. See the full traceback:
        train_model(model, os.path.dirname(os.path.abspath(__file__)) + '/trained_models/' + modelFile, useCuda, spacy_model_type)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/main.py", line 28, in train_model
        nlp, plt = trainSpacyModel(path_train_data, path_valid_data, LABEL, dropout, nIter, spacy_model_type)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/dospacy.py", line 283, in trainSpacyModel
        nlp, plt = trainSpacy(TRAIN_DATA, VALID_DATA, dropout, nIter, spacy_model_type)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/dospacy.py", line 186, in trainSpacy
        nlp.update(
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/spacy/language.py", line 1123, in update
        proc.update(examples, sgd=None, losses=losses, **component_cfg[name])
      File "spacy/pipeline/transition_parser.pyx", line 395, in spacy.pipeline.transition_parser.Parser.update
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/thinc/model.py", line 309, in begin_update
        return self._func(self, X, is_train=True)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/spacy/ml/tb_framework.py", line 33, in forward
        step_model = ParserStepModel(
      File "spacy/ml/parser_model.pyx", line 216, in spacy.ml.parser_model.ParserStepModel.__init__
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/thinc/model.py", line 291, in __call__
        return self._func(self, X, is_train=is_train)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/thinc/layers/chain.py", line 54, in forward
        Y, inc_layer_grad = layer(X, is_train=is_train)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/thinc/model.py", line 291, in __call__
        return self._func(self, X, is_train=is_train)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/thinc/layers/chain.py", line 54, in forward
        Y, inc_layer_grad = layer(X, is_train=is_train)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/thinc/model.py", line 291, in __call__
        return self._func(self, X, is_train=is_train)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/spacy_transformers/layers/listener.py", line 58, in forward
        model.verify_inputs(docs)
      File "/Users/miloscuculovic/PycharmProjects/NER_models_reviewer_comments/venv/lib/python3.8/site-packages/spacy_transformers/layers/listener.py", line 47, in verify_inputs
        raise ValueError
    ValueError
It turns out my problem is related to the following issue raised on the spaCy GitHub: https://github.com/explosion/spaCy/issues/6675

Changing pipe_exceptions by adding 'transformer' solved the problem. So I changed

    pipe_exceptions = ["ner"]

to

    pipe_exceptions = ["ner", "transformer"]
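
My understanding of why this helps, going by the traceback: in en_core_web_trf the ner component takes its input from the transformer through a listener layer, so if the transformer pipe is disabled during nlp.update the listener has nothing to verify and raises the bare ValueError. Here is a minimal sketch of the filtering step on its own. The helper name pipes_to_disable is hypothetical, and the hard-coded component list is an assumption about what en_core_web_trf ships with; in real code use nlp.pipe_names instead.

```python
def pipes_to_disable(pipe_names, keep=("ner", "transformer")):
    """Return the pipes to disable during training, preserving pipeline order.

    Hypothetical helper: everything not listed in `keep` gets disabled.
    """
    keep = set(keep)
    return [name for name in pipe_names if name not in keep]


# Assumed component list for en_core_web_trf; replace with nlp.pipe_names.
trf_pipe_names = [
    "transformer",
    "tagger",
    "parser",
    "attribute_ruler",
    "lemmatizer",
    "ner",
]

other_pipes = pipes_to_disable(trf_pipe_names)
print(other_pipes)

# With a loaded pipeline this list would then be passed on, e.g.:
#     with nlp.disable_pipes(*other_pipes):
#         ...training loop...
```

With keep=("ner",) the same helper returns a list that includes "transformer", which is the configuration that triggered the error in my case.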