Stanza / RuntimeError: Integer division of tensors using div or / is no longer supported



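I want to use stanza to tokenize, tag, and parse some texts I have, but it keeps giving me this error. I tried changing the way I call it, but nothing changed. Any ideas?

My code (here I iterate over a list of texts and apply stanza to each one):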

from time import time

t = time()
data_stanza = []
# data is a list of rows; the text is the first element of each row
for text in data:
    stz = apply_stanza(text[0])
    data_stanza.append(stz)
print('Time to run: {} mins'.format(round((time() - t) / 60, 2)))

This is the function I use to apply stanza to each text:

import stanza

nlp = stanza.Pipeline('pt')

def apply_stanza(text):
    doc = nlp(text)
    All = []
    # collect one tuple of annotations per word
    for sent in doc.sentences:
        for word in sent.words:
            All.append((word.id, word.text, word.lemma, word.upos, word.feats, word.head, word.deprel))
    return All

The error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-17-7ac303eec8e8> in <module>
3 data_staza = []
4 for text in data:
----> 5     stz = apply_stanza(text[0])
6     data_stanza.append(stz)
7 
<ipython-input-16-364c3ac30f32> in apply_stanza(text)
2 
3 def apply_stanza(text):
----> 4     doc = nlp(text)
5     All = []
6     for sent in doc.sentences:
~\anaconda3\lib\site-packages\stanza\pipeline\core.py in __call__(self, doc)
174         assert any([isinstance(doc, str), isinstance(doc, list),
175                     isinstance(doc, Document)]), 'input should be either str, list or Document'
--> 176         doc = self.process(doc)
177         return doc
178 
~\anaconda3\lib\site-packages\stanza\pipeline\core.py in process(self, doc)
168         for processor_name in PIPELINE_NAMES:
169             if self.processors.get(processor_name):
--> 170                 doc = self.processors[processor_name].process(doc)
171         return doc
172 
~\anaconda3\lib\site-packages\stanza\pipeline\mwt_processor.py in process(self, document)
31                 preds = []
32                 for i, b in enumerate(batch):
---> 33                     preds += self.trainer.predict(b)
34 
35                 if self.config.get('ensemble_dict', False):
~\anaconda3\lib\site-packages\stanza\models\mwt\trainer.py in predict(self, batch, unsort)
77         self.model.eval()
78         batch_size = src.size(0)
---> 79         preds, _ = self.model.predict(src, src_mask, self.args['beam_size'])
80         pred_seqs = [self.vocab.unmap(ids) for ids in preds] # unmap to tokens
81         pred_seqs = utils.prune_decoded_seqs(pred_seqs)
~\anaconda3\lib\site-packages\stanza\models\common\seq2seq_model.py in predict(self, src, src_mask, pos, beam_size)
259             done = []
260             for b in range(batch_size):
--> 261                 is_done = beam[b].advance(log_probs.data[b])
262                 if is_done:
263                     done += [b]
~\anaconda3\lib\site-packages\stanza\models\common\beam.py in advance(self, wordLk, copy_indices)
82         # bestScoresId is flattened beam x word array, so calculate which
83         # word and beam each score came from
---> 84         prevK = bestScoresId / numWords
85         self.prevKs.append(prevK)
86         self.nextYs.append(bestScoresId - prevK * numWords)
RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform 
true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

Update: it turned out to be a bug in the mwt module of the stanza pipeline, so I simply specified that it should not be used.
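A minimal sketch of that workaround, assuming the installed stanza version accepts a comma-separated `processors` string in `stanza.Pipeline`. The helper `processors_without` and the `DEFAULT_PROCESSORS` list are hypothetical names introduced here for illustration; they are not from the original post, and the exact default processor list for 'pt' is an assumption.

```python
# Assumed default stages for the Portuguese pipeline (not confirmed by the post).
DEFAULT_PROCESSORS = "tokenize,mwt,pos,lemma,depparse"


def processors_without(processors, excluded):
    """Return the comma-separated processor list minus the excluded names.

    Hypothetical helper, not part of stanza.
    """
    return ",".join(p for p in processors.split(",") if p not in excluded)


PROCESSORS = processors_without(DEFAULT_PROCESSORS, {"mwt"})
print(PROCESSORS)

# With stanza installed and the 'pt' models downloaded, the pipeline would
# then be created like this instead of stanza.Pipeline('pt'):
#   import stanza
#   nlp = stanza.Pipeline('pt', processors=PROCESSORS)
```

Keep in mind that without mwt, Portuguese contractions such as "do" (de + o) are not split into separate words, and some stanza versions may insist on mwt for this language. In that case, installing stanza and PyTorch releases that are compatible with each other is the other route.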

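Use // instead of / for the division.

The division that fails is the one the traceback points to, in stanza's beam.py, not the timing line in your own code (there time() - t is a plain Python float, so / is fine). Try editing that line as follows:

prevK = bestScoresId // numWords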

Floor division (//) rounds the result down to the largest integer that does not exceed the true quotient.
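Why that line needs floor division rather than true division can be seen with plain Python integers. The sketch below mirrors the index arithmetic shown in the traceback (recovering the beam and the word from an index into a flattened beam x word array); the variable names follow `bestScoresId`, `numWords`, and `prevK` from beam.py, and the concrete values are made up for illustration.

```python
num_words = 5        # vocabulary size (made-up value)
best_score_id = 13   # index into the flattened beam x word array (made-up value)

# Floor division gives the beam the score came from ...
prev_k = best_score_id // num_words
# ... and the remainder gives the word within that beam.
next_y = best_score_id - prev_k * num_words

# The pair is exactly what divmod returns for non-negative integers.
assert (prev_k, next_y) == divmod(best_score_id, num_words)
print(prev_k, next_y)

# True division would give a fractional value that is useless as an index.
print(best_score_id / num_words)
```

The same reasoning applies to the tensors in beam.py: the result is used as an index, so it has to stay an integer.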

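Use torch.true_divide(dividend, divisor) or numpy.true_divide.

For example, torch.true_divide(torch.tensor(3), torch.tensor(4)) returns tensor(0.7500), the same value as 3/4 in Python 3. Note that torch.true_divide expects a tensor as its first argument, not a plain Python number.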

https://pytorch.org/docs/stable/generated/torch.true_divide.html
https://numpy.org/doc/stable/reference/generated/numpy.true_divide.html
