My current DataFrame is:
# required libraries
import pandas as pd
dict_noticia = {
    'nome_adm': ['CC Brasil',
                 'ABC Futuro Esporte',
                 'Tabuao'],
    'noticia': ["['folha', 'paulo', 'https', 'east', 'amazonaws', 'multclipp', 'arquivos', 'noticias', 'pdf', 'jpg', 'mônica', 'bergamo', 'longo', 'tempo']",
                "['coluna', 'estadão']",
                "['flamengo', 'futebol','melhor','campeao','é']"]
}
df = pd.DataFrame(dict_noticia)
df
I need a new column containing the lemmas of the 'noticia' column. The script below raises an error:
import stanza
nlp_stanza = stanza.Pipeline(lang='pt', processors='tokenize,mwt,pos,lemma')
def f_lematizacao_stanza(df,column_name,new_column_name):
    df[new_column_name] = df[column_name].apply(lambda x: ([w.lemma_ for w in nlp_stanza(row)]))
    return df
f_lematizacao_stanza(data,'noticia','noticia_lema')
NameError: name 'row' is not defined
How can I fix this?
Thanks in advance.
You never defined the variable row. The lambda's parameter is named x, so that is the name to pass to nlp_stanza:
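def f_lematizacao_stanza(df, column_name, new_column_name):
    # stanza words expose .lemma (.lemma_ is spaCy's attribute), and a stanza
    # Document is walked with iter_words() rather than iterated directly
    df[new_column_name] = df[column_name].apply(
        lambda x: [w.lemma for w in nlp_stanza(x).iter_words()])
    return df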
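One more thing that may bite you after the NameError is gone: in your sample data each noticia cell is a string that merely looks like a list, so stanza would also see the brackets, quotes and commas as text. Below is a minimal sketch of one way around that, assuming you want the cell parsed back into a list first. The helper names parse_token_list, lemmatize_cell and make_stanza_lemmatizer are made up for illustration; the stanza part reads lemmas through iter_words() and the lemma attribute, and is imported lazily because it needs the Portuguese models to be downloaded.

```python
import ast


def parse_token_list(cell):
    """Turn a cell such as "['coluna', 'estadão']" into a real list of strings."""
    if isinstance(cell, list):
        return cell  # already a list, nothing to parse
    return ast.literal_eval(cell)


def lemmatize_cell(cell, lemmatize_text):
    """Lemmatize one cell; lemmatize_text maps a string to a list of lemmas."""
    tokens = parse_token_list(cell)
    if not tokens:
        return []
    # Join the tokens back into plain text so the lemmatizer sees only words
    return lemmatize_text(" ".join(tokens))


def make_stanza_lemmatizer():
    """Build a stanza-backed lemmatizer (assumes the 'pt' models are installed)."""
    import stanza  # imported here so the helpers above work without stanza
    nlp = stanza.Pipeline(lang='pt', processors='tokenize,mwt,pos,lemma')
    return lambda text: [w.lemma for w in nlp(text).iter_words()]


# Usage with the DataFrame from the question (not executed here):
# lemmatize_text = make_stanza_lemmatizer()
# df['noticia_lema'] = df['noticia'].apply(
#     lambda x: lemmatize_cell(x, lemmatize_text))
```

Because the mwt processor can split contractions into several words, the number of lemmas in a row may differ from the number of tokens you started with.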