How to translate a pandas Series in Python using googletrans



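I want to translate the text in a pandas column from Indonesian to English and add the translated text to my DataFrame as a new column named "English". Here is my code: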

from googletrans import Translator
translator = Translator()
df['English'] = translator.translate(df['Review to Translate'], src='id', dest='en')

However, I get this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-81-0fd41a244785> in <module>()
2 
3 translator = Translator()
----> 4 y['Review in English'] = translator.translate(y['Review to Translate'], src='id', dest='en')
~/anaconda3/lib/python3.6/site-packages/googletrans/client.py in translate(self, text, dest, src)
170 
171         origin = text
--> 172         data = self._translate(text, dest, src)
173 
174         # this code will be updated when the format is changed.
~/anaconda3/lib/python3.6/site-packages/googletrans/client.py in _translate(self, text, dest, src)
73             text = text.decode('utf-8')
74 
---> 75         token = self.token_acquirer.do(text)
76         params = utils.build_params(query=text, src=src, dest=dest,
77                                     token=token)
~/anaconda3/lib/python3.6/site-packages/googletrans/gtoken.py in do(self, text)
179     def do(self, text):
180         self._update()
--> 181         tk = self.acquire(text)
182         return tk
~/anaconda3/lib/python3.6/site-packages/googletrans/gtoken.py in acquire(self, text)
145         size = len(text)
146         for i, char in enumerate(text):
--> 147             l = ord(char)
148             # just append if l is less than 128(ascii: DEL)
149             if l < 128:
TypeError: ord() expected a character, but string of length 516 found

Does anyone know how I can fix this? My pandas DataFrame is fairly large.

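I suspect you get this error because you are passing a pandas Series object to the translate function (see the docs) rather than a str (string) object. Try using apply, which calls the function once per value: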

from googletrans import Translator
translator = Translator()
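# translate() returns a Translated object for each row; getattr pulls out its .text string
df['English'] = df['Review to Translate'].apply(translator.translate, src='id', dest='en').apply(getattr, args=('text',))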
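
As a rough illustration of why the original call fails: the traceback shows googletrans looping over its input and calling ord() on every element. Iterating over a string yields single characters, but iterating over a Series yields whole cell values, so ord() receives an entire review at once. The sketch below mimics that loop in plain Python. `first_ord_error` is a hypothetical helper written only for this illustration, and a list stands in for the Series so the snippet runs without pandas or googletrans.

```python
def first_ord_error(text):
    """Mimic the loop in the traceback: call ord() on each element of `text`.

    Returns the TypeError message if ord() rejects an element, else None.
    """
    try:
        for element in text:
            ord(element)
    except TypeError as exc:
        return str(exc)
    return None


review = "bagus sekali"

# A plain string is iterated character by character, so ord() is happy.
print(first_ord_error(review))

# A list (standing in for a Series) is iterated cell by cell, so ord()
# receives the whole review and raises a TypeError like the one above.
print(first_ord_error([review]))
```
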

I ran this example on repl.it:

from googletrans import Translator
import pandas as pd
translator = Translator()
df = pd.DataFrame({'Spanish':['piso','cama']})
df['English'] = df['Spanish'].apply(translator.translate, src='es', dest='en').apply(getattr, args=('text',))

It works as expected.
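
Since the question mentions a large DataFrame, one possible refinement is to translate each distinct string only once and reuse the result for repeated values. The sketch below assumes that identical cell values should receive identical translations. `translate_unique` and its `translate_one` parameter are hypothetical names, not part of googletrans or pandas, and non-string cells (such as NaN) are passed through untouched.

```python
def translate_unique(texts, translate_one):
    """Translate each distinct string once and return results in input order.

    texts: any iterable of cell values (a list, or a pandas Series).
    translate_one: a callable taking one str and returning its translation.
    """
    cache = {}
    results = []
    for value in texts:
        if not isinstance(value, str):
            # Leave missing or non-text cells (e.g. NaN) as they are.
            results.append(value)
            continue
        if value not in cache:
            cache[value] = translate_one(value)
        results.append(cache[value])
    return results


# Stand-in translator so the example runs offline; it also records each call.
calls = []


def fake_translate(text):
    calls.append(text)
    return text.upper()


translated = translate_unique(["piso", "cama", "piso", None], fake_translate)
print(translated)  # ['PISO', 'CAMA', 'PISO', None]
print(calls)       # ['piso', 'cama']
```

With googletrans, `translate_one` could be something like `lambda s: translator.translate(s, src='id', dest='en').text`, and the returned list can be assigned to `df['English']` when `texts` is `df['Review to Translate']`.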
