How do I translate rows of French and Arabic text in a column into English?



I want to translate a DataFrame column that contains French and Arabic text:

0                                       Chef de projet
...
6                                           professeur
7                                       Chef de projet
8                                           مدير  شركة

I tried:

from googletrans import Translator
translator = Translator()
df['new_professionactuelle']= df['new_professionactuelle'].apply(translator.translate)

But I got:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-211-90b46ab0043a> in <module>
1 from googletrans import Translator
2 translator = Translator()
----> 3 df['new_professionactuelle']= df['new_professionactuelle'].apply(translator.translate)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
3589             else:
3590                 values = self.astype(object).values
-> 3591                 mapped = lib.map_infer(values, f, convert=convert_dtype)
3592 
3593         if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
C:\ProgramData\Anaconda3\lib\site-packages\googletrans\client.py in translate(self, text, dest, src)
170 
171         origin = text
--> 172         data = self._translate(text, dest, src)
173 
174         # this code will be updated when the format is changed.
C:\ProgramData\Anaconda3\lib\site-packages\googletrans\client.py in _translate(self, text, dest, src)
73             text = text.decode('utf-8')
74 
---> 75         token = self.token_acquirer.do(text)
76         params = utils.build_params(query=text, src=src, dest=dest,
77                                     token=token)
C:\ProgramData\Anaconda3\lib\site-packages\googletrans\gtoken.py in do(self, text)
199     def do(self, text):
200         self._update()
--> 201         tk = self.acquire(text)
202         return tk
C:\ProgramData\Anaconda3\lib\site-packages\googletrans\gtoken.py in acquire(self, text)
144         a = []
145         # Convert text to ints
--> 146         for i in text:
147             val = ord(i)
148             if val < 0x10000:
TypeError: 'NoneType' object is not iterable

I tried to select the rows that might be NoneType:

df['new_professionactuelle'][type(df['new_professionactuelle']) == "NoneType"]

But I got:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-215-f2597906f267> in <module>
----> 1 df['new_professionactuelle'][type(df['new_professionactuelle']) == "NoneType"]
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
866         key = com.apply_if_callable(key, self)
867         try:
--> 868             result = self.index.get_value(self, key)
869 
870             if not is_scalar(result):
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
4373         try:
4374             return self._engine.get_value(s, k,
-> 4375                                           tz=getattr(series.dtype, 'tz', None))
4376         except KeyError as e1:
4377             if len(self) > 0 and (self.holds_integer() or self.is_boolean()):
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()
KeyError: False
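
For what it's worth, the KeyError: False above seems to come from the comparison itself: type(df['new_professionactuelle']) is the Series class, so comparing it to the string "NoneType" gives a single False rather than a per-row mask, and indexing the Series with False then looks for a label named False. A minimal sketch of an element-wise check with isna() instead, using made-up sample data in place of the real DataFrame:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's column.
df = pd.DataFrame(
    {"new_professionactuelle": ["Chef de projet", None, "professeur", float("nan")]}
)
col = df["new_professionactuelle"]

# type(col) is the Series class itself, so this is one scalar False, not a mask.
scalar_check = type(col) == "NoneType"

# isna() tests every element and flags both None and NaN.
missing_mask = col.isna()
missing_rows = col[missing_mask]
print(missing_rows)
```

Any row flagged here would be passed to translator.translate as a missing value, which is consistent with the "'NoneType' object is not iterable" error in the first traceback.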

You can try this code to translate all of the text into English.

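import pandas as pd
from googletrans import Translator

translator = Translator()

def toenglish(x):
    # Missing values (None/NaN) are what triggered the TypeError; pass them through.
    if pd.isna(x):
        return x
    # dest='en' sets the target language; the source language is auto-detected.
    result = translator.translate(x, dest='en')
    return result.text

# Apply to the column from the question and overwrite it with the translations.
df['new_professionactuelle'] = df['new_professionactuelle'].apply(toenglish)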
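
If the column has many repeated values (as in the sample, where "Chef de projet" appears more than once), it may be worth translating each distinct value only once and mapping the results back, which also cuts down on the number of requests. A rough sketch follows. translate_column and fake_translate are hypothetical names, and fake_translate is only a stand-in so the example runs offline. With googletrans you would presumably pass something like lambda s: translator.translate(s, dest='en').text, assuming a version where translate is a regular synchronous call as in the code above.

```python
import pandas as pd

def translate_column(series, translate_fn):
    """Translate each distinct non-missing value once; missing values stay missing."""
    unique_values = series.dropna().unique()
    mapping = {value: translate_fn(value) for value in unique_values}
    # Series.map with a dict leaves values that have no entry (the missing ones) as NaN.
    return series.map(mapping)

# Stand-in for a real translator: records its calls and upper-cases the text.
calls = []

def fake_translate(text):
    calls.append(text)
    return text.upper()

sample = pd.Series(["Chef de projet", None, "professeur", "Chef de projet"])
translated = translate_column(sample, fake_translate)
print(translated)
```

Here fake_translate is called twice even though there are three non-missing rows, and the missing row is never sent to the translator.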
