Python pandas: create a new column of English values by translating Traditional Chinese values stored in another column



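I have a column "City_trad_chinese" in a pandas DataFrame "df" that contains values in Traditional Chinese. I need to create another column, "City_English", that holds the English translations of those values.

How can I do this in Python? I tried the following: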

#importing required libraries
import pandas as pd 
from os import path
from googletrans import Translator
#setting path to data
path2data = 'C:/Users/data'
# data import
df = pd.read_excel(path.join(path2data, 'data.xlsx'), converters={'City_trad_chinese':str})

translator = Translator()
df['City_English'] = df['City_trad_chinese'].map(lambda x: translator.translate(x, src="zh-TW", dest="en").text)

But it gives me an error:

raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting value

You can use the googletrans library:

import pandas as pd
from googletrans import Translator
d = {"City_trad_chinese":["香港特别行政区",
"澳门特别行政区",
"北京市",
"上海市"]}
df = pd.DataFrame(data=d)
translator = Translator()
df["City_English"] = df["City_trad_chinese"].map(lambda x: translator.translate(x, src="zh-TW", dest="en").text)

print(df["City_English"])
0    Hong Kong Special Administrative Region
1        Macao Special Administrative Region
2                               Beijing City
3                              Shanghai City

Note: the Google Translate API has a 15k character limit. You can work around it by translating each row separately:

df["City_English"] = ""
for index, row in df.iterrows():
    translator = Translator()
    eng_text = translator.translate(row["City_trad_chinese"], src="zh-TW", dest="en").text
    # Write through df.at: assigning to row only changes a copy, not df
    df.at[index, "City_English"] = eng_text
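If the column repeats the same city many times, one way to cut down on requests is to translate each distinct value once and then map the results back onto the column. The sketch below is only an illustration of that idea: `build_translation_map`, its `retries` and `delay` parameters, and the stand-in dictionary translator are hypothetical names introduced here, not part of googletrans or pandas. The retry loop is a generic precaution against intermittent failures and is not a confirmed fix for the JSONDecodeError in the question.

```python
import time


def build_translation_map(values, translate_fn, retries=3, delay=1.0):
    """Translate each distinct string once; return {original: translation}.

    translate_fn is any callable taking a string and returning a string
    (hypothetical helper, not a googletrans or pandas API).
    """
    mapping = {}
    # dict.fromkeys keeps the first occurrence of each value, in order
    for value in dict.fromkeys(values):
        if not isinstance(value, str):
            continue  # skip None / NaN cells
        for attempt in range(retries):
            try:
                mapping[value] = translate_fn(value)
                break
            except Exception:
                # Re-raise after the last attempt, otherwise wait and retry
                if attempt == retries - 1:
                    raise
                time.sleep(delay)
    return mapping


# Stand-in translator so the sketch runs without network access
demo_translations = {"北京市": "Beijing City", "上海市": "Shanghai City"}
mapping = build_translation_map(
    ["北京市", "上海市", "北京市"], demo_translations.__getitem__, delay=0
)
print(mapping)
```

With googletrans, `translate_fn` could be `lambda s: translator.translate(s, src="zh-TW", dest="en").text`, and the result could then be applied with `df["City_English"] = df["City_trad_chinese"].map(mapping)`. Values missing from the mapping come out as NaN.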
