I am trying to translate a yml file using the googletrans API. This is my code:
#Import
from googletrans import Translator
import re
# API
translator = Translator()
# Counter
counter_DoNotTranslate = 0
counter_Translate = 0
#Translater
with open("ValuesfileNotTranslatedTest.yml") as a_file: #Values file not translated
    for object in a_file:
        stripped_object = object.rstrip()
        found = False
        file = open("ValuesfileTranslated.yml", "a") #Translated file
        if "# Do not translate" in stripped_object: #Dont translate lines with "#"
            counter_DoNotTranslate += 1
            file.writelines(stripped_object + "\n")
        else: #Translates english to dutch and appends
            counter_Translate += 1
            results = translator.translate(stripped_object, src='en', dest='nl')
            translatedText = results.text
            file.writelines(re.split('|=', translatedText, maxsplit=1)[-1].strip() + "\n" )
#Print
print("# Do not translate found: " + str(counter_DoNotTranslate))
print("Words translated: " + str(counter_Translate))
This is the yml file I want to translate:
'Enter a section title'
'Enter a description of the section. This will also be shown on the course details page'
'Title'
'Description'
'Start date'
'End date'
Published
Section is optional
Close discussions?
'Enter a title'
But when I try to run the code, I get the following error:
File "/Users/AndreB/Library/Python/3.9/lib/python/site-packages/googletrans/client.py", line 219, in translate
parsed = json.loads(data[0][2])
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 339, in loads
raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType
I think the problem is that the yml file contains blank and whitespace-only lines, so I tried adding
if stripped_object is None: #This would skip the lines in the yaml file where there are whitespaces
    file.writelines(stripped_object + "\n")
to the code. But I still get the same error message.
Does anyone know how I can fix this?
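The code you posted has several problems. The error is indeed most likely caused by blank lines in the yml file, but your test for them is incorrect: a string read from a file is never None, not even when it is empty.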
"" is None # False
" " is None # also False
not "" # True
not " " # False
not " ".strip() # True
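So the correct way to test for a line consisting of zero or more whitespace characters is to check the truthiness of line.strip(). In that case your guard would be: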
if not line.strip():
    out.write("\n")
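As a small self-contained illustration of that guard, the sketch below filters a few sample lines before they would be sent to the translator. The helper name is_blank and the sample lines are made up for this example; they are not part of your code or of googletrans.

```python
def is_blank(line: str) -> bool:
    # An empty string is falsy, so a whitespace-only line counts as blank
    # once it has been stripped.
    return not line.strip()


# Hypothetical sample input: two real entries separated by blank lines.
sample = ["'Title'\n", "\n", "   \n", "Published\n"]

# Only non-blank lines would be passed on to the translator.
to_translate = [line.strip() for line in sample if not is_blank(line)]
print(to_translate)
```

Note that the test "line is None" would let all four sample lines through, because none of them is None.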
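Which brings us to the other problems with this code:
- Your variable names shadow built-in names (object, and file, which was a built-in in Python 2)
- Although you correctly use a context manager in the first case, you open the output file again for every line of the input file (and never close it)
- Your variable names mix conventions (snake_case and camelCase)
Here is a draft of a function that avoids these problems: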
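from pathlib import Path
from typing import Tuple, Union
from googletrans import Translator
translator = Translator()
def translate_file(infn: Union[str, Path], outfn: Union[str, Path], src="en", dest="nl") -> Tuple[int, int]:
    inpath = Path(infn)
    outpath = Path(outfn)
    translated = 0
    skipped = 0
    # Open both files once; the context manager closes them on exit.
    with inpath.open() as inf, outpath.open("w") as outf:
        for line in inf:
            if not line.strip():
                # Blank or whitespace-only line: write it out untranslated.
                outf.write("\n")
            elif "# Do not translate" in line:
                outf.write(line)
                skipped += 1
            else:
                # As in your code, the translated string is in the .text attribute.
                result = translator.translate(line.strip(), src=src, dest=dest)
                outf.write(result.text + "\n")
                translated += 1
    return translated, skipped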
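There are no doubt other things you will want to do, and I don't understand your code that processes the translator.translate() response (no doubt because I have never used that library).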
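Note that if you really want to translate real yml, you would be better off parsing it first, then translating the parts of the tree that need translating, and then dumping it back to disk. Working line by line will break sooner or later, because valid yml cannot always be processed one line at a time: a single value can span several lines.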
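A rough sketch of the "translate the tree" step might look like the following. It assumes the file has already been parsed into nested dicts and lists (with PyYAML that would be yaml.safe_load, and yaml.safe_dump to write the result back; using PyYAML is my assumption, not something from your question). The function name translate_tree, the sample data, and the stand-in translator are all hypothetical; in real use you would pass something like lambda s: translator.translate(s, src="en", dest="nl").text.

```python
from typing import Any, Callable


def translate_tree(node: Any, translate: Callable[[str], str]) -> Any:
    """Return a copy of node with every non-blank string value translated.

    Dictionary keys are left untouched; only values are passed to translate.
    """
    if isinstance(node, dict):
        return {key: translate_tree(value, translate) for key, value in node.items()}
    if isinstance(node, list):
        return [translate_tree(item, translate) for item in node]
    if isinstance(node, str):
        # Blank strings are returned as-is instead of being sent for translation.
        return translate(node) if node.strip() else node
    return node  # numbers, booleans and None pass through unchanged


# Hypothetical data, shaped like the result of parsing a yml file.
parsed = {"section": {"title": "Title", "optional": True, "hints": ["Start date", ""]}}

# Stand-in for the real translator so the example runs offline.
fake_translate = str.upper

result = translate_tree(parsed, fake_translate)
print(result)
```

Because the translator is passed in as a plain callable, the tree walk can be tested without network access, and lines marked "# Do not translate" could be handled by a separate rule on keys or values rather than by string matching on raw lines.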