Python: Replace escaped quotes with another character



I have JSON that contains HTML, and I need to make it parseable. Pandas cannot import this JSON.

text = """[{
"article_id": 3540349,
"site_id": 1563,
"domain": "https://ear.rt.hm",
"code": "wta-jurmala-benara-u-ctrtl",
"uri": "https://ar.rl.hq/spormala-berera-u-cetinalu/",
"content_type": {
"id": 1,
"name": "article"
},
"article_type": {
"id": 1,
"name": "article"
},
"created": "2019-07-25 23:58:20",
"modified": "2019-07-25 23:59:19",
"publish_date": "2019-07-25 23:58:00",
"active": true,
"author": "<a href=\"https://spt02.com\" target=\"_blank\">I Kapri</a>"
}]"""
text = text.replace('"', "'")

The result is (never mind the differences in the text):

'author': '<a href='https://spo.hq' target='_blank'>Iv</a>'

When I try to replace \" instead, I get:

"author": "<a href="https://spr.hq" target="_blank">Ilari</a>"

That, again, is not what I want.

Does anyone know how to escape this correctly?

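The problem is that the backslashes are being treated as escapes when they should not be: in a regular string literal, Python turns \" into a plain ", so the JSON parser never sees the escaping. Use a raw string by adding r in front of the opening quotes.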

import json
text = r"""[{
"article_id": 35449,
"site_id": 153,
"domain": "https://ezt.hq",
"code": "wta-jurrda-pe-cetlu",
"uri": "https://ezl.hr/s0349/wla-balu/",
"content_type": {
"id": 1,
"name": "article"
},
"article_type": {
"id": 1,
"name": "article"
},
"created": "2019-07-25 23:58:20",
"modified": "2019-07-25 23:59:19",
"publish_date": "2019-07-25 23:58:00",
"active": true,
"author": "<a href=\"https://spr2.hr\" target=\"_blank\">Iari</a>"
}]"""
obj = json.loads(text)
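
To see why the r prefix matters, here is a minimal sketch that parses the same literal twice, once as a regular string and once as a raw string. The URL and the names regular, raw, regular_ok and parsed are made up for illustration and do not come from the question.

```python
import json

# Regular string: Python consumes each backslash, so the inner quotes
# arrive at the JSON parser unescaped and end the string value early.
regular = """{"author": "<a href=\"https://example.invalid\">X</a>"}"""

# Raw string: the backslashes are kept, so the JSON parser sees \" and
# treats it as an escaped quote inside the string value.
raw = r"""{"author": "<a href=\"https://example.invalid\">X</a>"}"""

try:
    json.loads(regular)
    regular_ok = True
except json.JSONDecodeError:
    regular_ok = False

parsed = json.loads(raw)

print(regular_ok)        # False
print(parsed["author"])  # <a href="https://example.invalid">X</a>
```

The two literals differ only in the r prefix, which is enough to decide whether json.loads succeeds.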

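If you read the text from a txt file instead, replace text = r"""...""" with text = open(file_name).read(). Text read from a file is not processed for escape sequences, so the backslashes are kept as they are.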
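
A possible sketch of the file-based variant follows. It writes a small sample to a temporary file first so that it runs on its own. The sample content, the URL, and the names sample, records and swapped are assumptions for illustration. The last step shows one way to do what the title asks, swapping each escaped quote for a single quote before parsing. That simple replace assumes the data never contains an escaped backslash directly before a quote (\\").

```python
import json
import os
import tempfile

# Hypothetical file content: the backslashes are literal characters on disk.
sample = r'[{"author": "<a href=\"https://example.invalid\" target=\"_blank\">X</a>"}]'

# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
    file_name = tmp.name

# Reading a file does not interpret escape sequences, so \" survives.
with open(file_name, encoding="utf-8") as fh:
    text = fh.read()
os.remove(file_name)

records = json.loads(text)
print(records[0]["author"])

# Optional: replace each escaped quote (\") with a single quote, then parse.
swapped = json.loads(text.replace('\\"', "'"))
print(swapped[0]["author"])
```

Using a with block closes the file handle promptly, which the one-line open(file_name).read() form leaves to the interpreter.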
