Converting a string that looks like a list [of dictionaries] into actual Python list [of dictionaries] data



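I'm trying to convert this string data, which looks a lot like a list containing dictionaries, into actual data of that type, or into any other form I can retrieve information from. Split across multiple lines for readability, the string looks like this.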

[
    {"type": "account", "data": "{\bid\:8,\acc_num\:135}"},
    {"type": "card", "data": "{\card_num\:142}"}
]

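I have tried the usual json.loads(a, strict=False) and json.loads(a), but they fail with the error below. I expected the "{\bid\:8,\acc_num\:135}" part to be taken as a single string (the value of the key data), but maybe that isn't what happens... I thought the \ characters in the string might be causing the problem, but a = a.replace('\','') doesn't work either (it raises SyntaxError: EOL while scanning string literal).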

Traceback (most recent call last):
  File "a.py", line 55, in <module>
    a=json.loads(a)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 32 (char 31)
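The failure can be reproduced in isolation. The snippet below is only a sketch: it assumes the string held in `a` contains a real backspace character (which is what `\b` turns into inside a non-raw Python literal), and `try_loads` is a hypothetical helper, not part of the `json` module.

```python
import json

# Assumption: "\b" in a non-raw literal is a real backspace (0x08) character;
# "\\:" is a backslash followed by a colon, which JSON does not accept either.
a = '[{"type": "account", "data": "{\bid\\:8}"}]'


def try_loads(text, **kwargs):
    """Return (message, position) of the JSONDecodeError, or None on success."""
    try:
        json.loads(text, **kwargs)
    except json.JSONDecodeError as exc:
        return exc.msg, exc.pos
    return None


strict_error = try_loads(a)                  # rejects the raw control character
lenient_error = try_loads(a, strict=False)   # gets further, then rejects "\:"

print(strict_error)
print(lenient_error)
```

With strict=False the control character is tolerated, but the parser then stops at the `\:` sequence, so neither call can succeed on this input.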

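As others have mentioned, you can use ast.literal_eval() for this, but you need to call it twice, as shown below. The code uses a dictionary to "unescape" any of the standard string backslash escape characters it encounters (and no others).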

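The core idea is a two-step process:

  1. First convert each character that was produced by an escape sequence back into its original unescaped form (the one containing a \ character)
  2. Then replace the \ character in it with a quote (") character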

For example, the character '\x08' (a backspace) first becomes '\b' and then '"b'.
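To see the two steps on a single value, here is a minimal sketch. The name `to_quoted` and the reduced three-entry mapping are assumptions made for illustration; they cover only the characters that occur in this one value.

```python
from ast import literal_eval

# What a first literal_eval() pass leaves in the "data" field: a real
# backspace (\x08), a real bell (\x07) and two literal backslashes.
value = '{\x08id\\:8,\x07cc_num\\:135}'

# Hypothetical, reduced mapping: each character goes straight to the text it
# stood for, with the backslash already swapped for a double quote.
to_quoted = {'\x08': '"b', '\x07': '"a', '\\': '"'}

repaired = ''.join(to_quoted.get(ch, ch) for ch in value)
print(repaired)                # {"bid":8,"acc_num":135}
print(literal_eval(repaired))  # {'bid': 8, 'acc_num': 135}
```

Every other character passes through unchanged because `dict.get` falls back to the character itself.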

from ast import literal_eval

# Maps each character to the text it stood for, with the leading backslash
# already replaced by a double quote.
translate = {
    '\\': r'"',   # Backslash (\)
    '\'': r"'",   # Single quote (')
    '\"': r'"',   # Double quote (")
    '\a': r'"a',  # ASCII Bell (BEL)
    '\b': r'"b',  # ASCII Backspace (BS)
    '\f': r'"f',  # ASCII Formfeed (FF)
    '\n': r'"n',  # ASCII Linefeed (LF)
    '\r': r'"r',  # ASCII Carriage Return (CR)
    '\t': r'"t',  # ASCII Horizontal Tab (TAB)
    '\v': r'"v',  # ASCII Vertical Tab (VT)
}.get  # Function to translate escaped characters back to their original form.


def parse(data):
    def unescaped(s):
        return ''.join(translate(ch, ch) for ch in s)

    result = []
    for d in literal_eval(data):  # First call.
        for key, value in d.items():
            try:
                d[key] = literal_eval(unescaped(value))  # Second call.
            except ValueError:
                pass  # Plain strings such as "account" are left unchanged.
        result.append(d)
    return result


if __name__ == '__main__':
    from textwrap import dedent
    from pprint import pprint

    data = dedent("""
        [
            {"type": "account", "data": "{\bid\:8,\acc_num\:135}"},
            {"type": "card", "data": "{\card_num\:142}"}
        ]
    """)
    pprint(parse(data))

Output:

[{'data': {'acc_num': 135, 'bid': 8}, 'type': 'account'},
 {'data': {'card_num': 142}, 'type': 'card'}]

You can use ast.literal_eval():

>>> s = """[
...     {"type": "account", "data": "{\bid\:8,\acc_num\:135}"},
...     {"type": "card", "data": "{\card_num\:142}"}
... ]"""
>>> import ast
>>> x = ast.literal_eval(s)
>>> x
[{'type': 'account', 'data': '{\x08id\\:8,\x07cc_num\\:135}'}, {'type': 'card', 'data': '{\\card_num\\:142}'}]
>>> x[0]
{'type': 'account', 'data': '{\x08id\\:8,\x07cc_num\\:135}'}
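Note that each data value is still a str at this point, not a dict. If the text is available with its backslashes intact (for example read from a file, which is an assumption and not something the question states), one possible alternative is to rewrite every backslash as an escaped quote and let json do both passes. This is a sketch of a different technique from the one shown above, and it only works when backslashes appear solely as stand-ins for quotes.

```python
import json

# Assumption: raw text with the backslashes intact (note the r prefix).
raw = r'''[
    {"type": "account", "data": "{\bid\:8,\acc_num\:135}"},
    {"type": "card", "data": "{\card_num\:142}"}
]'''

# Turn every backslash into \" so that each inner value becomes a valid JSON
# string whose content is itself a JSON object.
escaped = raw.replace('\\', '\\"')

records = json.loads(escaped)                    # first pass: outer list
for record in records:
    record["data"] = json.loads(record["data"])  # second pass: inner object

print(records)
```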
