How can I use json.dumps() to output a newline without a backslash?



In Python 3:

>>> import json
>>> json.dumps('\n\x0a', ensure_ascii=False)
'"\\n\\n"'

The output has 6 characters. Why isn't it "\n\n"? Is it possible to get that without adding overhead (ensure_ascii=False)?

If it helps, this seems to get rid of those backslashes:

>>> json.dumps('\n\x0a').encode('utf-8').decode('unicode_escape')
'"\n\n"'

Reference: https://docs.python.org/3/library/codecs.html (Text Encodings)
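A caveat worth checking before relying on this: the `unicode_escape` codec treats any non-escape byte as Latin-1, so the trick appears to be safe only while `json.dumps()` keeps its default `ensure_ascii=True`. The sketch below wraps the one-liner in a helper named `unescape` (a hypothetical name, not part of the `json` module) to show the difference.

```python
import json

def unescape(json_text):
    # Hypothetical helper: the one-liner from this answer, wrapped in a function.
    return json_text.encode('utf-8').decode('unicode_escape')

# Newlines come back as real line feeds.
print(unescape(json.dumps('a\nb')) == '"a\nb"')   # True

# With ensure_ascii=False the UTF-8 bytes of 'é' are misread as Latin-1.
mangled = unescape(json.dumps('é', ensure_ascii=False))
print(ascii(mangled))                              # '"\xc3\xa9"'

# With the default ensure_ascii=True the \u00e9 escape decodes correctly.
print(unescape(json.dumps('é')) == '"é"')          # True
```

Note that the result is no longer JSON: escaped quotes and backslashes are unescaped as well.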

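@navneethc has already shown how to remove the extra backslashes; unfortunately I can't comment yet because I don't have enough reputation.

But I would like to answer your question:

Why isn't it '"\n\n"'?

I believe the extra backslash is there to keep the Python interpreter from reading the sequence as an actual newline character.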
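For what it's worth, part of what you see at the prompt is only display: the interactive interpreter echoes the `repr()` of the string, which doubles every backslash. A small sketch (the variable name `encoded` is mine) that separates the actual characters from their displayed form:

```python
import json

encoded = json.dumps('\n\x0a')  # '\x0a' is the same character as '\n'

# The JSON text itself: quote, backslash, n, backslash, n, quote.
print(len(encoded))     # 6
print(list(encoded))    # ['"', '\\', 'n', '\\', 'n', '"']

# repr() is what the interactive prompt echoes; it doubles each backslash.
print(repr(encoded))    # '"\\n\\n"'

# print() writes the characters as they are.
print(encoded)          # "\n\n"
```

So the string holds two backslashes, not four; the other two exist only in the echoed representation.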

The output has 6 characters. Why isn't it "\n\n"?

Because then it would not be valid JSON. See: https://www.json.org/json-en.html
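You can check this with the standard parser. A minimal sketch (variable names are mine): `json.loads()` rejects a string literal containing raw line feeds unless you pass `strict=False`, which allows control characters inside strings.

```python
import json

raw = '"\n\n"'  # quotes around two literal line feeds, no backslashes

try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

print(strict_ok)                 # False: raw control characters are not valid JSON

# strict=False makes the decoder accept control characters inside strings.
lenient = json.loads(raw, strict=False)
print(lenient == '\n\n')         # True

# The escaped form produced by json.dumps() round-trips without any options.
roundtrip = json.loads(json.dumps('\n\n'))
print(roundtrip == '\n\n')       # True
```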

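Of course, if you really need it, you can obtain a non-standard quasi-JSON string in which, for example, every character except " and \ is left unescaped (keeping those two escaped is essential for the string literals to stay parsable in any reasonable way):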
>>> import json
>>> def show(obj):  # (a small helper)
...     print(ascii(obj))
... 
>>> # space-separated: NULL, LINEFEED, 0x1F, `"`, `\`, `↑`-arrow, `🙏`-emoji, lone surrogate
>>> difficult_to_handle = '\0 \n \x1f " \\ \u2191 \U0001f64f \udcdd'
>>> show(difficult_to_handle)
'\x00 \n \x1f " \\ \u2191 \U0001f64f \udcdd'
>>> # to be able after json.dumps() to apply a fast '\'-unescaping
>>> # without making the string literals completely unparsable
>>> # we need to "pre-escape" all '\' and '"' characters in
>>> # any strings our data include, *before* json.dumps():
>>> pre_escaped = difficult_to_handle.translate({34: '\\"', 92: '\\\\'})
>>> show(pre_escaped)
'\x00 \n \x1f \\" \\\\ \u2191 \U0001f64f \udcdd'
>>> json_ascii_pre_escaped = json.dumps(pre_escaped)
>>> show(json_ascii_pre_escaped)
'"\\u0000 \\n \\u001f \\\\\\" \\\\\\\\ \\u2191 \\ud83d\\ude4f \\udcdd"'
>>> # unescape all \-escaped stuff:
>>> quasi_json_with_surrogates = (json_ascii_pre_escaped
...                               .encode('ascii')
...                               .decode('unicode_escape'))
>>> show(quasi_json_with_surrogates)
'"\x00 \n \x1f \\" \\\\ \u2191 \ud83d\ude4f \udcdd"'
>>> # convert surrogate pairs to proper code
>>> # points, but keep lone surrogates intact:
>>> quasi_json = (quasi_json_with_surrogates
...               .encode('utf-16', 'surrogatepass')
...               .decode('utf-16', 'surrogatepass'))
>>> show(quasi_json)
'"\x00 \n \x1f \\" \\\\ \u2191 \U0001f64f \udcdd"'
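
If you need this more than once, the same steps can be folded into one function. This is only a sketch for a single string value; `quasi_json_string` is a name I made up, and nested data would need the pre-escaping applied to every string it contains.

```python
import json

def quasi_json_string(text):
    # Hypothetical helper: the steps of the session above, for one string.
    # 1. Pre-escape '"' (34) and '\' (92) so they survive the unescaping step.
    pre_escaped = text.translate({34: '\\"', 92: '\\\\'})
    # 2. Serialize to pure-ASCII JSON, then undo all backslash escapes.
    unescaped = (json.dumps(pre_escaped)
                 .encode('ascii')
                 .decode('unicode_escape'))
    # 3. Merge surrogate pairs into real code points, keep lone surrogates.
    return (unescaped
            .encode('utf-16', 'surrogatepass')
            .decode('utf-16', 'surrogatepass'))

result = quasi_json_string('a\nb "q" \\ \U0001f64f')
print(ascii(result))  # '"a\nb \\"q\\" \\\\ \U0001f64f"'
```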
