
Printing strings with UTF-8 encoded characters, e.g.: "\u00c5\u009b"

Updated: 2023-09-15

I want to print strings encoded like this: "Cze\u00c5\u009b\u00c4\u0087", but I don't know how. The example string should be printed as: "Cześć".

What I have tried:

text = "Cze\u00c5\u009b\u00c4\u0087"
print(text)
# gives: CzeÅÄ
str_bytes = text.encode("unicode_escape")
print(str_bytes)
# gives: b'Cze\\xc5\\x9b\\xc4\\x87'
text = str_bytes.decode("utf8")
print(text)
# gives: Cze\xc5\x9b\xc4\x87

whereas

print(b"Cze\xc5\x9b\xc4\x87".decode("utf8"))

gives "Cześć", but I don't know how to turn the "Cze\xc5\x9b\xc4\x87" string into the b"Cze\xc5\x9b\xc4\x87" bytes.

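I also know that the problem is the extra backslashes that appear in the bytes representation after encoding the base string with the "unicode_escape" parameter, but I don't know how to get rid of them: str_bytes.replace(b'\\\\', b'\\') does not work.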

Use raw_unicode_escape:

text = 'Cze\u00c5\u009b\u00c4\u0087'
text_bytes = text.encode('raw_unicode_escape')
print(text_bytes.decode('utf8')) # outputs Cześć
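A short sketch of why this works may help. As far as I can tell, "unicode_escape" writes each non-ASCII code point out as literal escape text (a real backslash, "x", and two hex digits), so there are no doubled backslashes to strip; "raw_unicode_escape" instead emits one byte per code point below U+0100, which for such strings is the same result as encoding with "latin-1". The helper name fix_mojibake below is hypothetical and not part of the original answer; it assumes every character in the input is at or below U+00FF.

```python
def fix_mojibake(text: str) -> str:
    """Hypothetical helper: reinterpret code points U+0000..U+00FF as UTF-8 bytes."""
    # latin-1 maps each code point below U+0100 to the byte with the same value.
    # It raises UnicodeEncodeError if the string contains anything above U+00FF.
    return text.encode("latin-1").decode("utf-8")


text = "Cze\u00c5\u009b\u00c4\u0087"

# unicode_escape: each non-ASCII code point becomes 4 ASCII bytes (\, x, hex, hex).
escaped = text.encode("unicode_escape")
# raw_unicode_escape: each code point below U+0100 becomes a single byte.
raw = text.encode("raw_unicode_escape")

print(len(escaped))                    # 19 (3 ASCII letters + 4 * 4 escape bytes)
print(len(raw))                        # 7  (3 ASCII letters + 4 single bytes)
print(raw == text.encode("latin-1"))   # True
fixed = fix_mojibake(text)
print(fixed)                           # Cześć
```

For strings that may also contain characters above U+00FF, "raw_unicode_escape" would emit \uXXXX escape text for them rather than raising, so which codec is preferable depends on whether a silent fallback or an explicit error is wanted.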
