Python: decode text to ASCII



How do I decode a unicode string like this:

what%2527s%2bthe%2btime%252c%2bnow%253f

into ASCII like this:

what's+the+time,+now?

In your example the string has been percent-encoded twice, so we need to unquote it twice to get the original text back:

In [1]: import urllib
In [2]: urllib.unquote(urllib.unquote("what%2527s%2bthe%2btime%252c%2bnow%253f"))
Out[2]: "what's+the+time,+now?"
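
The session above uses the Python 2 `urllib` module. As a minimal sketch for Python 3, assuming the same input string, the equivalent function is `urllib.parse.unquote`. The helper `unquote_fully` and its `max_rounds` parameter are hypothetical names introduced here for illustration; they are not part of the standard library.

```python
from urllib.parse import unquote


def unquote_fully(text, max_rounds=5):
    """Unquote repeatedly until the text stops changing (hypothetical helper)."""
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            # Nothing left to decode.
            break
        text = decoded
    return text


encoded = "what%2527s%2bthe%2btime%252c%2bnow%253f"

once = unquote(encoded)   # first pass turns %25 into a literal '%'
twice = unquote(once)     # second pass decodes %27, %2c and %3f

print(once)   # what%27s+the+time%2c+now%3f
print(twice)  # what's+the+time,+now?
```

Unquoting until the text stops changing is convenient when the number of encoding passes is unknown, but it will over-decode text that legitimately contains a percent sign followed by two hex digits, so calling `unquote` exactly twice is the safer choice when the double encoding is known.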

Like this?

title = u"what%2527s%2bthe%2btime%252c%2bnow%253f"
print title.encode('ascii','ignore')

Also, take a look at this.

You can convert the % (hex) escape sequences like this:

import re

def my_decode(s):
    # Replace each '%' followed by 2 to 4 hex digits with the matching code point.
    return re.sub('%([0-9a-fA-F]{2,4})', lambda x: unichr(int(x.group(1), 16)), s)

s = u'what%2527s%2bthe%2btime%252c%2bnow%253f'
print my_decode(s)

The result is the unicode string:

u'what\u2527s+the+time\u252c+now\u253f'

I'm not sure how you would know to convert \u2527 to a single quote, or to drop the \u253f and \u252c characters when converting to ASCII.
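
One possible explanation for the stray characters: the pattern `{2,4}` lets the regex consume up to four hex digits, so `%2527` is read as the single code point U+2527 instead of `%25` followed by `27`. The sketch below, written for Python 3, assumes the input is double percent-encoded ASCII and matches exactly two hex digits per escape. `percent_decode_once` is a hypothetical name chosen for this example.

```python
import re


def percent_decode_once(text):
    """Decode one layer of %XX escapes, two hex digits each (hypothetical helper)."""
    return re.sub(
        r"%([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


s = "what%2527s%2bthe%2btime%252c%2bnow%253f"

first = percent_decode_once(s)       # %25 becomes '%', leaving %27, %2c, %3f
second = percent_decode_once(first)  # the remaining escapes are decoded

print(first)   # what%27s+the+time%2c+now%3f
print(second)  # what's+the+time,+now?
```

This maps each escape to a single code point, which is only correct for ASCII escapes like the ones here. Escapes that encode multi-byte UTF-8 sequences need a byte-aware decoder such as `urllib.parse.unquote`.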
