while removing the \u253f and \u252c characters
How do I decode the unicode string:
what%2527s%2bthe%2btime%252c%2bnow%253f
into ASCII:
what's+the+time+now
In your example the string was percent-encoded (quoted) twice, so we need to unquote it twice to get the original back:
In [1]: import urllib
In [2]: urllib.unquote(urllib.unquote("what%2527s%2bthe%2btime%252c%2bnow%253f"))
Out[2]: "what's+the+time,+now?"
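The session above uses the Python 2 location of `unquote`. As a minimal sketch, assuming Python 3 (where the function lives in `urllib.parse`), the same double unquote can be done step by step so the intermediate value is visible. The variable names `encoded`, `once`, and `twice` are only illustrative.

```python
from urllib.parse import unquote

encoded = "what%2527s%2bthe%2btime%252c%2bnow%253f"

# First pass: %25 becomes a literal '%', and %2b becomes '+'.
once = unquote(encoded)

# Second pass: the remaining %27, %2c and %3f become ', ',' and '?'.
twice = unquote(once)

print(once)
print(twice)
```

If the `+` signs are meant to stand for spaces, `urllib.parse.unquote_plus` can be used for the final pass instead.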
Like this?
title = u"what%2527s%2bthe%2btime%252c%2bnow%253f"
print title.encode('ascii','ignore')
Also, take a look at this.
You can convert the % (hex) escape characters like this:
import re
def my_decode(s):
    # Replace each %-escape (2 to 4 hex digits) with the character it names.
    return re.sub('%([0-9a-fA-F]{2,4})', lambda x: unichr(int(x.group(1), 16)), s)
s = u'what%2527s%2bthe%2btime%252c%2bnow%253f'
print my_decode(s)
The result is a unicode string:
u'what\u2527s+the+time\u252c+now\u253f'
I'm not sure how you would know to convert \u2527 to a single quote, or, when converting to ASCII,
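One likely reason for the `\u2527` in that result is the `{2,4}` in the pattern: it greedily takes four hex digits, so `%2527` is read as the single code point U+2527 rather than as `%25` followed by `27`. Below is a rough sketch, assuming Python 3 (`chr` instead of `unichr`), that matches exactly two hex digits and applies the substitution twice. `decode_percent` is a hypothetical helper name, and it only handles escapes that map directly to a single character, not multi-byte UTF-8 sequences.

```python
import re

def decode_percent(s):
    # Replace each %XX (exactly two hex digits) with the character it encodes.
    return re.sub(r'%([0-9a-fA-F]{2})', lambda m: chr(int(m.group(1), 16)), s)

s = 'what%2527s%2bthe%2btime%252c%2bnow%253f'

# First pass turns %25 into '%', leaving %27, %2c and %3f behind.
once = decode_percent(s)

# Second pass decodes those remaining escapes.
twice = decode_percent(once)

print(once)
print(twice)
```

With two-digit matching, the single quote comes out of the second pass on its own, so no special mapping from `\u2527` is needed.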