Finding a Unicode word in a POST response

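I log in to my account automatically with a POST request, and then I want to do a few things.

Now I want to check that the string خطا is not in the login page and, if it is absent, print OK.

But it doesn't work. Why?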

post_data = {'email': email, 'password': password}
post_response = requests.post(url='http://test.come/login', data=post_data)
if post_response.text.find(u'\xd8\xae\xd8\xb7\xd8\xa7') == -1:
    print 'OK'

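You are trying to put UTF-8 bytes into a unicode string. Inside a u'...' literal, each \xNN escape is a code point, not a byte, so the literal never matches the decoded text. Either decode from UTF-8, or test against the actual text: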

>>> '\xd8\xae\xd8\xb7\xd8\xa7'.decode('utf8')
u'\u062e\u0637\u0627'
>>> print '\xd8\xae\xd8\xb7\xd8\xa7'.decode('utf8')
خطا
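
To make the difference concrete, here is a minimal sketch that compares the literal from the question with the decoded text. The variable names are made up for illustration, and the b and u prefixes are used so the snippet should run on both Python 2.7 and Python 3.

```python
# UTF-8 encoding of the word, as a byte string.
utf8_bytes = b'\xd8\xae\xd8\xb7\xd8\xa7'

# The literal from the question: the same escapes inside a unicode literal.
# Each \xNN becomes its own code point (U+00D8, U+00AE, ...), not a byte.
wrong_text = u'\xd8\xae\xd8\xb7\xd8\xa7'

# Decoding the bytes yields the three characters that were actually meant.
right_text = utf8_bytes.decode('utf-8')

print(len(wrong_text))                       # 6
print(len(right_text))                       # 3
print(right_text == u'\u062e\u0637\u0627')   # True
print(wrong_text == right_text)              # False
```

Since the six-character string never occurs in the decoded page, find() always returns -1 and the original check prints OK regardless of what the server sent.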

So use:

if u'\u062e\u0637\u0627' not in post_response.text:

Or, if you have declared a suitable source code encoding:

if u'خطا' not in post_response.text:

or

if '\xd8\xae\xd8\xb7\xd8\xa7'.decode('utf8') not in post_response.text:

Or, if the raw response is also encoded as UTF-8, even:

if '\xd8\xae\xd8\xb7\xd8\xa7' not in post_response.content:
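
The variants above can be wrapped in two small helpers, sketched here without any network access. The names ERROR_MARKER, login_ok_text and login_ok_bytes are hypothetical and not part of requests; the second helper assumes the server really sends UTF-8.

```python
# The word to look for, written with escapes so no source encoding
# declaration is needed.
ERROR_MARKER = u'\u062e\u0637\u0627'


def login_ok_text(body_text):
    """Compare text with text: pass in the decoded body (post_response.text)."""
    return ERROR_MARKER not in body_text


def login_ok_bytes(body_bytes):
    """Compare bytes with bytes: pass in the raw body (post_response.content).

    Only valid when the response is UTF-8 encoded.
    """
    return ERROR_MARKER.encode('utf-8') not in body_bytes
```

With a real response this would be called as login_ok_text(post_response.text) or login_ok_bytes(post_response.content). The point is to keep both sides of the in test the same type, either both text or both bytes.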

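You may want to read up on Python and Unicode. I recommend:

Pragmatic Unicode by Ned Batchelder
The Python Unicode HOWTO
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky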
