UnicodeEncodeError when printing prettified BeautifulSoup output

I'm currently taking a course on Python, and during our unit on Beautiful Soup the instructor used the following code:

import requests, pprint
from bs4 import BeautifulSoup
url = 'https://www.epicurious.com/search/tofu%20chili'
response = requests.get(url)
page_soup = BeautifulSoup(response.content, 'lxml')
print(page_soup.prettify())

When I run this code, I get the following error:

Traceback (most recent call last):
  File "/Users/arocklin/Documents/Python/whiteboard2.py", line 11, in <module>
    print(page_soup)
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 1479: ordinal not in range(128)

I'm wondering why I get this error when the same code works for him, and how to fix it. Thanks!

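Your problem has nothing to do with BeautifulSoup or with parsing HTML. Everything in your code up to and including BeautifulSoup.prettify amounts to obtaining some Unicode string whose content is defined by a web server you do not control.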

You are taking a more or less arbitrary Unicode string and then trying to print it.

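On a system where Python decides that the terminal behind sys.stdout can only handle ASCII-encoded strings, if the web server has (for reasons entirely outside your control) chosen to send you a Unicode character outside the ASCII range, Python cannot encode that character and raises an exception.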
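The failure can be reproduced without a terminal or a web request. The following is a minimal sketch, assuming Python 3; `try_print` is a hypothetical helper written for this illustration. It prints the same string to an in-memory stream twice, once with a UTF-8 encoding and once with ASCII, which stands in for a sys.stdout that Python believes is ASCII-only.

```python
import io


def try_print(text, encoding):
    """Print text to an in-memory text stream that uses the given encoding.

    Returns the encoded bytes on success, or a short message if the
    stream's codec cannot represent the text.
    """
    raw = io.BytesIO()
    # newline="\n" disables newline translation so the bytes are the
    # same on every platform.
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError as exc:
        return "failed: " + exc.reason
    return raw.getvalue()


# Same text, two different stream encodings.
print(try_print("foo\xe9bar", "utf-8"))
print(try_print("foo\xe9bar", "ascii"))
```

The text is identical in both calls; only the encoding attached to the output stream differs, which is why the instructor's machine and yours can behave differently with the same program.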

I suggest you research how your version of Python determines which encoding/codec to use for output on the platform you are running it on.
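As a starting point for that research, here is a small sketch, assuming Python 3, that reports the values Python commonly consults. `describe_output_encoding` is a hypothetical name chosen for this example, and the list of environment variables is not exhaustive.

```python
import locale
import os
import sys


def describe_output_encoding():
    """Collect settings that commonly influence the encoding of sys.stdout."""
    return {
        # The codec print() will actually use (None if stdout was replaced
        # by an object without an encoding attribute).
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # The encoding Python derives from the current locale.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Environment variables that can override or shape the choice.
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LANG": os.environ.get("LANG"),
    }


if __name__ == "__main__":
    for name, value in describe_output_encoding().items():
        print(name, "=", value)
```

If stdout_encoding comes back as an ASCII codec (often shown as ANSI_X3.4-1968 or US-ASCII), that would be consistent with the error in the question. Running the same script from the IDE and from a terminal may give different answers.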

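Then add a test case to your program's test suite that actually verifies it can output Unicode strings correctly. For that test, you can replace your entire program with: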

print(u"foo\xe9bar")
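One way such a test could look is sketched below, assuming Python 3. `stream_can_encode` is a hypothetical helper, and it only checks the stream's declared encoding; it does not account for a stream configured with a lenient error handler such as backslashreplace.

```python
import sys


def stream_can_encode(text, stream=None):
    """Return True if the stream's declared encoding can represent text."""
    stream = sys.stdout if stream is None else stream
    # Fall back to ASCII when the stream does not declare an encoding.
    encoding = getattr(stream, "encoding", None) or "ascii"
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # The message itself is pure ASCII, so it prints on any terminal.
    print("stdout can print non-ASCII text:", stream_can_encode("foo\xe9bar"))
```

If the check reports False, two commonly used workarounds are setting the environment variable PYTHONIOENCODING=utf-8 before starting Python, or calling sys.stdout.reconfigure(encoding="utf-8") at the top of the script on Python 3.7 or later. Whether the terminal then displays the characters correctly still depends on the terminal itself.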
