Creating a text file from an online dictionary's source code - TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str

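I am new to this forum, but I have searched everywhere and cannot find anyone who has tried to build a program like mine.

Basically, I want to (eventually) type in a word in French, attach it to the end of "http://www.wordreference.com/fren/" (WordReference is an online dictionary), and, perhaps by making use of the site's source code, pull out the word's translation and insert it into a text document alongside the original entry.

For instance, entering "Heureux" would produce "Happy", which is the first translation listed on the site. However, I have not got that far yet. I am stuck on the simple part: accessing the source code. I have found that, for each dictionary entry, WordReference starts the source code with "td class='ToWrd'". So my logic was to find the first instance of this in the source code and then add it to a text document.

Unfortunately, using BeautifulSoup in combination with urlopen, I have not managed to get past the first step.

Here is my code: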

import bs4
from bs4 import BeautifulSoup
from urllib.request import urlopen
url = urlopen("http://www.wordreference.com/fren/lame","lxml" )
content = url.read()
soup = BeautifulSoup(content)
links = soup.findAll("td class='ToWrd'")

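I just get: "TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str." The task is complicated, but what advice would you give me? I would really appreciate it. I am new to Python, but I have put a lot of effort into solving this. Thank you very much.

P.S. I am using Python 3.5 via PyCharm on Ubuntu 16.04.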

The urllib documentation recommends using the requests library for HTTP.
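As for the error itself: the second positional parameter of urllib.request.urlopen is data, the request body, so urlopen(url, "lxml") tries to send the string "lxml" as POST data and urllib rejects it. The parser name belongs in the BeautifulSoup call instead. A minimal sketch of that diagnosis using only the standard library (the helper names describe_failure and fetch_page are made up for illustration):

```python
from urllib.request import urlopen

URL = "http://www.wordreference.com/fren/lame"


def describe_failure(url):
    """Repeat the call from the question and return the error message, if any."""
    try:
        # The second positional argument of urlopen() is `data` (the request
        # body), so the string "lxml" is treated as POST data here.
        urlopen(url, "lxml")
    except TypeError as exc:
        return str(exc)
    return None


def fetch_page(url):
    """Fetch the raw page bytes with a plain GET (no data argument)."""
    with urlopen(url) as response:
        return response.read()


if __name__ == "__main__":
    print(describe_failure(URL))
    # The parser name would then go to BeautifulSoup, not to urlopen:
    #     soup = BeautifulSoup(fetch_page(URL), "lxml")
```

So the original code can also be repaired without switching libraries, by dropping "lxml" from the urlopen call and passing it to BeautifulSoup.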

Also look at the syntax for findAll. If you only want to keep the first translation, you will have to narrow the results down further.

from bs4 import BeautifulSoup
import requests
url = "http://www.wordreference.com/fren/lame"
content = requests.get(url).content
soup = BeautifulSoup(content, "html5lib")
links = soup.findAll("td", { "class" : "ToWrd"})
print(links)
# [<td class="ToWrd">Anglais</td>, <td class="ToWrd">blade <em class="tooltip POS2">n<span><i>noun</i>: Refers to person, place, thing, quality, etc. </span></em></td>, ... ]
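To narrow the result down to the first translation and store it next to the original word, something like the following might work. It is only a sketch: SAMPLE_HTML is a made-up stand-in for the page source (the real markup may differ), the helper names first_translation and append_entry are hypothetical, and it assumes the first ToWrd cell is the column header, as in the output above. It uses the built-in "html.parser" so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Made-up sample standing in for the page source; the real markup may differ.
SAMPLE_HTML = """
<table>
  <tr><td class="ToWrd">Anglais</td></tr>
  <tr><td class="ToWrd">blade <em class="tooltip POS2">n<span><i>noun</i>: Refers to person, place, thing, quality, etc. </span></em></td></tr>
</table>
"""


def first_translation(html):
    """Return the text of the first non-header ToWrd cell, or None."""
    soup = BeautifulSoup(html, "html.parser")
    cells = soup.find_all("td", class_="ToWrd")
    # Assumption: the first cell is the column header ("Anglais"), so skip it.
    for cell in cells[1:]:
        # Remove the part-of-speech tooltip so only the word itself remains.
        for em in cell.find_all("em"):
            em.decompose()
        text = cell.get_text(strip=True)
        if text:
            return text
    return None


def append_entry(path, word, translation):
    """Append one 'word<TAB>translation' line to a text file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write("{}\t{}\n".format(word, translation))


if __name__ == "__main__":
    print(first_translation(SAMPLE_HTML))
```

With the live page, requests.get(url).content would be passed to first_translation in place of SAMPLE_HTML, and append_entry would then be called with the French word and the returned translation.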
