Below are the td elements printed inside the loop:
<td>0</td>
<td>0</td>
<td></td>
<td class="left" nowrap="nowrap">v <a class="data-link" href="/ci/con/t/8.html">Sri Lanka</a>
</td>
<td class="left"><a class="data-link" href="/ci/content/gd/58.html">Dambulla</a></td>
<td nowrap="nowrap"><b>1 Aug 2009</b></td>
This is the code I am using:
print("href", td.a['href'])
Error: TypeError: 'NoneType' object is not subscriptable
Update, per the user's request:
from bs4 import BeautifulSoup
data = [
    "<td>0</td>",
    "<td>0</td>",
    "<td></td>",
    '<td class="left" nowrap="nowrap">v <a class="data-link" href="/ci/con/t/8.html">Sri Lanka</a> </td>',
    '<td class="left"><a class="data-link" href="/ci/content/gd/58.html">Dambulla</a></td>',
    '<td nowrap="nowrap"><b>1 Aug 2009</b></td>',
]
for item in data:
    # Parse each fragment on its own and keep only <a> tags that carry an href.
    cell = BeautifulSoup(item, "html.parser")
    for res in cell.find_all("a", href=True):
        print(res.get("href"))
Output:
/ci/con/t/8.html
/ci/content/gd/58.html
Original answer:
from bs4 import BeautifulSoup
html = """
<td>0</td>
<td>0</td>
<td></td>
<td class="left" nowrap="nowrap">v <a class="data-link" href="/ci/con/t/8.html">Sri Lanka</a>
</td>
<td class="left"><a class="data-link" href="/ci/content/gd/58.html">Dambulla</a></td>
<td nowrap="nowrap"><b>1 Aug 2009</b></td>
"""
soup = BeautifulSoup(html, 'html.parser')
for item in soup.find_all("a", href=True):
    print(item.get("href"))
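As for why the error in the question occurs: `td.a` returns `None` when a cell contains no `<a>` tag (as with `<td>0</td>` or `<td></td>`), so `td.a['href']` tries to subscript `None`. If you would rather keep looping over the cells themselves, one option is to check for `None` first. The following is a minimal sketch, not necessarily how the original loop was written; the helper name `collect_hrefs` and the shortened sample HTML are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Shortened sample based on the cells shown in the question.
html = """
<td>0</td>
<td></td>
<td class="left" nowrap="nowrap">v <a class="data-link" href="/ci/con/t/8.html">Sri Lanka</a></td>
<td class="left"><a class="data-link" href="/ci/content/gd/58.html">Dambulla</a></td>
<td nowrap="nowrap"><b>1 Aug 2009</b></td>
"""


def collect_hrefs(cells):
    """Hypothetical helper: href of the first <a> in each cell, skipping cells without one."""
    hrefs = []
    for td in cells:
        link = td.a  # None when the cell has no <a> descendant
        if link is not None and link.has_attr("href"):
            hrefs.append(link["href"])
    return hrefs


soup = BeautifulSoup(html, "html.parser")
hrefs = collect_hrefs(soup.find_all("td"))
for href in hrefs:
    print("href", href)
```

Cells without a link are skipped silently here. If you need to know which cells had no link, you could append `None` for them instead of skipping.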