
Web Scraping <td> tag issue - Python 3 with lxml

Keywords: python, xpath, web-scraping, lxml
Updated: 2023-09-10

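I am working on web scraping in Python using the lxml library, and I am trying to scrape some data from a baseball website: http://mlb.mlb.com/mlb/standings/exhibition.jsp?ymd=20161002. For some reason, my code prints the label I expect followed by an empty list. Any help with this issue would be great!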

from lxml import html
import requests

# Download the standings page and parse it with lxml's HTML parser.
page = requests.get('http://mlb.mlb.com/mlb/standings/exhibition.jsp?ymd=20161002')
tree = html.fromstring(page.content)

# This would create a list of buyers (disabled):
## buyers = tree.xpath('//div[@title="buyer-name"]/text()')

# Collect the text of every <td class="tg_w"> cell, expected to hold the win counts.
wins = tree.xpath('//td[@class="tg_w"]/text()')
print("Wins: ", wins)
print()
## print("Buyers: ", buyers)

HTML != XML. Some HTML5 tags can confuse an XML parser.
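To illustrate the point, here is a minimal sketch of how the same markup can be fine as HTML yet rejected by an XML parser. The snippet and the helper name parses_as_xml are made up for this example and are not taken from the MLB page.

```python
from lxml import etree, html

# Hypothetical snippet: valid HTML, but not well-formed XML,
# because the <br> tag is never closed.
SNIPPET = '<div><p>Wins<br>10</p></div>'


def parses_as_xml(markup):
    """Return True if lxml's strict XML parser accepts the markup."""
    try:
        etree.fromstring(markup)
        return True
    except etree.XMLSyntaxError:
        return False


xml_ok = parses_as_xml(SNIPPET)

# lxml's HTML parser treats <br> as a void element and keeps both text nodes.
paragraph_text = html.fromstring(SNIPPET).xpath('//p/text()')

print("Parses as XML:", xml_ok)
print("Text via HTML parser:", paragraph_text)
```

Note that the code in the question already goes through lxml.html, which is lxml's HTML parser, so this only shows the general difference between the two parsing modes.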

Try using BeautifulSoup with the parser set to html5lib.
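A rough sketch of that suggestion follows, assuming the bs4 package is installed and, ideally, html5lib as well. The SAMPLE_HTML string is a stand-in invented for this example (the real page's markup is not shown in the question), and extract_wins is a hypothetical helper name. The fallback to Python's built-in html.parser is an addition for convenience, not part of the original answer.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Stand-in markup (an assumption): the real standings page may be structured differently.
SAMPLE_HTML = """
<table>
  <tr><td class="tg_w">10</td><td class="tg_l">5</td></tr>
  <tr><td class="tg_w">8</td><td class="tg_l">7</td></tr>
</table>
"""


def extract_wins(markup, parser="html5lib"):
    """Return the text of every <td class="tg_w"> cell in the markup."""
    try:
        soup = BeautifulSoup(markup, parser)
    except FeatureNotFound:
        # html5lib is not installed; fall back to the standard-library parser.
        soup = BeautifulSoup(markup, "html.parser")
    return [td.get_text(strip=True) for td in soup.find_all("td", class_="tg_w")]


wins = extract_wins(SAMPLE_HTML)
print("Wins:", wins)
```

For the live page, the same helper could be called as extract_wins(requests.get(url).content). If it still returns an empty list, the cells are probably not present in the HTML that requests receives, and changing the parser will not help.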
