How do I convert div tags into a table?



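I want to extract the table from this website: https://www.rankingthebrands.com/The-Brand-Rankings.aspx?rankingID=37&year=214. Checking the page source, I noticed that the table tags are somehow missing. I assume the table is built from multiple div classes. Is there a simple way to convert this table to Excel/CSV? I have no programming skills or experience... Thanks for your help.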

There are several ways to do this. One of them, in Python, is the following (I believe it is self-explanatory):

import csv

import lxml.html as lh
import requests

url = 'https://www.rankingthebrands.com/The-Brand-Rankings.aspx?rankingID=37&year=214'
req = requests.get(url)
doc = lh.fromstring(req.text)
headers = ['Position', 'Name', 'Brand Value', 'Last']

# Note the 'a' in there, for 'append': rerunning the script adds to
# brands.csv (including another header row) instead of overwriting it.
with open('brands.csv', 'a', newline='') as fp:
    writer = csv.writer(fp)
    writer.writerow(headers)

    # With the headers out of the way, the heavier XPath lifting begins:
    # each row div holds one child div per column.
    for row in doc.xpath('//div[@class="top100row"]'):
        pos = row.xpath('./div[@class="pos"]//text()')[0]
        name = row.xpath('.//div[@class="name"]//text()')[0]
        brand_val = row.xpath('.//div[@class="weighted"]//text()')[0]
        last = row.xpath('.//div[@class="lastyear"]//text()')[0]
        writer.writerow([pos, name, brand_val, last])

The resulting file should at least be close to what you are looking for.
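If installing lxml and requests is not an option, the same idea can probably be sketched with only the Python standard library. This is a rough illustration, not a tested scraper for that site: the class names (`top100row`, `pos`, `name`, `weighted`, `lastyear`) come from the XPath expressions in the code above, while the `SAMPLE` HTML, the `RowParser` class, and the `rows_to_csv` function are made up for the example and only assume that each row div contains one div per column. Unlike the lxml version, it joins all the text inside a cell rather than taking the first text node.

```python
import csv
import io
from html.parser import HTMLParser

# Class names taken from the XPath expressions in the answer.
ROW_CLASS = "top100row"
CELL_CLASSES = ("pos", "name", "weighted", "lastyear")


class RowParser(HTMLParser):
    """Hypothetical helper: collects the cell text of every row div."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, one list of strings per row div
        self._stack = []   # class attribute of every currently open div
        self._row = None   # cell text for the row being read, if any

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        cls = dict(attrs).get("class") or ""
        self._stack.append(cls)
        if cls == ROW_CLASS:
            self._row = {c: "" for c in CELL_CLASSES}

    def handle_endtag(self, tag):
        if tag != "div" or not self._stack:
            return
        cls = self._stack.pop()
        if cls == ROW_CLASS and self._row is not None:
            self.rows.append([self._row[c].strip() for c in CELL_CLASSES])
            self._row = None

    def handle_data(self, data):
        if self._row is None:
            return
        # Credit the text to the innermost open cell div inside this row.
        for cls in reversed(self._stack):
            if cls == ROW_CLASS:
                break
            if cls in CELL_CLASSES:
                self._row[cls] += data
                break


# Made-up fragment that mimics the assumed page structure.
SAMPLE = """
<div class="top100row">
  <div class="pos">1</div>
  <div class="name"><a href="#">Brand A</a></div>
  <div class="weighted">100</div>
  <div class="lastyear">2</div>
</div>
<div class="top100row">
  <div class="pos">2</div>
  <div class="name"><a href="#">Brand B</a></div>
  <div class="weighted">90</div>
  <div class="lastyear">1</div>
</div>
"""


def rows_to_csv(html):
    """Return CSV text with a header line and one line per row div."""
    parser = RowParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Position", "Name", "Brand Value", "Last"])
    writer.writerows(parser.rows)
    return out.getvalue()


print(rows_to_csv(SAMPLE))
```

To try it on the real page, you could save the page from your browser as an HTML file, read that file into a string, pass it to `rows_to_csv`, and write the result to a `.csv` file that Excel can open. Whether the live markup matches the assumed structure would need to be checked against the page source.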
