BeautifulSoup4 and w3lib: Why Are My Results Printing Vertically? How to Format Results in CSV Format?

Here is the code:


while startDate <= endDate:
    try:
        the_year = startDate.strftime('%Y')
        the_month = startDate.strftime('%B')
        the_day = startDate.strftime('%-d')
        url_template = base_url + the_year + "/" + the_month + "/" + the_day + "/"
        url = url_template
        page_two = "?page=2"
        time.sleep(1)
        response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'})
        content = bs(response.content, "html.parser")
        uls = content.find("div", {'class': 'sitemap-column-wrapper'}).findAll("ul", {'class': 'sitemap-list'})
        for ul in uls:
            for li in ul.find_all('li', {'class': 'sitemap-list-item'}):
                for a in li.find('a').text:
                    a = w3lib.html.remove_tags(a)
                    print(str(startDate) + ',"' + a)
        time.sleep(1)
        url = url_template + page_two
        response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'})
        content = bs(response.content, "html.parser")
        uls = content.find("div", {'class': 'sitemap-column-wrapper'}).findAll("ul", {'class': 'sitemap-list'})
        for ul in uls:
            for li in ul.find_all('li', {'class': 'sitemap-list-item'}):
                for a in li.find('a').text:
                    a = w3lib.html.remove_tags(a)
                    print(str(startDate) + ',"' + a)
        time.sleep(1)
        startDate += delta
    except Exception as e:
        print(e)
        break

Here are the results:

2020-01-01,"C
2020-01-01,"i
2020-01-01,"n
2020-01-01,"c
2020-01-01,"i
2020-01-01,"n
2020-01-01,"n

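And so on. What is going on? What I am trying to do is print the date and the title in CSV format: date,"title"

Before I used ".remove_tags", I got a block of HTML containing all of the titles.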
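The one-character-per-line output can be reproduced without any scraping. The sketch below is only an illustration: the helper names and the sample title are hypothetical, and it assumes the value being looped over is an ordinary Python string, which is what the .text attribute returns.

```python
def rows_per_character(date_text, title):
    """Mimic looping over a string: each iteration yields one character."""
    return [date_text + ',"' + ch for ch in title]


def row_per_title(date_text, title):
    """Use the title as a whole string, producing a single row."""
    return date_text + ',"' + title + '"'


if __name__ == "__main__":
    sample_title = "Sample headline"  # hypothetical title, not taken from the site
    for row in rows_per_character("2020-01-01", sample_title)[:3]:
        print(row)
    print(row_per_title("2020-01-01", sample_title))
```

The first helper yields one row per character, matching the vertical output shown; the second yields one row for the whole title.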

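Never mind, I had too many loops. li.find('a').text is already a string, so the extra "for a in ..." loop walked through it one character at a time: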

content = bs(response.content, "html.parser")
uls = content.find("div", {'class': 'sitemap-column-wrapper'}).findAll("ul", {'class': 'sitemap-list'})
for ul in uls:
    for li in ul.find_all('li', {'class': 'sitemap-list-item'}):
        # .text is the whole link text as one string; no inner loop over it
        li = li.find('a').text
        title = w3lib.html.remove_tags(li)
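For the CSV part of the question, one option is to hand each row to Python's standard csv module rather than concatenating quotes by hand. This is a minimal sketch, not the original poster's code: titles_to_csv and the sample titles are hypothetical, and it assumes the titles have already been extracted as plain strings and that the date is a datetime.date.

```python
import csv
import io
from datetime import date


def titles_to_csv(day, titles):
    """Return CSV text with one date,title row per title."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for title in titles:
        # Each title is written whole; csv.writer quotes and escapes when needed.
        writer.writerow([day.isoformat(), title.strip()])
    return buffer.getvalue()


if __name__ == "__main__":
    sample_titles = ["Plain headline", 'Headline, with "quotes"']  # hypothetical
    print(titles_to_csv(date(2020, 1, 1), sample_titles), end="")
```

With the default settings, a field is quoted only when it contains a comma, a quote, or a newline. Passing quoting=csv.QUOTE_ALL to csv.writer would quote every field instead.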
