Beautiful Soup: putting text with many whitespace characters into a DataFrame



I'm looking for a way to complete the following code:

import bs4
import requests
import pandas as pd

url = "https://tradingeconomics.com"
soup = bs4.BeautifulSoup(requests.get(url).content, 'html.parser')
head = soup.find_all('div', {'class': 'col-md-7'})
ls = {
    'headline': [],
    'text': [],
}
for text in head:
    td = text.text
    if len(td) > 1:
        print(len(td))
        print(td)
        #
        ls['headline'].append(td[0])
        ls['text'].append(td[1])
df = pd.DataFrame(ls)

I want the headlines and the text from the website in the DataFrame df, but I can't work out how to get them there.

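This is straightforward: use .find() to narrow the search, as shown below.

Look for the div elements with class col-md-7. The "headline" is in the a tag, and the "article" is in the span with class headlines-description.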

import bs4
import requests
import pandas as pd

url = "https://tradingeconomics.com"
soup = bs4.BeautifulSoup(requests.get(url).content, "html.parser")

ls = {
    "headline": [],
    "text": [],
}
for tag in soup.find_all("div", {"class": "col-md-7"}):
    headline = tag.find("a").get_text(strip=True)
    article = tag.find("span", class_="headlines-description").text
    ls["headline"].append(headline)
    ls["text"].append(article)
df = pd.DataFrame(ls)
print(df)

Output:

headline                                               text
0  Fed Signals Two Hikes by End of 2023  The Fed left the target range for its federal ...
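
If the article text still contains runs of spaces, tabs, or newlines (the answer reads it with .text, which does no stripping), one option is to normalize it before appending. The following is a minimal sketch, not part of the original answer: the helper name collapse_whitespace and the sample string raw are assumptions made up for illustration.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace (spaces, tabs, newlines) with a
    single space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical raw value, shaped like what `.text` may return for a tag
# whose markup is spread over several indented lines.
raw = "\n\n   The Fed left the target range \r\n\t for its federal funds rate unchanged.   \n"
print(collapse_whitespace(raw))
```

In the loop above, this could be applied as article = collapse_whitespace(tag.find("span", class_="headlines-description").text). Note that get_text(strip=True) only trims the ends of each text fragment; it does not collapse whitespace inside a fragment.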
