Getting specific text with BeautifulSoup



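This should be simple, but I can't get it to work. I am trying to scrape data from Reuters. The main problem is that, with BeautifulSoup, I get the entire HTML tag for the row: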
import requests
from bs4 import BeautifulSoup
URL = 'https://www.reuters.com/markets/stocks/europe'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id='__next')
stock_elems = results.find_all('tr', class_='data')
index_elem = stock_elems[0].find('a', class_='TextLabel__text-label___3oCVw TextLabel__black-to-orange___23uc0 TextLabel__medium___t9PWg MarketsTable-name-1U4vs')
print(index_elem)

I would like to get something like this:

FTSE 100

But I only get the whole tag:

<a class="TextLabel__text-label___3oCVw TextLabel__black-to-orange___23uc0 TextLabel__medium___t9PWg MarketsTable-name-1U4vs" href="/quote/.FTSE">FTSE 100 Index</a>

I also tried other text on the page and got the same result:

index_elem = stock_elems[0].find('div', class_='TextLabel__text-label___3oCVw TextLabel__gray___1V4fk TextLabel__regular___2X0ym MarketsTable-subcell-l_NnB')
<div class="TextLabel__text-label___3oCVw TextLabel__gray___1V4fk TextLabel__regular___2X0ym MarketsTable-subcell-l_NnB">.FTSE</div>

Thanks for your help and time.

You need to add .text to get the text value of the tag:
import requests
from bs4 import BeautifulSoup
URL = 'https://www.reuters.com/markets/stocks/europe'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id='__next')
stock_elems = results.find_all('tr', class_='data')
index_elem = stock_elems[0].find('a', class_='TextLabel__text-label___3oCVw TextLabel__black-to-orange___23uc0 TextLabel__medium___t9PWg MarketsTable-name-1U4vs')
print(index_elem.text)

FTSE 100 Index
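
To try the same idea without depending on the live Reuters page, here is a minimal sketch that parses a static snippet. The markup is modeled on the output shown in the question, but the surrounding <table>/<tr>/<td> wrapper and the shortened class lists are assumptions made for illustration. It shows the difference between printing the tag itself and reading its text, and also reads the href attribute.

```python
from bs4 import BeautifulSoup

# Static stand-in for one table row. The <a> and <div> are modeled on the
# question's output; the table wrapper and the shortened class lists are
# assumptions so that no network request is needed.
SAMPLE_HTML = """
<table>
  <tr class="data">
    <td>
      <a class="TextLabel__text-label___3oCVw MarketsTable-name-1U4vs" href="/quote/.FTSE">FTSE 100 Index</a>
      <div class="TextLabel__text-label___3oCVw MarketsTable-subcell-l_NnB">.FTSE</div>
    </td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
row = soup.find("tr", class_="data")

# class_ matches if the tag carries that class among others, so a single
# class name is enough here.
link = row.find("a", class_="MarketsTable-name-1U4vs")
sub = row.find("div", class_="MarketsTable-subcell-l_NnB")

name = link.get_text(strip=True)  # text only, surrounding whitespace removed
href = link["href"]               # attribute access works like a dict lookup
symbol = sub.text                 # .text returns the same string as get_text()

print(link)    # the whole <a ...>...</a> tag, as in the question
print(name)    # FTSE 100 Index
print(href)    # /quote/.FTSE
print(symbol)  # .FTSE
```

Note that find() returns None when nothing matches, so on a real page it is worth checking the result before accessing .text.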
