Python - simple web scrape not pulling anything



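My code visits a web page and is supposed to pull the information from each row, but it pulls blanks.

Expected output: print the title of each row.

Currently it just prints blanks for me.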

import time
import requests
from selenium import webdriver
driver = webdriver.Chrome()
bracket=[]
url='https://www.sabcs.org/Program/Poster-Sessions/Poster-Session-1'
driver.get(url)
time.sleep(3)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
r=requests.get(url)
page_source=r.content

each_field=driver.find_elements_by_xpath(".//tr[@class='normaltext']")
for item in each_field:
    print(item.text)

You need to switch into the <iframe> tag first. Also, I'm just using pandas here to parse the table.
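
As a quick way to see why the original lookup comes back empty, it can help to check whether the outer page embeds its content in a frame. The sketch below uses only the standard library; `IframeFinder`, `list_iframe_sources`, and the sample HTML (including its example.com URL) are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class IframeFinder(HTMLParser):
    """Collects the src attribute of every <iframe> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            # attrs arrives as a list of (name, value) pairs
            self.sources.append(dict(attrs).get("src"))


def list_iframe_sources(html):
    """Return the src of each iframe found in the given HTML string."""
    finder = IframeFinder()
    finder.feed(html)
    return finder.sources


# Hypothetical outer page: no table rows here, only a frame pointing elsewhere.
sample_html = """
<html><body>
  <h1>Poster Session 1</h1>
  <iframe src="https://example.com/embedded-table"></iframe>
</body></html>
"""

iframe_sources = list_iframe_sources(sample_html)
print(iframe_sources)
```

If a frame shows up like this, element lookups on the top-level document will not see the rows inside it, which is why the code that follows switches into the frame before parsing.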

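from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url='https://www.sabcs.org/Program/Poster-Sessions/Poster-Session-1'
driver.get(url)
# The table sits inside the last <iframe> on the page, so switch into it first
# (find_elements_by_xpath was removed in Selenium 4.3; use find_elements with By.XPATH)
driver.switch_to.frame(driver.find_elements(By.XPATH, ".//iframe")[-1])
# Parse the first table in the frame's HTML; StringIO avoids the literal-HTML deprecation in newer pandas
df = pd.read_html(StringIO(driver.page_source))[0]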

Output:

print(df)
                                                     0                                                  1
0                                                  NaN                                                NaN
1    Poster Session 1 – Wednesday, December 8, 2021...  Poster Session 1 – Wednesday, December 8, 2021...
2                                                  NaN                                                NaN
3                                                  NaN                Axillary Staging and Sentinel Nodes
4                                             P1-01-01  Prospective ultrasonographic surveillance stud...
..                                                 ...                                                ...
279                                           P1-24-04  Spatially resolved cell type heterogeneity unc...
280                                           P1-24-05  Breast conserving surgery for non-metastatic i...
281                                           P1-24-06  Risk factor modeled microenvironment effects l...
282                                           P1-24-07  Management trends and outcomes assessment for ...
283                                                NaN                                                NaN
[284 rows x 2 columns]
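
The frame still contains blank rows, the session banner, and section headings, while the question asked only for the title of each row. A minimal sketch of one way to narrow it down follows. It assumes poster rows can be recognised by an ID shaped like `P1-01-01` in column 0, which is inferred from the printed output rather than confirmed. The small `df` below is a stand-in built from rows shown above, and the column names `poster_id` and `title` are invented for readability.

```python
import pandas as pd

# Stand-in for the scraped frame: a handful of rows copied from the printed output.
df = pd.DataFrame({
    0: [
        None,
        "Poster Session 1 – Wednesday, December 8, 2021...",
        None,
        "P1-01-01",
        "P1-24-07",
        None,
    ],
    1: [
        None,
        "Poster Session 1 – Wednesday, December 8, 2021...",
        "Axillary Staging and Sentinel Nodes",
        "Prospective ultrasonographic surveillance stud...",
        "Management trends and outcomes assessment for ...",
        None,
    ],
})

# Keep only rows whose first column looks like a poster ID (assumed pattern);
# na=False treats the NaN rows as non-matches.
is_poster = df[0].str.match(r"P\d+-\d+-\d+", na=False)
posters = df[is_poster].rename(columns={0: "poster_id", 1: "title"})

titles = posters["title"].tolist()
for title in titles:
    print(title)
```

Section headings such as "Axillary Staging and Sentinel Nodes" are dropped by this filter; if they are wanted as well, filtering on `df[1].notna()` instead would keep them.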
