Print table values from a specific web page



I want to extract and print all entries for a specific month from a table.

import os
from webdriver_manager.chrome import ChromeDriverManager
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--start-maximized')
options.page_load_strategy = 'eager'
driver = webdriver.Chrome(options=options)
wait = WebDriverWait(driver, 20)   
driver.get("https://www.sebi.gov.in/sebiweb/home/HomeAction.do?doListing=yes&sid=3&ssid=22&smid=18")
month = "Apr"
year = "2021"

How can I print all the values in the table that match a specific month and year?

You can try the following:

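from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.sebi.gov.in/sebiweb/home/HomeAction.do?doListing=yes&sid=3&ssid=22&smid=18')
month = "Apr"
year = "2021"
# The first cell of each table row holds the date
for row in driver.find_elements(By.XPATH, "//table/tbody/tr/td[1]"):
    if month in row.text and year in row.text:
        # The cell right after the date holds the entry title
        x = row.find_element(By.XPATH, "./following-sibling::td")
        print(row.text, " ", x.text)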

This prints:

Apr 29, 2021   Rane Brake Lining Ltd. - Post Buyback Public Announcement
Apr 06, 2021   Insecticides (India) Limited - Public Announcement
Apr 06, 2021   Jagran Prakashan Limited - Filing of Public Announcement
Apr 05, 2021   Sreeleathers Limited - Post Buyback Public Announcement

Of course, this only gets the results on the first page; if you want more, you will need to handle pagination.
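If the substring check on month and year feels too loose, a slightly stricter alternative is to parse the date text. This is only a sketch: it assumes the date cells always look like "Apr 29, 2021", as in the output above, and `matches_month_year` is a hypothetical helper name, not something from Selenium.

```python
from datetime import datetime


def matches_month_year(date_text, month, year):
    """Return True if date_text (e.g. 'Apr 29, 2021') falls in the given month and year.

    Assumes the 'Mon DD, YYYY' format seen in the table; anything else counts as no match.
    """
    try:
        parsed = datetime.strptime(date_text.strip(), "%b %d, %Y")
    except ValueError:
        return False  # the text is not a date in the assumed format
    # Parse the requested month/year the same way so both sides compare as numbers
    target = datetime.strptime(f"{month} {year}", "%b %Y")
    return (parsed.month, parsed.year) == (target.month, target.year)


print(matches_month_year("Apr 29, 2021", "Apr", "2021"))
print(matches_month_year("May 03, 2021", "Apr", "2021"))
```

In the Selenium loop, the condition would then be `if matches_month_year(row.text, month, year):`.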

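First, set the date range in the page's filter. Then get the page source with data = driver.page_source.

Next, parse the data with bs4 and loop over the table rows:

soup = BeautifulSoup(data, "html.parser")
for row in soup.select('div.table-scrollable tbody tr'):
    date = row.select('td')[0]
    title = row.select('td')[1]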
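Here is a rough, self-contained sketch of that parsing step. The inline HTML is a simplified stand-in for driver.page_source (the real page's markup may differ), and the second row is made up purely to show the filter rejecting a non-matching date.

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source; an assumed simplification of the real markup.
data = """
<div class="table-scrollable">
  <table>
    <tbody>
      <tr><td>Apr 29, 2021</td><td>Rane Brake Lining Ltd. - Post Buyback Public Announcement</td></tr>
      <tr><td>Mar 30, 2021</td><td>Example Company Ltd. - Public Announcement</td></tr>
    </tbody>
  </table>
</div>
"""

month, year = "Apr", "2021"
soup = BeautifulSoup(data, "html.parser")

matches = []
for row in soup.select("div.table-scrollable tbody tr"):
    cells = row.select("td")
    if len(cells) < 2:
        continue  # skip header or malformed rows
    date = cells[0].get_text(strip=True)
    title = cells[1].get_text(strip=True)
    # Same substring check as in the Selenium version
    if month in date and year in date:
        matches.append((date, title))

for date, title in matches:
    print(date, " ", title)
```

Using get_text(strip=True) gives plain strings rather than tag objects, which makes the comparison and printing simpler.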

Happy coding.