How to scrape data using Selenium and Python: I am trying to extract all the data in the title div tags


from selenium import webdriver
import pandas as pd
import time
import requests
from selenium.common.exceptions import ElementClickInterceptedException
driver = webdriver.Chrome(executable_path ="D:\chromedriver_win32chromedriver.exe")
url = "https://www.fynd.com/brands/"
driver.get(url)
time.sleep(2)
driver.maximize_window()
luxury_brand_names = []
element = driver.find_element_by_css_selector("//div[@class='group-cards']")#.get_attribute("title")
#element = driver.find_elements_by_xpath("//div[@classdata-v-2f624c7c data-v-73869697 title]")
for a in element:
    luxury_brand_names.append()
print(luxury_brand_names)

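This is the code I am running, and I am not getting any output. Please help me; I am very new to coding and scraping. I am trying to get all the data in the title div tags.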

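I think the only things you need to do are change the selector, locate the elements with find_elements (plural, which returns a list), and loop over them. You also need to actually pass a value to append(). It should be: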

elements = driver.find_elements_by_css_selector("div.card-item")
for element in elements:
    luxury_brand_names.append(element.get_attribute('title'))
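
For readers on a recent Selenium release, the same idea can be sketched as below. This is only a sketch under stated assumptions: it assumes Selenium 4.6 or later (where the find_elements_by_* helpers are gone and Selenium can locate the driver binary on its own), it reuses the div.card-item selector from the answer above without verifying it against the live page, and the function names collect_titles and scrape_brand_titles are made up for illustration.

```python
def collect_titles(elements):
    """Return the non-empty `title` attribute of each element, in order."""
    titles = []
    for element in elements:
        title = element.get_attribute("title")
        if title:  # skip elements whose title is missing or empty
            titles.append(title.strip())
    return titles


def scrape_brand_titles(url="https://www.fynd.com/brands/", timeout=10):
    """Open the page, wait for the cards, and return their titles."""
    # Imported here so collect_titles() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: Selenium 4.6+, which finds a matching chromedriver itself.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until at least one card is present instead of sleeping a fixed time.
        # Assumption: "div.card-item" is the selector suggested in the answer.
        cards = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.card-item"))
        )
        return collect_titles(cards)
    finally:
        driver.quit()  # always close the browser, even if the wait times out
```

Calling print(scrape_brand_titles()) would then open Chrome and print the list. If the cards never appear, the wait raises a TimeoutException, which is usually a sign that the selector no longer matches the page.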

Here is an answer that uses Beautiful Soup together with Selenium:

from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://www.fynd.com/brands/"
driver = webdriver.Chrome(executable_path ="D:\chromedriver_win32chromedriver.exe")
driver.get(url)
soup = BeautifulSoup(driver.page_source,"html.parser")
title = soup.find_all('span',{'class':'ukt-title clrWhite'})
all_titles = list()
for span in title:
    all_titles.append(span.text.strip())

print(all_titles)
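
To check the extraction logic without opening a browser, here is a dependency-free sketch that swaps Beautiful Soup for the standard library's html.parser. It is not the answer's method, only an offline illustration of the same idea. The sample HTML, the class TitleSpanParser, and the function extract_titles are invented for the example. One difference to be aware of: this sketch accepts the two class names in any order, whereas passing the string 'ukt-title clrWhite' to find_all matches only that exact class attribute value.

```python
from html.parser import HTMLParser


class TitleSpanParser(HTMLParser):
    """Collect the text of <span> tags that carry all the required classes."""

    def __init__(self, required_classes=("ukt-title", "clrWhite")):
        super().__init__()
        self.required = set(required_classes)
        self.titles = []
        self._depth = 0    # greater than 0 while inside a matching span
        self._buffer = []  # text pieces of the span currently being read

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            self._depth += 1  # a span nested inside a matching span
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.required <= classes:
            self._depth = 1
            self._buffer = []

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.titles.append("".join(self._buffer).strip())


def extract_titles(html):
    """Return the stripped text of every matching span in the HTML string."""
    parser = TitleSpanParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Hypothetical markup, loosely modelled on the class names used in the answer.
sample = """
<div class="card-item"><span class="ukt-title clrWhite"> Brand A </span></div>
<div class="card-item"><span class="clrWhite ukt-title">Brand B</span></div>
<div class="card-item"><span class="other">ignored</span></div>
"""
print(extract_titles(sample))  # prints ['Brand A', 'Brand B']
```

With a live page, the string passed to extract_titles would be driver.page_source, as in the answer above.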

First, your append() call is empty, so nothing is added to the list.

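Second, you need to change the lookup to element = driver.find_elements_by_xpath("//div[@class='card-item']") so that it returns a list of items (note that an XPath expression must go to the XPath method, not the CSS selector one). You can then use it in your loop, like this: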

luxury_brand_names.append(a.get_attribute("title"))
