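I'm trying to scrape all of the Air Jordan data from grailed.com (https://www.grailed.com/designers/jordan-brand/hi-top-sneakers). I store the size, model, URL, and image URL in an object. I currently have a program that scrolls through the entire feed and collects all of these. Everything works except finding the image URL. I've tried many things, and the problem seems to be that for some elements in the feed, Selenium cannot detect the div containing the image or its URL. I've gone through and manually inspected those cases, and they do have an image with the same structure. My current code looks like this: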
feed = driver.find_elements_by_class_name('feed-item')
for item in feed:
    # Find the div containing the image
    img_div = item.find_element_by_class_name("listing-cover-photo ")
    img = img_div.find_element_by_tag_name('img')
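I've tried a few other things as well. The problem is that it sometimes says it cannot find an element with the class "listing-cover-photo", even though I can inspect the item in question and still find the element. How should I debug or fix this? Any help would be appreciated.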
To get the image src values, you need to scroll the page first. Then use WebDriverWait() to wait for visibility_of_all_elements_located() with the following CSS selector.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://www.grailed.com/designers/jordan-brand/hi-top-sneakers")
# Scroll to the bottom of the page before reading the image src values
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
# Wait up to 10 seconds for the cover images to become visible
images = WebDriverWait(driver, 10).until(
    EC.visibility_of_all_elements_located(
        (By.CSS_SELECTOR, ".feed-item .listing-cover-photo>img")
    )
)
for image in images:
    print(image.get_attribute("src"))
Output:
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/AYHhwtgRxSkdTtZ2fMoi
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/yPm24xb1QeyNJvmlKriU
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/0PmW3y2SOmvy9iDHr44q
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/0huJrabvQyei6H8xVZWS
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/23Bx5rr8SR2Pv53lO9Hb
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/dsdGACdNRse93DpTN9Sl
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/KQ3z8G9DQFWTjNkO6Obp
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/mF8nkq8LTzi2fTuCfAAS
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/X9tLf5KzSreO1QW2QX4w
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/gNnXP7ToTnl9hjSEiRrz
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/LMFdqBosRI2NLDCkR9Ze
https://process.fs.grailed.com/AJdAgnqCST4iPtnUxiGtTz/auto_image/cache=expiry:max/rotate=deg:exif/resize=height:320,width:240,fit:crop/output=quality:70/compress/https://cdn.fs.grailed.com/api/file/htBeZs05SNyflHqpd7pC
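
If you need one image per feed item (to keep it paired with the size, model, and URL), a per-item variant of the same idea might look like the sketch below. It assumes the images only appear once an item has been scrolled into view, which I have not verified for this site. The helper name collect_image_urls and its pause and retries parameters are made up for illustration. It uses find_elements, which returns an empty list instead of raising when nothing matches, so items without an image are reported by index rather than stopping the loop.

```python
import time

# Same string value as selenium's By.CSS_SELECTOR; used directly so this
# sketch does not have to import selenium itself.
CSS = "css selector"


def collect_image_urls(driver, pause=0.5, retries=3):
    """Return (urls, missing): the image src of each feed item that has one,
    and the indexes of items whose image never showed up.

    Hypothetical helper; the selectors are the ones used in this thread.
    """
    urls, missing = [], []
    items = driver.find_elements(CSS, ".feed-item")
    for index, item in enumerate(items):
        # Bring the item into the viewport (assumption: images load lazily).
        driver.execute_script(
            "arguments[0].scrollIntoView({block: 'center'});", item
        )
        src = None
        for _ in range(retries):
            # find_elements returns [] rather than raising NoSuchElementException.
            images = item.find_elements(CSS, ".listing-cover-photo>img")
            if images:
                src = images[0].get_attribute("src")
            if src:
                break
            time.sleep(pause)  # give the page a moment before retrying
        if src:
            urls.append(src)
        else:
            missing.append(index)
    return urls, missing
```

With the driver from the answer above, you would call urls, missing = collect_image_urls(driver) and then inspect the items listed in missing by hand to see what differs about them.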