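I'm trying to get store locations from Google Maps, but my code retrieves the location for some stores and not for others. Here is the link to the Google Colab notebook: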
https://colab.research.google.com/drive/1ncrffQMGyeudUkMiGSrCfssifVScfYa-?usp=sharing
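As you can see, it ends up returning "Blaze" instead of "Apple" or "Ferrari".
Why does this happen?
Note: this is not about the page needing time to load. I made it wait up to 20 seconds and it still doesn't work.
I expect to get the location for every link I give it.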
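You are locating the element with an XPath, so the result can change depending on the structure of the page. I ran some tests on your data using the BeautifulSoup library together with Selenium.
I think finding the address with a CSS selector is more reliable. For reference, see this documentation: https://saucelabs.com/resources/articles/selenium-tips-css-selectors
Try this: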
from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import time
blaze = 'https://www.google.com/maps/place/Blaze+Pizza/@24.5014283,54.3896917,17z/data=!3m1!4b1!4m5!3m4!1s0x3e5e676982d20b17:0xe2c5b69e67e4c85d!8m2!3d24.5014283!4d54.3896917'
apple = 'https://www.google.com/maps/place/Apple+Yas+Mall/@24.4881123,54.6064438,17z/data=!3m1!4b1!4m5!3m4!1s0x3e5e457d92f94e27:0x5c1646b499917d03!8m2!3d24.4881123!4d54.6086325?authuser=0&hl=en'
ansam='https://www.google.com/maps/place/Ansam+Building+3/@24.4833165,54.6020795,17z/data=!4m5!3m4!1s0x3e5e45db58e6a423:0x23953eb0c87dfd3c!8m2!3d24.4834477!4d54.5999224?authuser=0&hl=en'
ferrari='https://www.google.com/maps/place/Ferrari+World+Abu+Dhabi/@24.4836388,54.6059205,17z/data=!4m5!3m4!1s0x3e5e457e2d394a05:0x6076df4876c470a9!8m2!3d24.4837634!4d54.6070066?authuser=0&hl=en'
yas='https://www.google.com/maps/place/Yass+winter+carnival/@24.4886382,54.6183841,17z/data=!4m5!3m4!1s0x3e5e4f9134f9bac3:0x68162aeae1d91d21!8m2!3d24.4898629!4d54.6217851?authuser=0&hl=en'
yas1='https://www.google.com/maps/place/Yas+Links+Abu+Dhabi/@24.4756507,54.6019735,14.83z/data=!4m5!3m4!1s0x3e5e4582ecaaecab:0xb3e0f29a13cc00d5!8m2!3d24.4783288!4d54.5999317?authuser=0&hl=en'
links = [blaze, apple, ansam, ferrari, yas, yas1]
options = webdriver.ChromeOptions()
options.add_argument("--incognito")
options.add_argument('--start-maximized')
options.add_argument('--start-fullscreen')
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options = options)
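def get_location(links):
    address_list = []
    for link in links:
        driver.get(link)
        # Give the side panel time to render before reading the page source.
        time.sleep(5)
        soup = BeautifulSoup(driver.page_source, 'lxml')
        element = soup.select_one('div.rogA2c div.fontBodyMedium')
        # Record None when the selector matches nothing instead of raising AttributeError.
        address_list.append(element.get_text(strip=True) if element else None)
    return address_list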
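If it helps, the loop could also poll for the element rather than sleep for a fixed time, and record None when nothing matches. This is only a sketch under some assumptions: `ADDRESS_LOCATORS`, `find_address`, and `get_locations` are made-up names, the two selectors are simply the ones suggested in this thread and may stop matching whenever Google changes its markup, and the string "css selector" is the value behind Selenium's `By.CSS_SELECTOR`.

```python
import time

# Selectors suggested in this thread, tried in order (assumption: at least
# one of them still matches the current Google Maps markup).
ADDRESS_LOCATORS = [
    ("css selector", 'button[data-item-id="address"]'),
    ("css selector", "div.rogA2c div.fontBodyMedium"),
]


def find_address(driver, timeout=20.0, poll=0.5):
    """Poll the current page until a locator matches; return its text or None."""
    deadline = time.monotonic() + timeout
    while True:
        for by, value in ADDRESS_LOCATORS:
            # find_elements returns an empty list when nothing matches.
            elements = driver.find_elements(by, value)
            if elements:
                return elements[0].text
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def get_locations(driver, links, timeout=20.0, poll=0.5):
    """Return {link: address or None} so a miss on one page stays visible."""
    locations = {}
    for link in links:
        driver.get(link)
        locations[link] = find_address(driver, timeout=timeout, poll=poll)
    return locations
```

With the setup from the answer you would call `get_locations(driver, links)`. Because `find_elements` returns an empty list rather than raising `NoSuchElementException`, a page without a matching element yields None for that link and the loop continues.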
Best regards,
Benjamin
Absolute XPaths are always brittle; use a relative XPath instead.
Instead of this:
location = driver.find_element('xpath','//*[@id="QA0Szd"]/div/div/div[1]/div[2]/div/div[1]/div/div/div[11]/div[3]/button/div[1]/div[2]/div[1]').text
try this:
location = driver.find_element('xpath','(//div[@class="rogA2c"]//div[contains(@class,"fontBodyMedium")])[1]').text
Each page has a different structure, so you need a relative XPath to point at the element. Change this line:
location = driver.find_element('xpath','//*[@id="QA0Szd"]/div/div/div[1]/div[2]/div/div[1]/div/div/div[11]/div[3]/button/div[1]/div[2]/div[1]').text
to this:
location = driver.find_element('xpath','//button[@data-item-id="address"]').text
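To see why a positional path breaks when the layout shifts, here is a small self-contained sketch. It uses Python's `xml.etree.ElementTree`, which supports a subset of XPath, as a stand-in for the browser. The two page snippets are invented and far simpler than the real Google Maps markup, and `find_text` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical page where the address button is in the first inner <div>.
PAGE_A = """
<body>
  <div id="QA0Szd">
    <div><button data-item-id="address">Address A</button></div>
  </div>
</body>
"""

# Hypothetical page with an extra block first, so the address shifts down.
PAGE_B = """
<body>
  <div id="QA0Szd">
    <div><button>Directions</button></div>
    <div><button data-item-id="address">Address B</button></div>
  </div>
</body>
"""

# Depends on the exact position of every ancestor.
POSITIONAL = "./div/div[1]/button"
# Depends only on the attribute, wherever the button sits.
BY_ATTRIBUTE = ".//button[@data-item-id='address']"


def find_text(page, path):
    """Return the text of the first element matching path, or None."""
    node = ET.fromstring(page).find(path)
    return None if node is None else node.text


print(find_text(PAGE_A, POSITIONAL))    # Address A
print(find_text(PAGE_B, POSITIONAL))    # Directions (wrong element)
print(find_text(PAGE_A, BY_ATTRIBUTE))  # Address A
print(find_text(PAGE_B, BY_ATTRIBUTE))  # Address B
```

The positional path silently picks up a different button on the second page, while the attribute-based path finds the address on both.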