Selenium Python: how to download a PDF to a specific location without a "URL"



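I have been writing Python code with Selenium that should visit a web page and download a PDF. However, when the driver clicks the button, the site opens the PDF in a new tab, and I cannot use that tab's URL to download the file. Can anyone help?

(For example: if I make my driver "get" the PDF "URL", the driver loads my previous page instead, i.e. the page with the button that opens the Chrome PDF previewer.)

If the question is unclear, please let me know so I can explain it better.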

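Chrome's default configuration appears to block this kind of download for security reasons, and by default it opens PDFs in its built-in viewer instead of saving them. You can change both behaviours through the browser options. Here is a working example based on arXiv, which serves PDFs over a secure connection: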

import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

download_dir = os.path.join(os.getcwd(), "Downloads")

options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", {
    "download.default_directory": download_dir,  # Directory where downloaded files are saved.
    "download.prompt_for_download": False,  # Download the file without asking for confirmation.
    "download.directory_upgrade": True,
    "plugins.always_open_pdf_externally": True,  # Download PDFs instead of opening them in Chrome's viewer.
})
# Selenium 4 takes the driver path through a Service object. Replace with the correct path to your chromedriver executable.
driver = webdriver.Chrome(service=Service(os.path.join(download_dir, "chromedriver")), options=options)

driver.get("https://arxiv.org/list/hep-lat/1902")  # Base URL

# Click the link that would normally open the PDF; with the prefs above it is downloaded instead. Change the XPath to fit your page.
driver.find_elements(By.XPATH, "/html/body/div[5]/div/dl/dt[1]/span/a[2]")[0].click()
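
One thing the example leaves open is knowing when the file has actually arrived: the click returns as soon as the download starts, and closing the browser too early can cut it short. A minimal sketch of one way to wait, using only the standard library, is below. The helper name wait_for_pdf and its parameters are my own and not part of Selenium. It relies on Chrome writing unfinished downloads with a ".crdownload" suffix, and it assumes the download directory held no PDFs before the click.

```python
import time
from pathlib import Path


def wait_for_pdf(download_dir, timeout=30.0, poll_interval=0.5):
    """Return the newest finished PDF in download_dir, or raise TimeoutError.

    Hypothetical helper: assumes the directory contained no PDFs before the
    download was triggered.
    """
    directory = Path(download_dir)
    deadline = time.monotonic() + timeout
    while True:
        # Chrome keeps a "*.crdownload" file while a download is in progress.
        in_progress = list(directory.glob("*.crdownload"))
        finished = sorted(directory.glob("*.pdf"), key=lambda p: p.stat().st_mtime)
        if finished and not in_progress:
            return finished[-1]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no finished PDF in {directory} after {timeout} s")
        time.sleep(poll_interval)
```

Calling wait_for_pdf(os.path.join(os.getcwd(), "Downloads")) right after the click, and before closing the browser, should return the path of the saved file, which you can then move or rename as needed.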
