Python: downloading a GIF from a link



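I need some help with this problem. I have to download some GIFs from a website every day, and I am writing Python code to do the job.

I can download the file with requests.get, but when I try to open it, it is corrupted. Has anyone run into this? I am using Python 3.6 and Selenium with Chrome. This is my code after locating the image: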

img = driver.find_element_by_xpath('/html/body/div/div[2]/div[2]/img')
src = img.get_attribute('src')
print(src)
r = requests.get(src)
time.sleep(5)
with open('teste2.gif', 'wb') as outfile:
    outfile.write(r.content)

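EDIT!

Sorry, my full code is below. The catch is that this website requires a login. I just added the login data to the request, but the problem is still the same. I also tried downloading with dload, with the same result. @伊拉霍雷卡 @UWTD电视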

from selenium import webdriver
import time
from selenium.webdriver import ActionChains
from selenium.webdriver.common.keys import Keys
import urllib.request
from pathlib import Path
import requests
from io import open as iopen
import dload
import pyautogui
link = 'https://sintegre.ons.org.br/sites/9/38/paginas/produtos-dinamicos/meteorologia.aspx'
chromedriver = r'C:chromedriver'
pasta_download = Path(r'C:download')
dados_login = {
    'login_usuario': 'XXXXXX',
    'login_senha': 'XXXXXXX'
}
options = webdriver.ChromeOptions()
options.add_argument("--start-maximized")
prefs = {"profile.default_content_settings.popups": 0,
         "download.default_directory": pasta_download,
         "directory_upgrade": True,
         'excludeSwitches': ['enable-logging']}
options.add_experimental_option('excludeSwitches', ['enable-logging'])
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(executable_path=chromedriver, chrome_options=options)
driver.get(link)
driver.find_element_by_id('username').send_keys(dados_login['login_usuario'])
driver.find_element_by_name('submit.IdentificarUsuario').click()
time.sleep(5)
driver.find_element_by_id('password').send_keys(dados_login['login_senha'])
driver.find_element_by_name('submit.Signin').click()
time.sleep(5)
met_UL = driver.find_element_by_id('listaMeteorologia_703765c5')
met_list = met_UL.find_elements_by_tag_name("li")
for items2 in met_list:
    if items2.text == 'Previsão de Precipitação':
        items2.click()
        time.sleep(5)
        ETA = driver.find_elements_by_tag_name('h3')
        for items3 in ETA:
            print(items3.text)
            if items3.text == 'ETA':
                brasil = driver.find_element_by_link_text('Brasil').click()
                img = driver.find_element_by_xpath('/html/body/div/div[2]/div[2]/img')
                src = img.get_attribute('src')
                print(src)
                headers = {'User-Agent': 'Mozilla/5.0'}
                r = requests.get(src, headers=headers, data = dados_login)
                time.sleep(5)
                print('status code: ', r.status_code)
                with open('teste2.gif', 'wb') as outfile:
                    outfile.write(r.content)

driver.quit()

Have you checked that the response code is 200?

img = driver.find_element_by_xpath('/html/body/div/div[2]/div[2]/img')
src = img.get_attribute('src')
print(src)
r = requests.get(src)
if r.ok:
    time.sleep(5)
    with open('teste2.gif', 'wb') as outfile:
        outfile.write(r.content)
else:
    print(r)
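A 200 status alone does not prove that the body is a GIF: a site that requires a login can answer 200 with an HTML sign-in page, which then gets saved under a .gif name and looks "corrupted". The sketch below is one way to check the downloaded bytes before writing them. The helper names `looks_like_gif` and `looks_like_html` are hypothetical and are not part of requests or Selenium; the HTML check is only a rough heuristic.

```python
# Hypothetical helpers for inspecting downloaded bytes (e.g. r.content).

GIF_SIGNATURES = (b"GIF87a", b"GIF89a")  # the two valid GIF file headers


def looks_like_gif(data: bytes) -> bool:
    """Return True if the data starts with a GIF signature."""
    return data[:6] in GIF_SIGNATURES


def looks_like_html(data: bytes) -> bool:
    """Rough heuristic: True if the data starts like an HTML document."""
    head = data[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

With the code above you could call `looks_like_gif(r.content)` before opening the output file, and print `r.headers.get('Content-Type')` when it returns False to see what the server actually sent.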

It could be a problem with the headers of the GET request.

headers = {'User-Agent': 'Mozilla/5.0'}

Your new code:

img = driver.find_element_by_xpath('/html/body/div/div[2]/div[2]/img')
src = img.get_attribute('src')
print(src)
headers = {'User-Agent': 'Mozilla/5.0'}
r = requests.get(src, headers=headers)
if r.ok:
    time.sleep(5)
    with open('teste2.gif', 'wb') as outfile:
        outfile.write(r.content)
else:
    print(r)

I changed your code:

from selenium import webdriver
import time
from pathlib import Path
import requests
link = 'https://sintegre.ons.org.br/sites/9/38/paginas/produtos-dinamicos/meteorologia.aspx'
chromedriver = r'C:chromedriver'
pasta_download = Path(r'C:download')
dados_login = {
    'login_usuario': 'XXXXXX',
    'login_senha': 'XXXXXXX'
}
options = webdriver.ChromeOptions()
options.add_argument("--start-maximized")
prefs = {"profile.default_content_settings.popups": 0,
         "download.default_directory": pasta_download,
         "directory_upgrade": True,
         'excludeSwitches': ['enable-logging']}
options.add_experimental_option('excludeSwitches', ['enable-logging'])
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(executable_path=chromedriver, chrome_options=options)
driver.get(link)
driver.find_element_by_id('username').send_keys(dados_login['login_usuario'])
driver.find_element_by_name('submit.IdentificarUsuario').click()
time.sleep(5)
driver.find_element_by_id('password').send_keys(dados_login['login_senha'])
driver.find_element_by_name('submit.Signin').click()
time.sleep(5)
met_UL = driver.find_element_by_id('listaMeteorologia_703765c5')
met_list = met_UL.find_elements_by_tag_name("li")
for items2 in met_list:
    if items2.text == 'Previsão de Precipitação':
        items2.click()
        time.sleep(5)
        ETA = driver.find_elements_by_tag_name('h3')
        for items3 in ETA:
            print(items3.text)
            if items3.text == 'ETA':
                brasil = driver.find_element_by_link_text('Brasil').click()
                img = driver.find_element_by_xpath('/html/body/div/div[2]/div[2]/img')
                src = img.get_attribute('src')
                print(src)
                cookies = driver.get_cookies()
                s = requests.Session()
                for c in cookies:
                    s.cookies.set(c['name'], c['value'])
                headers = {'User-Agent': 'Mozilla/5.0'}
                r = s.get(src, headers=headers)
                time.sleep(5)
                print('status code: ', r.status_code)
                with open('teste2.gif', 'wb') as outfile:
                    outfile.write(r.content)

driver.quit()

The new lines in the code are:

cookies = driver.get_cookies()
s = requests.Session()
for c in cookies:
    s.cookies.set(c['name'], c['value'])

and the change from

r = requests.get(src, headers=headers)

to a call on the requests session I created earlier:

r = s.get(src, headers=headers)
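The cookie hand-off above can also be wrapped in a small function. This is only a sketch under assumptions: `copy_selenium_cookies` is a hypothetical name, and passing each cookie's domain and path along is my addition rather than something the code above does. It relies on `session.cookies.set(name, value, domain=..., path=...)`, which a `requests.Session` cookie jar accepts.

```python
def copy_selenium_cookies(selenium_cookies, session):
    """Copy cookies from driver.get_cookies() into a requests.Session.

    selenium_cookies: list of dicts with at least 'name' and 'value'.
    session: a requests.Session (anything exposing .cookies.set works).
    """
    for cookie in selenium_cookies:
        session.cookies.set(
            cookie['name'],
            cookie['value'],
            # Keep the cookie scoped to where the browser received it.
            # An empty domain means "not restricted to a domain".
            domain=cookie.get('domain', ''),
            path=cookie.get('path', '/'),
        )
    return session
```

Usage would look like `s = copy_selenium_cookies(driver.get_cookies(), requests.Session())`, followed by `r = s.get(src, headers=headers)` as before.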

Read my comment on
