Python - Looping through large list and downloading images quickly

So currently I have this code, and it works perfectly as I intended.

import urllib.request
from tqdm import tqdm
with open("output.txt", "r") as file:
    itemIDS = [line.strip() for line in file]
x = 0
for length in tqdm(itemIDS):
    urllib.request.urlretrieve(
        "https://imagemocksite.com?id="+str(itemIDS[x]),
        "images/"+str(itemIDS[x])+".jpg")
    x += 1
print("All images downloaded")

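I have looked around, and the solutions I found are not really what I am looking for. I have a 200 Mbps connection, so bandwidth is not my problem.

My problem is that my loop only iterates 1.1 to 1.57 times per second. I want to make this faster, as I have over 5k images to download. They are also only about 1-5 KB each.

Also, if anyone has any general code tips, I would really appreciate them! I am learning Python and it is a lot of fun, so I would like to get as good as possible!

Edit: Using the info below about asyncio, I am now getting 1.7-2.1 it/s, which is better! Could it be faster? Maybe I used it wrong?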

import urllib.request
from tqdm import tqdm
import asyncio
with open("output.txt", "r") as file:
    itemIDS = [line.strip() for line in file]
async def download():
    x = 0
    for length in tqdm(itemIDS):
        await asyncio.sleep(1)
        urllib.request.urlretrieve(
            "https://imagemocksite.com?id="+str(itemIDS[x]),
            "images/"+str(itemIDS[x])+".jpg")
        x += 1
asyncio.run(download())
print("All images downloaded")

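The comments have already provided good advice, and I think you are right to use asyncio, which really is the typical Python tool for this kind of job.

I just wanted to offer some help, since the code you provided does not really use its capabilities: urllib.request.urlretrieve is a blocking call, so the images are still downloaded one after another.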

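First, you have to install aiohttp and aiofiles, which handle HTTP requests and local file system I/O asynchronously.

Then, define a download(item_id, session) helper coroutine that downloads a single image based on its item_id. session will be an aiohttp.ClientSession, the class aiohttp uses to run asynchronous HTTP requests.

The trick is to have a download_all coroutine at the end that passes all of the individual download() coroutines to asyncio.gather at once. asyncio.gather is the way to tell asyncio to run several coroutines "in parallel" (strictly speaking, concurrently on a single thread).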
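To see the effect of asyncio.gather in isolation, here is a minimal sketch that needs no network access. The fake_download coroutine and its 0.05 s delay are assumptions made purely for illustration: asyncio.sleep stands in for the time spent waiting on a real HTTP response.

```python
import asyncio
import time


async def fake_download(item_id, delay=0.05):
    # Hypothetical stand-in for a network request: sleeping hands control
    # back to the event loop, just as waiting on a socket would.
    await asyncio.sleep(delay)
    return f"images/{item_id}.jpg"


async def download_sequentially(item_ids):
    # Awaiting inside a loop finishes each coroutine before starting the next.
    return [await fake_download(item_id) for item_id in item_ids]


async def download_concurrently(item_ids):
    # gather schedules all coroutines together and returns their results
    # as a list in the same order as the inputs.
    return await asyncio.gather(*(fake_download(item_id) for item_id in item_ids))


def timed(coro):
    # Run a coroutine to completion and report how long it took.
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = [str(n) for n in range(10)]
    _, slow = timed(download_sequentially(ids))
    _, fast = timed(download_concurrently(ids))
    print(f"sequential: {slow:.2f}s, concurrent: {fast:.2f}s")
```

The sequential version takes roughly the sum of all the delays, while the gathered version takes roughly as long as the single slowest one, because all the waits overlap.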

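Downloading the images concurrently should speed things up considerably. If it does not, then it is the third-party server that is limiting you.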

import asyncio
import aiohttp
import aiofiles

with open("output.txt", "r") as file:
    itemIDS = [line.strip() for line in file]

async def download(item_id, session):
    url = "https://imagemocksite.com"
    filename = f"images/{item_id}.jpg"
    # Query parameters have to be passed by keyword (params=...)
    async with session.get(url, params={"id": item_id}) as response:
        async with aiofiles.open(filename, "wb") as f:
            await f.write(await response.read())

async def download_all():
    # A single shared session lets aiohttp reuse connections between requests
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *[download(item_id, session) for item_id in itemIDS]
        )

asyncio.run(download_all())
print("All images downloaded")
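If the server does start throttling or rejecting requests when thousands of them arrive at once, one common option is to cap how many downloads are in flight with asyncio.Semaphore. The sketch below is only an illustration under stated assumptions: MAX_CONCURRENT is a made-up limit, and fake_download is a hypothetical stand-in for the real request that also records how many calls are active at the same time.

```python
import asyncio

MAX_CONCURRENT = 20  # assumed value; tune it to what the server tolerates


async def fake_download(item_id, state):
    # Hypothetical stand-in for the real HTTP request. It tracks how many
    # downloads are running simultaneously so the cap can be observed.
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)
    state["active"] -= 1
    return item_id


async def bounded_download(item_id, semaphore, state):
    # At most `limit` coroutines can be inside this block at any moment;
    # the others wait here until a slot is released.
    async with semaphore:
        return await fake_download(item_id, state)


async def download_all(item_ids, limit=MAX_CONCURRENT):
    semaphore = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(bounded_download(item_id, semaphore, state) for item_id in item_ids)
    )
    return results, state["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(download_all(range(100)))
    print(f"{len(results)} items done, peak concurrency {peak}")
```

In the aiohttp version, the same `async with semaphore:` block would wrap the session.get call inside download(). aiohttp's TCPConnector also accepts a limit argument that caps the number of simultaneous connections, which may be enough on its own.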
