How to manually advance a for loop in Python



I have a for loop with nested while loops, shown below:

new_offset = 0
threshold = 600
limit = 20000
counter = 0
for query in all_queries:
    print('query: ', query)
    params = {"q": query, "license": "public", "imageType": "photo", "count": threshold}
    while new_offset <= threshold:
        print('Finding results for query: ', query)
        params["offset"] = new_offset
        response = requests.get(search_url, headers=headers, params=params)
        response.raise_for_status()
        search_results = response.json()
        new_offset = search_results['nextOffset']
        time.sleep(1)
        while len(list(contentUrls)) < limit:
            for i in search_results['value']:
                contentUrls.append(i["contentUrl"])
                print('contentUrls length', len(contentUrls))
                original_query.append(search_results['queryContext']['originalQuery'])
                query_result_name.append(i['name'])
                query_name.append(query)
                query_date.append(i['datePublished'])

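This is basically an image scraper. The first while loop checks that new_offset is less than or equal to threshold. That logic is what lets me move to the next page of Bing search results.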

The second while loop collects the URLs of the images to download. The contentUrls list may hold at most limit URLs.

My question is: once contentUrls reaches limit, how do I make for query in all_queries move on to the next query?

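If I understood better what you are trying to achieve, I might be able to recommend a better solution, but the example below should work the way you described. Once the length of contentUrls reaches limit, it moves on to the next query.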

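I should mention that this code neither changes limit nor resets the contentUrls list, so once the first pass of the outer loop has finished, any later iteration may skip the inner while loop because the list has already reached the limit.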

new_offset = 0
threshold = 600
limit = 20000
counter = 0

for query in all_queries:
    print('query: ', query)
    params = {"q": query, "license": "public", "imageType": "photo", "count": threshold}
    print('Finding results for query: ', query)
    params["offset"] = new_offset
    response = requests.get(search_url, headers=headers, params=params)
    response.raise_for_status()
    search_results = response.json()
    new_offset = search_results['nextOffset']
    value_results = search_results['value']
    if new_offset >= threshold:
        break
    time.sleep(1)
    for i in range(min(limit, len(value_results))):
        result = value_results[i]
        contentUrls.append(result["contentUrl"])
        original_query.append(search_results['queryContext']['originalQuery'])
        query_result_name.append(result['name'])
        query_name.append(query)
        query_date.append(result['datePublished'])
    print('contentUrls length', len(contentUrls))
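The caveat about nothing being reset between queries can be handled by keeping the paging offset and the URL count local to each query. The following is only a minimal sketch of that idea, not the original author's code. It assumes that limit is meant to apply per query, and it replaces the requests.get call with a hypothetical fetch_page(query, offset) callable so the loop logic can run without network access. The names collect_content_urls and fake_fetch_page and the example.invalid URLs are invented for illustration.

```python
def collect_content_urls(all_queries, fetch_page, threshold=600, limit=20000):
    """Collect at most `limit` URLs per query, paging while offset <= threshold.

    `fetch_page(query, offset)` is a hypothetical stand-in for the HTTP call;
    it must return a dict with a "value" list and a "nextOffset" integer.
    """
    content_urls = []
    for query in all_queries:
        new_offset = 0   # reset paging state for every query
        collected = 0    # URLs gathered for this query only (assumption: per-query limit)
        while new_offset <= threshold and collected < limit:
            search_results = fetch_page(query, new_offset)
            for item in search_results["value"]:
                if collected >= limit:
                    break  # leaves only this inner for loop
                content_urls.append(item["contentUrl"])
                collected += 1
            next_offset = search_results.get("nextOffset")
            if next_offset is None or next_offset <= new_offset:
                break  # no further page available, stop paging this query
            new_offset = next_offset
        # When the while loop ends, the outer for loop moves to the next query.
    return content_urls


def fake_fetch_page(query, offset, page_size=3):
    """Hypothetical fake search backend returning `page_size` results per page."""
    value = [
        {"contentUrl": f"https://example.invalid/{query}/{offset + n}"}
        for n in range(page_size)
    ]
    return {"value": value, "nextOffset": offset + page_size}


urls = collect_content_urls(["cat", "dog"], fake_fetch_page, threshold=10, limit=5)
print(len(urls))
```

In a real scraper, fetch_page would wrap the requests.get, raise_for_status, and json calls from the code above, along with the time.sleep(1) pause between pages. If limit is meant to cap the total across all queries instead, the counter would live outside the for loop and the outer loop would need its own break.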
