Retrying an alternate endpoint on a Python fetch request



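I'm running the code below and it seems to work fine. However, the URL I'm requesting is https://api.gov.au/definitions/api/definition/fs/, which I know will sometimes fail, because for a given concept the correct URL may instead be under https://api.gov.au/definitions/api/definition/trc/

What I'd like to do is try the /fs URL; if that doesn't work, try the /trc URL; and if that doesn't work either, just output an empty record with the input value from concepts as the first column.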

import numpy as np
import pandas as pd
import requests
import json
conceptCSV = pd.read_csv("Concepts.csv")
conceptCSV.columns = ["type", "id", "name"]
concepts = list(conceptCSV.id)
concept_list = []

for concept in concepts:
    JSONContent = requests.get("https://api.gov.au/definitions/api/definition/fs/" + concept.lower()).json()['content']
    if 'error' not in JSONContent:
        concept_list.append([
            JSONContent['name'],
            JSONContent['domain'],
            JSONContent['status'],
            JSONContent['definition'],
            JSONContent['guidance'],
            JSONContent['identifier'],
            JSONContent['type'],
            JSONContent['domainAcronym'],
            JSONContent['sourceURL']
        ])

dataset = pd.DataFrame(concept_list)
dataset.columns = ['name', 'domain', 'status', 'definition',
                   'guidance', 'identifier', 'type', 'domainAcronym', 'sourceURL']

dataset.to_csv("conceptDetails.csv", index=False)

Many thanks, Michael

If you know that one of the two URLs will work, you can try the following. Of course you could plug in some decorators, but that seems like overkill in your case.

for concept in concepts:
    # Reset on every iteration so a failed lookup cannot reuse the previous concept's data
    JSONContent = None
    r = requests.get("https://api.gov.au/definitions/api/definition/fs/" + concept.lower())
    if r.status_code == 200:
        JSONContent = r.json()['content']
    else:
        # The /fs request failed, so fall back to the /trc endpoint
        r = requests.get("https://api.gov.au/definitions/api/definition/trc/" + concept.lower())
        if r.status_code == 200:
            JSONContent = r.json()['content']
    if JSONContent:
        if 'error' not in JSONContent:
            concept_list.append([
                JSONContent['name'],
                JSONContent['domain'],
                JSONContent['status'],
                JSONContent['definition'],
                JSONContent['guidance'],
                JSONContent['identifier'],
                JSONContent['type'],
                JSONContent['domainAcronym'],
                JSONContent['sourceURL']
            ])
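
The loop above skips a concept when neither URL works, whereas the question asks for an empty record with the input value in the first column. Here is a rough sketch of one way to get that. The names FIELDS, BASE_URL, ENDPOINTS, fetch_content and concept_row are my own and purely illustrative, as is the 10-second timeout. I'm also assuming a failed lookup can show up either as a non-200 status or as a 200 whose content contains 'error' (the question's own check suggests the latter), and that a field missing from the response should simply be left blank.

```python
FIELDS = ['name', 'domain', 'status', 'definition', 'guidance',
          'identifier', 'type', 'domainAcronym', 'sourceURL']
BASE_URL = "https://api.gov.au/definitions/api/definition/"
ENDPOINTS = ("fs", "trc")  # tried in this order


def fetch_content(url):
    """Return the 'content' dict for url, or None if the lookup failed."""
    # Imported here so concept_row can be exercised with a stand-in fetcher
    # on a machine without requests or network access.
    import requests
    try:
        r = requests.get(url, timeout=10)  # timeout value is an assumption
        if r.status_code != 200:
            return None
        data = r.json()
    except (requests.RequestException, ValueError):
        # Connection problems, timeouts, or a body that is not valid JSON
        return None
    content = data.get('content') if isinstance(data, dict) else None
    if not isinstance(content, dict) or 'error' in content:
        return None
    return content


def concept_row(concept, fetch=fetch_content):
    """Try each endpoint in turn; fall back to an otherwise empty record."""
    for endpoint in ENDPOINTS:
        content = fetch(BASE_URL + endpoint + "/" + concept.lower())
        if content is not None:
            # Assumption: a missing field becomes a blank cell
            return [content.get(field, "") for field in FIELDS]
    # Neither endpoint worked: input value first, remaining columns blank
    return [concept] + [""] * (len(FIELDS) - 1)
```

With that in place the loop could shrink to concept_list = [concept_row(c) for c in concepts], and every input concept ends up with exactly one row in the output, which should make the gaps easy to spot in the CSV.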
