Looping through URLs and grouping the JSON results



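I am trying to loop over a URL, substituting a variable into it, to parse the JSON data for several campsites. I want to group the results by UnitId, but the loop does not seem to run for every campsite.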

I first tried doing this with requests alone. After some reading, pandas seemed like the better tool, but I just cannot get it to work.

Here is a sample of my code. For some reason only the results for UnitId 5098 are shown, and the filtering does not work.

# Is one of these campsites available
# Unit 5095
# Unit 5096
# Unit 5097
# Unit 5099
import requests
import pandas as pd
result = []
for i in range(5095, 5099):
    resp = requests.get("https://calirdr.usedirect.com/rdr/rdr/fd/availability/getbyunit/"+str(i)+"/startdate/2020-11-01/nights/30/true?")
result.extend(resp.json())
df = pd.DataFrame(result)
df.groupby(['UnitId', 'StartTime', 'IsFree', 'IsWalkin'])
print(df.groupby(['UnitId', 'StartTime', 'IsFree', 'IsWalkin']).groups)

The output of my code:

{(5098, '2020-11-01T00:00:00', False, False): [0], (5098, '2020-11-02T00:00:00', False, False): [1], (5098, '2020-11-03T00:00:00', False, False): [2], (5098, '2020-11-04T00:00:00', False, False): [3], (5098, '2020-11-05T00:00:00', False, False): [4], (5098, '2020-11-06T00:00:00', False, False): [5], (5098, '2020-11-07T00:00:00', False, False): [6], (5098, '2020-11-08T00:00:00', False, False): [7], (5098, '2020-11-09T00:00:00', False, False): [8], (5098, '2020-11-10T00:00:00', False, False): [9], (5098, '2020-11-11T00:00:00', False, False): [10], (5098, '2020-11-12T00:00:00', False, False): [11], (5098, '2020-11-13T00:00:00', False, False): [12], (5098, '2020-11-14T00:00:00', False, False): [13], (5098, '2020-11-15T00:00:00', False, False): [14], (5098, '2020-11-16T00:00:00', False, False): [15], (5098, '2020-11-17T00:00:00', False, False): [16], (5098, '2020-11-18T00:00:00', False, False): [17], (5098, '2020-11-19T00:00:00', False, False): [18], (5098, '2020-11-20T00:00:00', False, False): [19], (5098, '2020-11-21T00:00:00', False, False): [20], (5098, '2020-11-22T00:00:00', False, False): [21], (5098, '2020-11-23T00:00:00', False, False): [22], (5098, '2020-11-24T00:00:00', False, False): [23], (5098, '2020-11-25T00:00:00', False, False): [24], (5098, '2020-11-26T00:00:00', False, False): [25], (5098, '2020-11-27T00:00:00', False, False): [26], (5098, '2020-11-28T00:00:00', False, False): [27], (5098, '2020-11-29T00:00:00', False, False): [28], (5098, '2020-11-30T00:00:00', False, False): [29]}

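This is the result I would like to print. It is only an example; campsites that are actually available are hard to find. Thanks, everyone, for the help. This is just a hobby and I am not a programmer, but it is really cool, especially Python. Thank you very much.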

(5095, '2020-11-05T00:00:00', False, False):
(5096, '2020-11-12T00:00:00', False, False):
(5099, '2020-11-25T00:00:00', False, False):

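You have to indent result.extend(resp.json()) so that it sits inside the body of the for loop; otherwise only the response from the last request (unit 5098) is added to result. You might also want to use DataFrame.filter() to keep just the columns you need. For example: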

from datetime import datetime

import pandas as pd
import requests
from tabulate import tabulate

result = []
# range() excludes its end value, so this fetches units 5095 through 5098
for unit_id in range(5095, 5099):
    resp = requests.get(
        "https://calirdr.usedirect.com/rdr/rdr/fd/"
        f"availability/getbyunit/{unit_id}/startdate/2020-11-01/nights/30/true?").json()
    # Inside the loop body, so every unit's records are collected
    result.extend(resp)

filter_by = ['UnitId', 'StartTime', 'IsFree', 'IsWalkin']
df = pd.DataFrame(result)
# Keep only the columns of interest
df = df.filter(items=filter_by)
# Shorten the ISO timestamp to a plain date
df['StartTime'] = df['StartTime'].apply(lambda d: datetime.fromisoformat(d).strftime("%Y-%m-%d"))
# Keep only the nights that are free
df = df[df['IsFree']]
print(tabulate(df, headers=filter_by))

Output:

      UnitId  StartTime    IsFree    IsWalkin
--  --------  -----------  --------  ----------
60      5097  2020-11-01   True      False
78      5097  2020-11-19   True      False
87      5097  2020-11-28   True      False
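
If you would rather get the free nights grouped per unit, as the question asks, a groupby on the filtered frame may be enough. The following is only a minimal sketch: fake_fetch is a hypothetical stand-in for the real request so the example runs offline, its sample records are invented, and the loop ends at 5100 on the assumption that unit 5099 should be included.

```python
import pandas as pd


def fake_fetch(unit_id):
    """Hypothetical stand-in for requests.get(...).json().

    Returns two invented nights per unit, using only the four
    fields discussed above; the real response has more fields.
    """
    free_nights = {5097: "2020-11-01", 5099: "2020-11-02"}  # made-up data
    return [
        {
            "UnitId": unit_id,
            "StartTime": f"{day}T00:00:00",
            "IsFree": free_nights.get(unit_id) == day,
            "IsWalkin": False,
        }
        for day in ("2020-11-01", "2020-11-02")
    ]


result = []
for unit_id in range(5095, 5100):  # end is exclusive, so 5099 is included
    # extend() is inside the loop body, so every unit's rows are kept
    result.extend(fake_fetch(unit_id))

df = pd.DataFrame(result)
# Shorten the ISO timestamps to plain dates
df["StartTime"] = pd.to_datetime(df["StartTime"]).dt.strftime("%Y-%m-%d")

# Keep the free nights, then collect the dates of each unit into a list
free_by_unit = (
    df[df["IsFree"]]
    .groupby("UnitId")["StartTime"]
    .apply(list)
    .to_dict()
)

for unit, dates in free_by_unit.items():
    print(unit, dates)
```

To use it against the real service, replace fake_fetch(unit_id) with the requests.get(...).json() call from the code above.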
