Pandas DataFrame not working: need to filter by the date keys in JSON



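This JSON data uses dates as its keys, and I want to filter on them, but I'm not sure how to do it. I have tried approaches from other scenarios I found online, but none of them worked. Thanks in advance for your help. My code is below.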

import urllib.request
import pandas as pd
from tabulate import tabulate
import json
url = (
    "https://www.recreation.gov/api/camps/availability/campsite"
    "/90329?start_date=2020-11-01T00%3A00%3A00.000Z&end_date=2021-12-31T00%3A00%3A00.000Z"
)
uh = urllib.request.urlopen(url)
data = uh.read()
js = json.loads(data.decode("utf-8"))
filter_by = ['availability']
df = pd.DataFrame(js)
df = df.filter(items=filter_by)
print(tabulate(df, headers=filter_by))

This is the output of my code.

availabilities         {'2020-11-01T00:00:00Z': 'Not Available', '2020-11-02T00:00:00Z': 'Not Available', '2020-11-03T00:00:00Z': 'Not Available', '2020-11-04T00:00:00Z': 'Not Available', '2020-11-05T00:00:00Z': 'Not Available', '2020-11-06T00:00:00Z': 'Not Available', '2020-11-07T00:00:00Z': 'Not Available', '2020-11-08T00:00:00Z': 'Reserved', '2020-11-09T00:00:00Z': 'Reserved', '2020-11-10T00:00:00Z': 'Reserved', '2020-11-11T00:00:00Z': 'Reserved', '2020-11-12T00:00:00Z': 'Reserved', '2020-11-13T00:00:00Z': 'Reserved', '2020-11-14T00:00:00Z': 'Reserved', '2020-11-15T00:00:00Z': 'Reserved', '2020-11-16T00:00:00Z': 'Reserved', '2020-11-17T00:00:00Z': 'Reserved', '2020-11-18T00:00:00Z': 'Reserved', '2020-11-19T00:00:00Z': 'Reserved', '2020-11-20T00:00:00Z': 'Reserved', '2020-11-21T00:00:00Z': 'Reserved', '2020-11-22T00:00:00Z': 'Reserved', '2020-11-23T00:00:00Z': 'Reserved', '2020-11-24T00:00:00Z': 'Reserved', '2020-11-25T00:00:00Z': 'Reserved', '2020-11-26T00:00:00Z': 'Reserved', '2020-11-27T00:00:00Z': 'Reserved', '2020-11-28T00:00:00Z': 'Reserved', '2020-11-29T00:00:00Z': 'Reserved', '2020-11-30T00:00:00Z': 'Reserved', '2020-12-01T00:00:00Z': 'Reserved', '2020-12-02T00:00:00Z': 'Reserved', '2020-12-03T00:00:00Z': 'Reserved', '2020-12-04T00:00:00Z': 'Reserved', '2020-12-05T00:00:00Z': 'Reserved', 

This is how I would like the output to look.

Date                      Available       
--------                  -----------  
2020-11-01T00:00:00Z    Not Available       
2020-11-02T00:00:00Z    Not Available
2020-11-03T00:00:00Z    Not Available
2020-11-04T00:00:00Z    Not Available
2020-11-05T00:00:00Z    Not Available

Try this:

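  1. Use json_normalize to convert the JSON to a df
  2. Then strip the availability.availabilities. prefix from the column names; the prefix is there because json_normalize joins nested keys with dots
  3. Then drop the columns that start with a non-digit character; anything that starts with a letter is not a date
  4. Finally transpose it and rename the columns: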
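
To see where the prefix in step 2 comes from, here is a minimal sketch on a hand-written sample. The sample dict is an assumption that only imitates the nested shape of the response (its campsite_id value is made up for illustration); the sketch just prints the column names that json_normalize produces.

```python
import pandas as pd

# Hypothetical sample imitating the nested shape of the API response.
sample = {
    "availability": {
        "campsite_id": "90329",
        "availabilities": {
            "2020-11-01T00:00:00Z": "Not Available",
            "2020-11-08T00:00:00Z": "Reserved",
        },
    }
}

# json_normalize flattens nested dicts into a single row and joins
# each key path with "." to form the column name.
flat = pd.json_normalize(sample)
columns = list(flat.columns)
print(columns)
```

Each date ends up as its own column with the parent path in front of it, which is why the prefix has to be removed before the dates can be used on their own.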

Code

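import json
import pandas as pd

# `data` is the raw response body read in the question's code
js = json.loads(data.decode("utf-8"))
df = pd.json_normalize(js)  # flatten nested keys into dotted column names
# strip the literal prefix so only the date remains (regex=False: dots are not wildcards)
df.columns = df.columns.str.replace('availability.availabilities.', '', regex=False)
# drop columns that still start with a letter, i.e. are not dates
df = df.loc[:, ~df.columns.str.contains(r'^[a-z]+')]
df = df.T  # dates become the index
df.reset_index(inplace=True)
df.columns = ['Date', 'Available']
print(df)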

                     Date      Available
0    2020-11-01T00:00:00Z  Not Available
1    2020-11-02T00:00:00Z  Not Available
2    2020-11-03T00:00:00Z  Not Available
3    2020-11-04T00:00:00Z  Not Available
4    2020-11-05T00:00:00Z  Not Available
..                    ...            ...
192  2021-05-12T00:00:00Z           Open
193  2021-05-13T00:00:00Z           Open
194  2021-05-14T00:00:00Z           Open
195  2021-05-15T00:00:00Z           Open
196  2021-05-16T00:00:00Z           Open
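
Once the dates are in their own column, filtering on them is straightforward. The following is only a sketch under stated assumptions: the availabilities dict is a hand-written sample standing in for the nested mapping in the response, and the cutoff date and the "Open" status are illustrative choices. It also builds the table directly from the dict's items, which is an alternative to the normalize-and-transpose route above, not a replacement for it.

```python
import pandas as pd

# Hypothetical sample imitating the "availabilities" mapping from the response.
availabilities = {
    "2020-11-01T00:00:00Z": "Not Available",
    "2020-11-08T00:00:00Z": "Reserved",
    "2021-05-12T00:00:00Z": "Open",
    "2021-05-13T00:00:00Z": "Open",
}

# Build the two-column table straight from the dict's (key, value) pairs.
table = pd.DataFrame(list(availabilities.items()), columns=["Date", "Available"])

# Parse the ISO 8601 strings into timezone-aware timestamps so they compare as dates.
table["Date"] = pd.to_datetime(table["Date"], utc=True)

# Keep rows on or after an illustrative cutoff date that are marked "Open".
cutoff = pd.Timestamp("2021-05-13", tz="UTC")
open_days = table[(table["Date"] >= cutoff) & (table["Available"] == "Open")]
print(open_days)
```

With the real response, the dict would presumably come from js["availability"]["availabilities"]; that path is inferred from the column prefix above and has not been checked against the API.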
