How to flatten a nested JSON file

I'm getting data from an API, and it contains nested JSON. When I write it to a file, the output looks like this:

{"country": "Afghanistan", "timeline": [{"total": 6355931, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]}
{"country": "Albania", "timeline": [{"total": 2883079, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]}
{"country": "Algeria", "timeline": [{"total": 15205854, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]}

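The problem I'm facing is that I need to break up "timeline" so that I'm left with just total, daily, date and the other fields at the top level, so that it looks more like this: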

{"country": "Afghanistan", "total": 6355931, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}
{"country": "Albania", "total": 2883079, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}
{"country": "Algeria", "total": 15205854, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}

I tried json_normalize, but it didn't work, so I'm wondering what went wrong. The code is:

import json
import requests
from sqlalchemy import Column, Integer, String, JSON

# API client method (the enclosing class is not shown)
def get_country_vaccines(self, last_days: int = 3, full_data = 'true') -> requests.Response:
    return requests.get(self.host + 'vaccine/coverage/countries', params={'lastdays': last_days, 'fullData': full_data})

# SQLAlchemy model (CovidBase is defined elsewhere)
class VaccineData(CovidBase):
    __tablename__ = 'vaccination_data'

    base_value = Column(Integer, primary_key=True)
    country = Column(String)
    timeline = Column(JSON)

    @classmethod
    def from_requests(cls, request: dict):
        return cls(
            country=request.get('country'),
            timeline=request.get('timeline')
        )

    def to_bigquery_row(self):
        return {
            'country': self.country,
            'timeline': self.timeline,
        }

# Script: write one JSON object per line
with open('covidinfo.json', 'w') as newfile:
    response = get_country_vaccines('1')
    for item in response.json():
        data = sq_models.VaccineData.from_requests(item)
        newfile.write(json.dumps(data.to_bigquery_row()))
        newfile.write('\n')
        input(data)

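How can I break up the information in timeline so that each field inside it becomes a separate top-level field? Sorry, I'm fairly new to Python, so I'm just looking for some help.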

You can treat the original JSON data as a Python dictionary and iterate over it:

d = {"country": "Afghanistan", "timeline": [{"total": 6355931, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]}
d1 = {"country": d["country"]}
# Copy each field of the first timeline entry to the top level
for i in d["timeline"][0].keys():
    d1[i] = d["timeline"][0][i]
print(d1)

Output: the final converted dictionary looks like this

{'country': 'Afghanistan', 'total': 6355931, 'daily': 0, 'totalPerHundred': 0, 'dailyPerMillion': 0, 'date': '6/24/22'}
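
To tie this back to the script in the question, here is a minimal sketch that flattens each record before writing it out as one JSON object per line. The helper names `flatten_record` and `write_json_lines` are hypothetical, not part of the original code, and the sketch assumes every record has a non-empty `timeline` list of which only the first entry matters. An in-memory buffer stands in for the output file so the example runs without touching the API or the disk.

```python
import io
import json


def flatten_record(record):
    """Merge the first timeline entry into the top level of the record."""
    # Keep every top-level field except the nested "timeline" list.
    flat = {key: value for key, value in record.items() if key != "timeline"}
    # Assumption: "timeline" exists and holds at least one entry.
    flat.update(record["timeline"][0])
    return flat


def write_json_lines(records, fileobj):
    """Write each flattened record as one JSON object per line."""
    for record in records:
        fileobj.write(json.dumps(flatten_record(record)))
        fileobj.write("\n")


records = [
    {"country": "Afghanistan", "timeline": [{"total": 6355931, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]},
    {"country": "Albania", "timeline": [{"total": 2883079, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]},
]

buffer = io.StringIO()  # stands in for open('covidinfo.json', 'w')
write_json_lines(records, buffer)
print(buffer.getvalue(), end="")
```

In the original script, the same idea would mean flattening `item` (or the dictionary returned by `to_bigquery_row()`) before passing it to `json.dumps`.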

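If you know your data structure well and don't want any magic involved, I would simply iterate over the original data and flatten the timeline entries.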

One possible approach (assuming the timeline array always contains exactly one element):

nested_data = [
    {"country": "Afghanistan", "timeline": [{"total": 6355931, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]},
    {"country": "Albania", "timeline": [{"total": 2883079, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]},
    {"country": "Algeria", "timeline": [{"total": 15205854, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"}]}
]
flat_data = []
for nested_obj in nested_data:
    flat_obj = {}
    for key, value in nested_obj.items():
        if key == "timeline":
            for timeline_key, timeline_value in nested_obj[key][0].items():
                flat_obj[timeline_key] = timeline_value
        else:
            flat_obj[key] = value
    flat_data.append(flat_obj)
print(flat_data)
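
If the single-element assumption does not hold (for example, when the API is asked for more than one day), a variation is to emit one flat row per timeline entry. The following is only a sketch: `flatten_timeline` is a hypothetical helper name, and the second timeline entry in the sample is invented purely for illustration.

```python
def flatten_timeline(records):
    """Yield one flat dict per (country, timeline entry) pair."""
    for record in records:
        # Top-level fields shared by every row for this country.
        base = {key: value for key, value in record.items() if key != "timeline"}
        # A missing or empty timeline simply produces no rows.
        for entry in record.get("timeline", []):
            yield {**base, **entry}


sample = [
    {
        "country": "Afghanistan",
        "timeline": [
            # First entry comes from the question; the second is made up.
            {"total": 6355931, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/24/22"},
            {"total": 6355931, "daily": 0, "totalPerHundred": 0, "dailyPerMillion": 0, "date": "6/25/22"},
        ],
    },
    {"country": "Albania", "timeline": []},
]

rows = list(flatten_timeline(sample))
for row in rows:
    print(row)
```

Each row repeats the country name, so the result can still be written as one JSON object per line.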
