Export a Pandas DataFrame to nested JSON (multiple levels of nesting)



I want to export a Pandas DataFrame to nested JSON so that it can be ingested into MongoDB.

Here is a sample of the data:

import pandas as pd

data = {
    'product_id': ['a001', 'a001', 'a001'],
    'product': ['aluminium', 'aluminium', 'aluminium'],
    'production_id': ['b001', 'b002', 'b002'],
    'production_name': ['metallurgical', 'recycle', 'recycle'],
    'geo_name': ['US', 'EU', 'RoW'],
    'value': [100, 200, 200]
}
df = pd.DataFrame(data=data)

  product_id    product production_id production_name geo_name  value
0       a001  aluminium          b001   metallurgical       US    100
1       a001  aluminium          b002         recycle       EU    200
2       a001  aluminium          b002         recycle      RoW    200

The simplest solution I found, which works for any number of nesting levels (2 in this example):

json_extract = (
    df
    # Inner level: collect the geo records for each production
    .groupby(['product_id', 'product', 'production_id', 'production_name'])
    .apply(lambda x: x[['geo_name', 'value']].to_dict('records'))
    .reset_index(name='geos')
    # Outer level: collect the production records for each product
    .groupby(['product_id', 'product'])
    .apply(lambda x: x[['production_id', 'production_name', 'geos']].to_dict('records'))
    .reset_index(name='production')
    .to_json(orient='records')
)

Result:

[
  {
    "product_id": "a001",
    "product": "aluminium",
    "production": [
      {
        "production_id": "b001",
        "production_name": "metallurgical",
        "geos": [
          {
            "geo_name": "US",
            "value": 100
          }
        ]
      },
      {
        "production_id": "b002",
        "production_name": "recycle",
        "geos": [
          {
            "geo_name": "EU",
            "value": 200
          },
          {
            "geo_name": "RoW",
            "value": 200
          }
        ]
      }
    ]
  }
]
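
The groupby/apply pattern has to be repeated once per nesting level. As a rough sketch of how it could be generalized, the helper below recurses over a list of levels instead. The function name `nest_records` and the `levels` structure are hypothetical names introduced here for illustration; they are not part of pandas. It returns a list of plain dicts, which is also the shape MongoDB drivers usually expect for a batch of documents.

```python
import json
import pandas as pd

def nest_records(frame, levels):
    """Nest `frame` according to `levels` (hypothetical helper, not a pandas API).

    `levels` is a list of (group_columns, child_key) pairs, outermost first.
    Columns not consumed by any level become the innermost records.
    """
    if not levels:
        return frame.to_dict('records')
    keys, child_key = levels[0]
    records = []
    for key_values, group in frame.groupby(keys, sort=False):
        # Older pandas versions may yield a scalar for a single grouping column
        if not isinstance(key_values, tuple):
            key_values = (key_values,)
        record = dict(zip(keys, key_values))
        # Recurse on the remaining columns for the next nesting level
        record[child_key] = nest_records(group.drop(columns=keys), levels[1:])
        records.append(record)
    return records

df = pd.DataFrame({
    'product_id': ['a001', 'a001', 'a001'],
    'product': ['aluminium', 'aluminium', 'aluminium'],
    'production_id': ['b001', 'b002', 'b002'],
    'production_name': ['metallurgical', 'recycle', 'recycle'],
    'geo_name': ['US', 'EU', 'RoW'],
    'value': [100, 200, 200],
})

levels = [
    (['product_id', 'product'], 'production'),
    (['production_id', 'production_name'], 'geos'),
]
docs = nest_records(df, levels)

# `default` is a fallback for NumPy scalars, in case any are left in the records
json_extract = json.dumps(docs, indent=2, default=lambda o: o.item())
print(json_extract)
```

Adding a third level would only mean adding another `(columns, key)` pair to `levels`, rather than another groupby/apply/reset_index step.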
