I want to export a pandas DataFrame to nested JSON so that it can be ingested into MongoDB.
Here is a sample of the data:
import pandas as pd

data = {
    'product_id': ['a001', 'a001', 'a001'],
    'product': ['aluminium', 'aluminium', 'aluminium'],
    'production_id': ['b001', 'b002', 'b002'],
    'production_name': ['metallurgical', 'recycle', 'recycle'],
    'geo_name': ['US', 'EU', 'RoW'],
    'value': [100, 200, 200]
}
df = pd.DataFrame(data=data)
product_id | product | production_id | production_name | geo_name | value
---|---|---|---|---|---
a001 | aluminium | b001 | metallurgical | US | 100
a001 | aluminium | b002 | recycle | EU | 200
a001 | aluminium | b002 | recycle | RoW | 200
The simplest solution I found, which works for any number of nesting levels (2 in this example):
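# The chain is wrapped in parentheses so it can span multiple lines.
json_extract = (
    df
    # Inner level: collapse each production's rows into a list of geo records.
    .groupby(['product_id', 'product', 'production_id', 'production_name'])
    .apply(lambda x: x[['geo_name', 'value']].to_dict('records'))
    .reset_index(name='geos')
    # Outer level: collapse each product's productions into a list of records.
    .groupby(['product_id', 'product'])
    .apply(lambda x: x[['production_id', 'production_name', 'geos']].to_dict('records'))
    .reset_index(name='production')
    .to_json(orient='records')
)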
The resulting JSON, pretty-printed:
[
{
"product_id": "a001",
"product": "aluminium",
"production": [
{
"production_id": "b001",
"production_name": "metallurgical",
"geos": [
{
"geo_name": "US",
"value": 100
}
]
},
{
"production_id": "b002",
"production_name": "recycle",
"geos": [
{
"geo_name": "EU",
"value": 200
},
{
"geo_name": "RoW",
"value": 200
}
]
}
]
}
]
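
Since the same groupby / apply / reset_index step repeats once per nesting level, it can be wrapped in a loop. The following is only a minimal sketch of that idea, not a definitive implementation: the helper name `nest_records` and its parameters `levels` and `leaf_cols` are hypothetical names introduced here for illustration. It assumes every column listed exists in the DataFrame and that the levels are given from outermost to innermost.

```python
import json

import pandas as pd


def nest_records(df, levels, leaf_cols):
    """Nest a flat DataFrame into records, one groupby pass per level.

    levels: list of (group_columns, child_key) pairs, outermost first
            (hypothetical parameter layout, chosen for this sketch).
    leaf_cols: columns that make up the innermost records.
    """
    result = df
    cols = list(leaf_cols)
    # Walk from the innermost level outward, as in the chained version.
    for i in range(len(levels) - 1, -1, -1):
        # Group by this level's columns plus those of every outer level.
        keys = [c for group_cols, _ in levels[: i + 1] for c in group_cols]
        child_key = levels[i][1]
        result = (
            result.groupby(keys)
            .apply(lambda x, cols=cols: x[cols].to_dict('records'))
            .reset_index(name=child_key)
        )
        # The next (outer) pass keeps this level's columns and its new list.
        cols = list(levels[i][0]) + [child_key]
    return result


df = pd.DataFrame({
    'product_id': ['a001', 'a001', 'a001'],
    'product': ['aluminium', 'aluminium', 'aluminium'],
    'production_id': ['b001', 'b002', 'b002'],
    'production_name': ['metallurgical', 'recycle', 'recycle'],
    'geo_name': ['US', 'EU', 'RoW'],
    'value': [100, 200, 200],
})

nested = nest_records(
    df,
    levels=[
        (['product_id', 'product'], 'production'),
        (['production_id', 'production_name'], 'geos'),
    ],
    leaf_cols=['geo_name', 'value'],
)

# Round-trip through JSON to get plain Python dicts and lists.
docs = json.loads(nested.to_json(orient='records'))
```

Here `docs` is a list of plain dictionaries, one per product, which is the shape a MongoDB driver typically expects when inserting several documents at once. Recent pandas versions may emit a deprecation warning about grouping columns in `groupby().apply()`; the sketch only selects non-grouping columns inside the lambda, so the result should not depend on that behaviour.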