Python Pandas -将数据帧转换为json



我有这个pandas.dataframe:

date.               pid value   interval
0   2021-09-05 00:04:24 1       5.554   2021-09-05 00:00:00
1   2021-09-05 00:06:38 1       4.359   2021-09-05 00:05:00
2   2021-09-05 00:06:46 1       18.364  2021-09-05 00:05:00
3   2021-09-05 00:04:24 2       15.554  2021-09-05 00:00:00
4   2021-09-05 00:06:38 2       3.359   2021-09-05 00:05:00
5   2021-09-05 00:06:46 2       10.364  2021-09-05 00:05:00

我想把它变成JSON,像这样:

{
"2021-09-05 00:00:00": {
"pid1": [
{
"date": "2021-09-05 00:04:24",
"pid": 1,
"value": 5.554,
},
],
"pid2": [
{
"date": "2021-09-05 00:04:24",
"pid": 2,
"value": 15.554,
}
],
},
"2021-09-05 00:05:00": {
"pid1": [
{
"date": "2021-09-05 00:04:24",
"pid": 1,
"value": 4.359,
},
{
"date": "2021-09-05 00:04:24",
"pid": 1,
"value": 18.364,
},
],
"pid2": [
{
"date": "2021-09-05 00:06:38",
"pid": 2,
"value": 3.359,
},{
"date": "2021-09-05 00:06:46",
"pid": 1,
"value": 10.364,
},
],
}
}

基本上我想按interval值分组数据。是否有一种快速的方法来格式化它?

pid创建辅助列,转换为MultiIndex Series和最后一个crate嵌套字典:

s = (df.assign(new = 'pid' + df['pid'].astype(str))
.groupby(['interval','new'])[['date','pid','value']]
.apply(lambda x : x.to_dict(orient= 'records')))
d = {level: s.xs(level).to_dict() for level in s.index.levels[0]}

print (d)
{
'2021-09-05 00:00:00': {
'pid1': [{
'date': '2021-09-05 00:04:24',
'pid': 1,
'value': 5.554
}],
'pid2': [{
'date': '2021-09-05 00:04:24',
'pid': 2,
'value': 15.554
}]
},
'2021-09-05 00:05:00': {
'pid1': [{
'date': '2021-09-05 00:06:38',
'pid': 1,
'value': 4.359
},
{
'date': '2021-09-05 00:06:46',
'pid': 1,
'value': 18.364
}
],
'pid2': [{
'date': '2021-09-05 00:06:38',
'pid': 2,
'value': 3.359
},
{
'date': '2021-09-05 00:06:46',
'pid': 2,
'value': 10.364
}
]
}
}

最后一次使用:

import json
json = json.dumps(d)

最新更新