I have this pandas DataFrame:
                  date  pid   value             interval
0  2021-09-05 00:04:24    1   5.554  2021-09-05 00:00:00
1  2021-09-05 00:06:38    1   4.359  2021-09-05 00:05:00
2  2021-09-05 00:06:46    1  18.364  2021-09-05 00:05:00
3  2021-09-05 00:04:24    2  15.554  2021-09-05 00:00:00
4  2021-09-05 00:06:38    2   3.359  2021-09-05 00:05:00
5  2021-09-05 00:06:46    2  10.364  2021-09-05 00:05:00
I want to convert it to JSON, like this:
{
  "2021-09-05 00:00:00": {
    "pid1": [
      {
        "date": "2021-09-05 00:04:24",
        "pid": 1,
        "value": 5.554
      }
    ],
    "pid2": [
      {
        "date": "2021-09-05 00:04:24",
        "pid": 2,
        "value": 15.554
      }
    ]
  },
  "2021-09-05 00:05:00": {
    "pid1": [
      {
        "date": "2021-09-05 00:06:38",
        "pid": 1,
        "value": 4.359
      },
      {
        "date": "2021-09-05 00:06:46",
        "pid": 1,
        "value": 18.364
      }
    ],
    "pid2": [
      {
        "date": "2021-09-05 00:06:38",
        "pid": 2,
        "value": 3.359
      },
      {
        "date": "2021-09-05 00:06:46",
        "pid": 2,
        "value": 10.364
      }
    ]
  }
}
Basically, I want to group the data by the interval value. Is there a quick way to format it like this?
Create a helper column from pid, convert the result to a Series with a MultiIndex, and finally build the nested dictionary:
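# Helper column "new" holds the labels pid1, pid2, ...
s = (df.assign(new='pid' + df['pid'].astype(str))
       .groupby(['interval', 'new'])[['date', 'pid', 'value']]
       # each (interval, new) group becomes a list of row dicts
       .apply(lambda x: x.to_dict(orient='records')))
# One inner dict per first-level (interval) value
d = {level: s.xs(level).to_dict() for level in s.index.levels[0]}
print(d)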
{
'2021-09-05 00:00:00': {
'pid1': [{
'date': '2021-09-05 00:04:24',
'pid': 1,
'value': 5.554
}],
'pid2': [{
'date': '2021-09-05 00:04:24',
'pid': 2,
'value': 15.554
}]
},
'2021-09-05 00:05:00': {
'pid1': [{
'date': '2021-09-05 00:06:38',
'pid': 1,
'value': 4.359
},
{
'date': '2021-09-05 00:06:46',
'pid': 1,
'value': 18.364
}
],
'pid2': [{
'date': '2021-09-05 00:06:38',
'pid': 2,
'value': 3.359
},
{
'date': '2021-09-05 00:06:46',
'pid': 2,
'value': 10.364
}
]
}
}
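As an alternative sketch (not the method used above), the same nested dictionary can probably be built without a helper column by iterating over two nested groupby calls. The sample frame below is reconstructed from the question and assumes that date and interval are stored as plain strings.

```python
import pandas as pd

# Sample data from the question; date and interval are assumed to be strings.
df = pd.DataFrame({
    "date": ["2021-09-05 00:04:24", "2021-09-05 00:06:38", "2021-09-05 00:06:46"] * 2,
    "pid": [1, 1, 1, 2, 2, 2],
    "value": [5.554, 4.359, 18.364, 15.554, 3.359, 10.364],
    "interval": ["2021-09-05 00:00:00", "2021-09-05 00:05:00", "2021-09-05 00:05:00"] * 2,
})

# Outer loop: one entry per interval. Inner loop: one entry per pid in that interval.
d = {
    interval: {
        f"pid{pid}": rows[["date", "pid", "value"]].to_dict(orient="records")
        for pid, rows in by_interval.groupby("pid")
    }
    for interval, by_interval in df.groupby("interval")
}
print(d)
```

This trades the single chained expression for two explicit loops, which may be easier to read and to extend if more nesting levels are needed.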
Finally, use:
import json
json_str = json.dumps(d)  # a separate name, so the json module is not shadowed
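One caveat: json.dumps raises a TypeError on pandas Timestamp objects, so the approach above relies on date and interval being strings. The sketch below shows one way to handle datetime columns by formatting them as strings first. The function name frame_to_nested and the 5-minute flooring used to derive interval are assumptions made for illustration; they are not part of the question.

```python
import json
import pandas as pd

df = pd.DataFrame({
    "date": ["2021-09-05 00:04:24", "2021-09-05 00:06:38", "2021-09-05 00:06:46"] * 2,
    "pid": [1, 1, 1, 2, 2, 2],
    "value": [5.554, 4.359, 18.364, 15.554, 3.359, 10.364],
})
# Assumption: the columns hold real datetimes, and interval is date floored to 5 minutes.
df["date"] = pd.to_datetime(df["date"])
df["interval"] = df["date"].dt.floor("5min")

def frame_to_nested(df, fmt="%Y-%m-%d %H:%M:%S"):
    """Hypothetical helper: build {interval: {pidN: [row dicts]}} with string timestamps."""
    out = df.assign(
        date=df["date"].dt.strftime(fmt),          # Timestamps are not JSON serializable
        interval=df["interval"].dt.strftime(fmt),
        new="pid" + df["pid"].astype(str),         # helper labels pid1, pid2, ...
    )
    s = (out.groupby(["interval", "new"])[["date", "pid", "value"]]
            .apply(lambda x: x.to_dict(orient="records")))
    return {level: s.xs(level).to_dict() for level in s.index.levels[0]}

json_str = json.dumps(frame_to_nested(df), indent=2)
print(json_str)
```

Passing indent=2 is optional; it only makes the printed JSON easier to compare with the desired output.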