Parsing JSON with numbers as keys using pandas



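My question is similar to this one (parsing JSON with numbers as keys) and this one (parsing nested JSON), except that I want to parse my JSON file with pandas json_normalize. Ultimately, I want to write this DataFrame to an Excel file (I already have code for that). The DataFrame I want in memory should look like this: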

Timestamp  BatteryVoltage GridCurrent GridVoltage InverterCurrent InverterVoltage 
....
....

My existing code doesn't help at all:

import json
import datetime
import pandas as pd

# Load the whole file into a nested dict
with open('test.json') as data_file:
    data = json.load(data_file)

df = pd.json_normalize(data['timestamp'])

I know I have to pass something as the record_path argument to json_normalize, but I'm not sure what it should be, since the timestamp keys keep changing.

{"timestamp": {
"1636987025": {
"batteryVoltage": 28.74732,
"gridCurrent": 3.68084,
"gridVoltage": 230.64401,
"inverterCurrent": 2.00471,
"inverterVoltage": 224.18573,
"solarCurrent": 0,
"solarVoltage": 0,
"tValue": 1636987008
},
"1636987085": {
"batteryVoltage": 28.52959,
"gridCurrent": 3.40046,
"gridVoltage": 230.41367,
"inverterCurrent": 1.76206,
"inverterVoltage": 225.24319,
"solarCurrent": 0,
"solarVoltage": 0,
"tValue": 1636987136
},
"1636987146": {
"batteryVoltage": 28.5338,
"gridCurrent": 3.37573,
"gridVoltage": 229.27209,
"inverterCurrent": 2.11128,
"inverterVoltage": 225.51733,
"solarCurrent": 0,
"solarVoltage": 0,
"tValue": 1636987136
},
"1636987206": {
"batteryVoltage": 28.55535,
"gridCurrent": 3.43365,
"gridVoltage": 229.47604,
"inverterCurrent": 1.98594,
"inverterVoltage": 225.83649,
"solarCurrent": 0,
"solarVoltage": 0,
"tValue": 1636987264
}
}
}

pandas' json_normalize is designed for JSON/dicts that contain embedded lists; your data has none.
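To see why json_normalize does not give the desired shape here, a minimal sketch on a trimmed copy of the sample data (two timestamps with two fields each; the variable name flat is only illustrative) suggests that it treats the whole dict as a single record and flattens the nested keys into dotted column names:

```python
import pandas as pd

# Trimmed copy of the sample data (assumption: same structure as test.json)
data = {
    "timestamp": {
        "1636987025": {"batteryVoltage": 28.74732, "gridCurrent": 3.68084},
        "1636987085": {"batteryVoltage": 28.52959, "gridCurrent": 3.40046},
    }
}

# With no lists to expand, the dict is treated as one record
flat = pd.json_normalize(data["timestamp"])

print(flat.shape)          # one row, one column per timestamp/field pair
print(list(flat.columns))  # dotted names such as '1636987025.batteryVoltage'
```

So the timestamps end up inside the column names rather than as rows, which is why the options below build the rows directly from the dict.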

One option is jmespath:

# pip install jmespath
import jmespath

# "timestamp.*" selects every value under the "timestamp" key
expression = jmespath.compile("timestamp.*")
pd.DataFrame(expression.search(data))
   batteryVoltage  gridCurrent  gridVoltage  inverterCurrent  inverterVoltage  solarCurrent  solarVoltage      tValue
0        28.74732      3.68084    230.64401          2.00471        224.18573             0             0  1636987008
1        28.52959      3.40046    230.41367          1.76206        225.24319             0             0  1636987136
2        28.53380      3.37573    229.27209          2.11128        225.51733             0             0  1636987136
3        28.55535      3.43365    229.47604          1.98594        225.83649             0             0  1636987264

Have a look at the jmespath documentation; it's a great tool for working with JSON.

Another option, without jmespath, in pure Python:

# The keys are not needed here, so take the values directly
result = list(data['timestamp'].values())
pd.DataFrame(result)
   batteryVoltage  gridCurrent  gridVoltage  inverterCurrent  inverterVoltage  solarCurrent  solarVoltage      tValue
0        28.74732      3.68084    230.64401          2.00471        224.18573             0             0  1636987008
1        28.52959      3.40046    230.41367          1.76206        225.24319             0             0  1636987136
2        28.53380      3.37573    229.27209          2.11128        225.51733             0             0  1636987136
3        28.55535      3.43365    229.47604          1.98594        225.83649             0             0  1636987264

Capturing the timestamp is easy:

# on Python 3.9+, you can use | for the dictionary merge instead
result = [{**value, 'timestamp': key} for key, value in data['timestamp'].items()]
pd.DataFrame(result)
   batteryVoltage  gridCurrent  gridVoltage  inverterCurrent  inverterVoltage  solarCurrent  solarVoltage      tValue   timestamp
0        28.74732      3.68084    230.64401          2.00471        224.18573             0             0  1636987008  1636987025
1        28.52959      3.40046    230.41367          1.76206        225.24319             0             0  1636987136  1636987085
2        28.53380      3.37573    229.27209          2.11128        225.51733             0             0  1636987136  1636987146
3        28.55535      3.43365    229.47604          1.98594        225.83649             0             0  1636987264  1636987206
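For reference, the | variant mentioned in the code comment could look like the sketch below. It assumes Python 3.9 or later and uses a trimmed copy of the sample data; note that the timestamp stays a string, because JSON object keys are always strings.

```python
# Trimmed copy of the sample data (assumption: same structure as test.json)
data = {
    "timestamp": {
        "1636987025": {"batteryVoltage": 28.74732, "gridCurrent": 3.68084},
        "1636987085": {"batteryVoltage": 28.52959, "gridCurrent": 3.40046},
    }
}

# dict | dict (Python 3.9+) returns a new merged dict and leaves both inputs unchanged
result = [value | {"timestamp": key} for key, value in data["timestamp"].items()]

print(result)
```

The resulting list of dicts can be passed to pd.DataFrame exactly as before.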

Another option:

pd.DataFrame.from_dict(data['timestamp'], orient='index').rename_axis('timestamp').reset_index()
    timestamp  batteryVoltage  gridCurrent  gridVoltage  inverterCurrent  inverterVoltage  solarCurrent  solarVoltage      tValue
0  1636987025        28.74732      3.68084    230.64401          2.00471        224.18573             0             0  1636987008
1  1636987085        28.52959      3.40046    230.41367          1.76206        225.24319             0             0  1636987136
2  1636987146        28.53380      3.37573    229.27209          2.11128        225.51733             0             0  1636987136
3  1636987206        28.55535      3.43365    229.47604          1.98594        225.83649             0             0  1636987264
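To get closer to the layout asked for in the question (a Timestamp column followed by capitalised field names), one possible follow-up is sketched below. It assumes the keys are Unix epoch seconds, which the question does not state explicitly, and it uses a trimmed copy of the sample data.

```python
import pandas as pd

# Trimmed copy of the sample data (assumption: same structure as test.json)
data = {
    "timestamp": {
        "1636987025": {"batteryVoltage": 28.74732, "gridCurrent": 3.68084},
        "1636987085": {"batteryVoltage": 28.52959, "gridCurrent": 3.40046},
    }
}

df = (
    pd.DataFrame.from_dict(data["timestamp"], orient="index")
    .rename_axis("timestamp")
    .reset_index()
)

# JSON keys are strings: cast to integers, then read them as epoch seconds
# (assumption: the keys are Unix timestamps in seconds, UTC)
df["timestamp"] = pd.to_datetime(df["timestamp"].astype("int64"), unit="s")

# Capitalise the first letter of each column name to match the requested headers
df.columns = [name[0].upper() + name[1:] for name in df.columns]

print(df)
```

From there, something like df.to_excel("output.xlsx", index=False) should write the sheet; the file name is hypothetical, and pandas needs an Excel engine such as openpyxl installed for that call.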
