How to convert a dictionary with multiple key-value pairs into a DataFrame



I tried to clean my data with the following code:

empty = {}
mess = lophoc_clean.query("lop_diemquatrinh.notnull()")[['lop_id', 'lop_diemquatrinh']]
keys = []
values = []
for index, rows in mess.iterrows():
    if len(rows['lop_diemquatrinh']) > 4:
        values.append(rows['lop_diemquatrinh'])
        keys.append(rows['lop_id'])
df = pd.DataFrame(dict(zip(keys, values)), index=[0]).transpose()
df.columns = ['data']

The result, shown as a dictionary, looks like this:

{'data': {37: '[{"date_update":"31-03-2022","diemquatrinh":"6.0"}]',
          38: '[{"date_update":"11-03-2022","diemquatrinh":"6.25"}]',
          44: '[{"date_update":"25-12-2021","diemquatrinh":"6.0"},{"date_update":"28-04-2022","diemquatrinh":"6.25"},{"date_update":"28-07-2022","diemquatrinh":"6.5"}]',
          1095: '[{"date_update":null,"diemquatrinh":null}]'}}

However, I don't know how to turn this into a three-column DataFrame like the one below. Please help. Thank you!

id    updated_at    diemquatrinh
38    11-03-2022    6.25
44    25-12-2021    6.0
44    28-04-2022    6.25
44    28-07-2022    6.5

Here you go:

from json import loads
from pprint import pp

import pandas as pd


def get_example_data():
    # One record per (id, update); a missing key becomes NaN in the DataFrame.
    return [
        dict(id=38, updated_at="2022-03-11", diemquatrinh=6.25),
        dict(id=44, updated_at="2021-12-25", diemquatrinh=6),
        dict(id=44, updated_at="2022-04-28", diemquatrinh=6.25),
        dict(id=1095, updated_at=None),
    ]


df = pd.DataFrame(get_example_data())
df["updated_at"] = pd.to_datetime(df["updated_at"])
print(df.dtypes, "\n")
pp(loads(df.to_json()))  # column-oriented JSON
print()
print(df, "\n")
pp(loads(df.to_json(orient="records")))  # row-oriented JSON

The output is as follows:

id                       int64
updated_at      datetime64[ns]
diemquatrinh           float64
dtype: object

{'id': {'0': 38, '1': 44, '2': 44, '3': 1095},
 'updated_at': {'0': 1646956800000,
                '1': 1640390400000,
                '2': 1651104000000,
                '3': None},
 'diemquatrinh': {'0': 6.25, '1': 6.0, '2': 6.25, '3': None}}

     id updated_at  diemquatrinh
0    38 2022-03-11          6.25
1    44 2021-12-25          6.00
2    44 2022-04-28          6.25
3  1095        NaT           NaN

[{'id': 38, 'updated_at': 1646956800000, 'diemquatrinh': 6.25},
 {'id': 44, 'updated_at': 1640390400000, 'diemquatrinh': 6.0},
 {'id': 44, 'updated_at': 1651104000000, 'diemquatrinh': 6.25},
 {'id': 1095, 'updated_at': None, 'diemquatrinh': None}]

Either of these JSON data structures is acceptable input for creating a new DataFrame from scratch.
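
To connect this to the data in the question: each value there is a JSON string holding a list of updates, so one way to get to the row-oriented structure is to parse each string with json.loads and emit one record per update. The following is only a sketch, assuming every string has the shape shown in the question; the names raw and records are illustrative and do not come from the original code.

```python
import json

import pandas as pd

# Sample copied from the question: lop_id -> JSON string with a list of updates.
raw = {
    37: '[{"date_update":"31-03-2022","diemquatrinh":"6.0"}]',
    38: '[{"date_update":"11-03-2022","diemquatrinh":"6.25"}]',
    44: '[{"date_update":"25-12-2021","diemquatrinh":"6.0"},'
        '{"date_update":"28-04-2022","diemquatrinh":"6.25"},'
        '{"date_update":"28-07-2022","diemquatrinh":"6.5"}]',
    1095: '[{"date_update":null,"diemquatrinh":null}]',
}

# Flatten to one record per (id, update); JSON null becomes None.
records = [
    {
        "id": lop_id,
        "updated_at": item["date_update"],
        "diemquatrinh": item["diemquatrinh"],
    }
    for lop_id, text in raw.items()
    for item in json.loads(text)
]

df = pd.DataFrame(records)
# The dates are day-month-year strings and the scores are numeric strings.
df["updated_at"] = pd.to_datetime(df["updated_at"], format="%d-%m-%Y")
df["diemquatrinh"] = pd.to_numeric(df["diemquatrinh"])
print(df)
```

If the rows with no update (such as id 1095) are not wanted, df.dropna(subset=["updated_at"]) should remove them.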
