将Pandas Dataframe转换为表记录的嵌套JSON



我正在使用Python和Pandas。试图将Pandas数据框转换为嵌套JSON。函数。to_json()没有给我足够的灵活性来实现我的目标。

以下是数据帧的一些数据点(CSV格式,逗号分隔):

Hotel_id,Room_id,Client_id,Loayalty_level,Price
1,100,1,Default,100
1,100,2,Default,98
1,101,1,Default,200
1,101,1,Discounted,196
1,101,2,Default,202
1,101,3,Default,204

有很多重复的信息,我希望有一个这样的JSON:

{
"hotelId": 1,
"rooms": [
{
"roomId": 100,
"prices": [
{
"clientId": 1,
"price": {
"default": 100
}
},
{
"clientId": 2,
"price": {
"default": 98
}
}
]
},
{
"roomId": 101,
"prices": [
{
"clientId": 1,
"price": {
"default": 200,
"discounted": 196
}
},
{
"clientId": 2,
"price": {
"default": 202
}
},
{
"clientId": 3,
"price": {
"default": 204
}
}
]
}
]
}

如何实现这一点?

看一下convtools库,它为数据处理提供了很多原语。以下是针对您的情况的解决方案:

import json
from convtools import conversion as c
from convtools.contrib.tables import Table

input_data = [
("Hotel_id", "Room_id", "Client_id", "Loayalty_level", "Price"),
("1", "100", "1", "Default", "100"),
("1", "100", "2", "Default", "98"),
("1", "101", "1", "Default", "200"),
("1", "101", "1", "Discounted", "196"),
("1", "101", "2", "Default", "202"),
("1", "101", "3", "Default", "204"),
]
# if reading from csv is needed
# rows = Table.from_csv("tmp/input.csv", header=True).into_iter_rows(tuple)
# convert to list of dicts
rows = list(Table.from_rows(input_data, header=True).into_iter_rows(dict))
# generate the converter (store somewhere and reuse, because this is where
# code-generation happens)
converter = (
c.group_by(c.item("Hotel_id"))
.aggregate(
{
"hotelId": c.item("Hotel_id").as_type(int),
"rooms": c.ReduceFuncs.Array(c.this()).pipe(
c.group_by(c.item("Room_id")).aggregate(
{
"roomId": c.item("Room_id").as_type(int),
"prices": c.ReduceFuncs.Array(c.this()).pipe(
c.group_by(c.item("Client_id")).aggregate(
{
"clientId": c.item("Client_id").as_type(
int
),
"price": c.ReduceFuncs.DictFirst(
c.item("Loayalty_level"),
c.item("Price").as_type(float),
),
}
)
),
}
)
),
}
)
.gen_converter()
)
print(json.dumps(converter(rows)))

输出为:

[
{
"hotelId": 1,
"rooms": [
{
"roomId": 100,
"prices": [
{ "clientId": 1, "price": { "Default": 100.0 } },
{ "clientId": 2, "price": { "Default": 98.0 } }
]
},
{
"roomId": 101,
"prices": [
{ "clientId": 1, "price": { "Default": 200.0, "Discounted": 196.0 } },
{ "clientId": 2, "price": { "Default": 202.0 } },
{ "clientId": 3, "price": { "Default": 204.0 } }
]
}
]
}
]

注:注意c.ReduceFuncs.DictFirst部分,这是每个忠诚度级别的第一个价格,您可能希望将其更改为DictLast/DictMax/DictMin/DictArray

最新更新