Nested dictionary from a DataFrame, where the inner dictionary holds a pandas Series as its value



I'm having trouble creating a nested dictionary in which the inner level needs to hold a Series of values.

Here is a simple DataFrame:

import pandas as pd
import random
catGrpAll = ['Category_A']*3 + ['Category_B']*3
catGrpAll = catGrpAll*4
codeGrpAll = ['code1','code2','code3']
codeGrpAll = codeGrpAll*8
dateGrpAll = ([pd.to_datetime('2021-03-31')]*6 + [pd.to_datetime('2021-04-30')]*6 +
              [pd.to_datetime('2021-05-31')]*6 + [pd.to_datetime('2021-06-30')]*6)
random.seed(0)
numAll = [ random.randint(100, 5000) for _ in range(24)]

df = pd.DataFrame(data={'Category':catGrpAll,
                        'Code':codeGrpAll,
                        'Time':dateGrpAll,
                        'Amount':numAll})
del catGrpAll,codeGrpAll,dateGrpAll,numAll

#   Column    Non-Null Count  Dtype         
---  ------    --------------  -----         
0   Category  24 non-null     object        
1   Code      24 non-null     object        
2   Time      24 non-null     datetime64[ns]
3   Amount    24 non-null     int64 
df.head()
Out[294]: 
     Category   Code       Time  Amount
0  Category_A  code1 2021-03-31    3255
1  Category_A  code2 2021-03-31    3545
2  Category_A  code3 2021-03-31     431
3  Category_B  code1 2021-03-31    2221
4  Category_B  code2 2021-03-31    4288

I would like a result like this: the outer key-value pair is Category -> Code, and the inner dictionary is Code -> Series.

nested_dict = {
    'Category_A': [
        { 'code1': Series(Time/Amount),
          'code2': Series(Time/Amount),
          'code3': Series(Time/Amount) }
    ],
    'Category_B': [
        { 'code1': Series(Time/Amount),
          'code2': Series(Time/Amount),
          'code3': Series(Time/Amount) }
    ]
}

Any help is much appreciated.

######################## UPDATE ########################
Here is an example of how I would like the dictionary to look, but I'm wondering whether there is a way to avoid the loops?

data = {}
category = df.Category.unique()
code = df.Code.unique()
for i in category:
    data[i] = {}
    for j in code:
        data[i][j] = []

for i in category:
    for j in code:
        data[i][j] = df[(df.Category == i) & (df.Code == j)]
        data[i][j].index = data[i][j]['Time']
        data[i][j] = data[i][j]['Amount']

I don't see any built-in function that gives the output you want. The closest is df.to_dict(orient='index').

You can build the result dictionary manually:
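
To illustrate why the built-in conversion falls short, here is a small sketch, assuming the method meant is `DataFrame.to_dict(orient='index')`. The frame `small` and its values are made up for the example. The result is nested by row label and column name, not by Category and Code:

```python
import pandas as pd

# Tiny made-up frame, only to show the shape of the result.
small = pd.DataFrame({
    'Category': ['Category_A', 'Category_B'],
    'Code': ['code1', 'code1'],
    'Amount': [10, 20],
})

# Outer keys are the row labels (0, 1); each inner dict maps column name -> scalar.
by_row = small.to_dict(orient='index')
print(by_row)
```

So the nesting is there, but the keys are not the ones the question asks for, and the leaves are scalars rather than Series.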

from collections import defaultdict
result = defaultdict(list)
for category, group in df.groupby('Category'):
    result[category].append({
        # Series of Amount indexed by Time, as requested in the question
        code: subgroup.set_index('Time')['Amount']
        for code, subgroup in group.groupby('Code')
    })
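
If the list wrapper around the inner dictionary is not needed (the loop in the question's update does not use one), a possible variant is to group on both columns at once. This is only a sketch: it still iterates, but in a single pass over the groups instead of nested loops with boolean filtering. The small `df` below is a made-up stand-in with the same column names as the question's frame, and `nested` is a name chosen for the example.

```python
import pandas as pd

# Made-up stand-in for the question's df (same columns, fewer rows).
df = pd.DataFrame({
    'Category': ['Category_A', 'Category_A', 'Category_B', 'Category_B'] * 2,
    'Code': ['code1', 'code2', 'code1', 'code2'] * 2,
    'Time': [pd.Timestamp('2021-03-31')] * 4 + [pd.Timestamp('2021-04-30')] * 4,
    'Amount': [1, 2, 3, 4, 5, 6, 7, 8],
})

nested = {}
# Move Time into the index first, so each group's Amount Series is indexed by Time.
grouped = df.set_index('Time').groupby(['Category', 'Code'])['Amount']
for (category, code), amounts in grouped:
    # amounts is a pandas Series of Amount values for this (Category, Code) pair.
    nested.setdefault(category, {})[code] = amounts

print(nested['Category_A']['code1'])
```

With this layout, `nested['Category_A']['code1']` is a Series directly, rather than the first element of a one-item list.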
