Pandas DataFrame中用于圆形封装的特定嵌套字典



我正试图在Pandas条件下从DataFrame创建一个特定的嵌套字典,以便可视化。

dat = pd.DataFrame({'cat_1' : ['marketing', 'marketing', 'marketing', 'communications'],
'child_cat' : ['marketing', 'social media', 'marketing', 'communications],
'skill' : ['digital marketing','media marketing','research','seo'],
'value' : ['80', '101', '35', '31']

我想把它变成一本字典,看起来有点像:

{
"name": "general skills",
"children": [
{
"name": "marketing",
"children": [
{
"name": "marketing",
"children": [
{
"name": "digital marketing",
"value": 80
},
{
"name": "research",
"value": 35
}
]
},
{
"name": "social media", // notice that this is a sibling of the parent marketing
"children": [
{
"name": "media marketing",
"value": 101
}
]
}
]
},
{
"name": "communications",
"children": [
{
"name": "communications",
"children": [
{
"name": "seo",
"value": 31
}
]
}
]
}
]
}

因此,cat_1是父节点,child_cat是其子节点,而skill也是其子节点。我在创建其他children列表时遇到问题。有什么帮助吗?

由于效率低下,我提出了这个解决方案。可能高度次优

final = {}
# control dict to get only one broad category
contrl_dict = {}
contrl_dict['dummy'] = None
final['name'] = 'variants'
final['children'] = []
# line is the values of each row
for idx, line in enumerate(df_dict.values):
# parent categories dict
broad_dict_1 = {}
print(line)
# this takes every value of the row minus the value in the end
for jdx, col in enumerate(line[:-1]):
# look into the broad category first
if jdx == 0:
# check in our control dict - does this category exist? if not add it and continue
if not col in contrl_dict.keys():
# if it doesn't it appends it
contrl_dict[col] = 'added'
# then the broad dict parent takes the name
broad_dict_1['name'] = col
# the children are the children broad categories which will be populated further
broad_dict_1['children'] = []
# go to broad categories 2
for ydx, broad_2 in enumerate(list(df_dict[df_dict.broad_categories == col].broad_2.unique())):
# sub categories dict
prov_dict = {}
prov_dict['name'] = broad_2
# children is again a list
prov_dict['children'] = []
# now isolate the skills and values of each broad_2 category and append them
for row in df_dict[df_dict.broad_2 == broad_2].values:
prov_d_3 = {}
# go to each row
for xdx, direct in enumerate(row):
# in each row, values 2 and 3 are name and value respectively add them
if xdx == 2:
prov_d_3['name'] = direct
if xdx == 3:
prov_d_3['size'] = direct
prov_dict['children'].append(prov_d_3)

broad_dict_1['children'].append(prov_dict)

# if it already exists in the control dict then it moves on
else:
continue
final['children'].append(broad_dict_1)

最新更新