Is there a way to parameterize a pandas groupby by passing hard-coded lists of column names?
group_by_cols = "id","week_number"
aggregate_cols = "col1","col2","col3"
df = pd.read_csv(input_file_name)
df_total = df.groupby([group_by_cols])[aggregate_cols].sum()
Is this possible?
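If you want to pass lists, remove the extra [] around group_by_cols in the groupby call, because [group_by_cols] creates a nested list:
# the columns are now defined as lists (note the added [])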
group_by_cols = ["id","week_number"]
aggregate_cols = ["col1","col2","col3"]
print (type(group_by_cols))
<class 'list'>
df = pd.read_csv(input_file_name)
df_total = df.groupby(group_by_cols)[aggregate_cols].sum()
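To illustrate why the extra brackets matter, here is a minimal sketch. The small DataFrame is made up for the example, and the KeyError is what recent pandas versions are expected to raise; older releases may behave differently.

```python
import pandas as pd

# Made-up sample data, only for illustration.
df = pd.DataFrame({
    "id": list("aabb"),
    "week_number": [4, 5, 5, 5],
    "col1": [7, 8, 2, 3],
})

group_by_cols = "id", "week_number"  # a tuple, as in the question

# Wrapping the tuple in [] gives a list holding one tuple, so pandas
# looks for a single column labelled ('id', 'week_number').
nested = [group_by_cols]
print(nested)  # [('id', 'week_number')]

try:
    df.groupby(nested)["col1"].sum()
    failed = False
except KeyError as exc:
    failed = True
    print("KeyError:", exc)

# A flat list of column labels is what groupby expects.
flat = list(group_by_cols)
result = df.groupby(flat)["col1"].sum()
print(result)
```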
Alternatively, if the inputs are tuples, convert them to lists. Comma-separated values without brackets are tuples:
group_by_cols = "id","week_number"
aggregate_cols = "col1","col2","col3"
This is the same as writing the tuples with parentheses:
group_by_cols = ("id","week_number")
aggregate_cols = ("col1","col2","col3")
print (type(group_by_cols))
<class 'tuple'>
df = pd.read_csv(input_file_name)
df_total = df.groupby(list(group_by_cols))[list(aggregate_cols)].sum()
Test with sample data:
df = pd.DataFrame({
'id':list('aaaabb'),
'week_number':[4,5,4,5,5,5],
'col1':[7,8,9,4,2,3],
'col2':[1,3,5,7,1,0],
'col3':[5,3,6,9,2,4],
'col4':[4,3,3,0,3,9]
})
group_by_cols = ["id","week_number"]
aggregate_cols = ["col1","col2","col3"]
df_total = df.groupby(group_by_cols)[aggregate_cols].sum()
print (df_total)
                col1  col2  col3
id week_number
a  4              16     6    11
   5              12    10    12
b  5               5     1     6
group_by_cols = "id","week_number"
aggregate_cols = "col1","col2","col3"
df_total = df.groupby(list(group_by_cols))[list(aggregate_cols)].sum()
print (df_total)
                col1  col2  col3
id week_number
a  4              16     6    11
   5              12    10    12
b  5               5     1     6
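Both variants can be folded into one function if the column names come from configuration. The following is only a sketch: the helper name sum_by is hypothetical and not part of pandas, and it assumes every column name is a string.

```python
import pandas as pd


def sum_by(df, group_by_cols, aggregate_cols):
    """Hypothetical helper: group df by some columns and sum others."""
    # A bare string is a single column name; wrap it so that list()
    # below does not split it into characters.
    if isinstance(group_by_cols, str):
        group_by_cols = [group_by_cols]
    if isinstance(aggregate_cols, str):
        aggregate_cols = [aggregate_cols]
    # list() copies a list and converts a tuple to a list.
    return df.groupby(list(group_by_cols))[list(aggregate_cols)].sum()


df = pd.DataFrame({
    "id": list("aaaabb"),
    "week_number": [4, 5, 4, 5, 5, 5],
    "col1": [7, 8, 9, 4, 2, 3],
    "col2": [1, 3, 5, 7, 1, 0],
    "col3": [5, 3, 6, 9, 2, 4],
    "col4": [4, 3, 3, 0, 3, 9],
})

total_from_lists = sum_by(df, ["id", "week_number"], ["col1", "col2", "col3"])
total_from_tuples = sum_by(df, ("id", "week_number"), ("col1", "col2", "col3"))
print(total_from_lists.equals(total_from_tuples))  # True
```

Normalizing inside the function means callers can pass a list, a tuple, or a single column name without remembering which form groupby accepts.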