I have a pandas DataFrame like the one below.
   id  A  B  C
0   1  1  1  1
1   1  5  7  2
2   2  6  9  3
3   3  1  5  4
4   3  4  6  2
After computing the conditions, it looks like this:
   id  A  B  C  a_greater_than_b  b_greater_than_c  c_greater_than_a
0   1  1  1  1             False             False             False
1   1  5  7  2             False              True             False
2   2  6  9  3             False              True             False
3   3  1  5  4             False              True              True
4   3  4  6  2             False              True             False
Once the conditions are computed, I want to aggregate the results for each id:
id  a_greater_than_b  b_greater_than_c  c_greater_than_a
 1             False             False             False
 2             False              True             False
 3             False              True             False
Group by id and use all to aggregate the condition columns:
agg_cols = ['a_greater_than_b', 'b_greater_than_c', 'c_greater_than_a']
res = df.groupby('id')[agg_cols].all()
Output:
>>> df
   id  A  B  C  a_greater_than_b  b_greater_than_c  c_greater_than_a
0   1  1  1  1             False             False             False
1   1  5  7  2             False              True             False
2   2  6  9  3             False              True             False
3   3  1  5  4             False              True              True
4   3  4  6  2             False              True             False
>>> res
    a_greater_than_b  b_greater_than_c  c_greater_than_a
id
1              False             False             False
2              False              True             False
3              False              True             False
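For completeness, here is a minimal self-contained sketch of the approach above. The snippet above does not show how df or the condition columns were created, so the dict-literal construction and the three comparison lines below are assumptions based on the data in the question.

```python
import pandas as pd

# Assumed construction of the example frame; the values mirror the question.
df = pd.DataFrame({
    'id': [1, 1, 2, 3, 3],
    'A':  [1, 5, 6, 1, 4],
    'B':  [1, 7, 9, 5, 6],
    'C':  [1, 2, 3, 4, 2],
})

# Vectorized element-wise comparisons produce boolean columns.
df['a_greater_than_b'] = df['A'] > df['B']
df['b_greater_than_c'] = df['B'] > df['C']
df['c_greater_than_a'] = df['C'] > df['A']

agg_cols = ['a_greater_than_b', 'b_greater_than_c', 'c_greater_than_a']
# all() is True for a group only when every row in that group is True.
res = df.groupby('id')[agg_cols].all()
print(res)
```

Swapping .all() for .any() would instead mark an id as True when at least one of its rows satisfies the condition.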
Use groupby on id with all as the aggregation function:
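import pandas as pd

txt = """
1 1 1 1
1 5 7 2
2 6 9 3
3 1 5 4
3 4 6 2
"""

# Build the frame row by row, converting each field to int so that the
# comparisons below are numeric rather than lexicographic string comparisons.
df = pd.DataFrame(columns=['id', 'A', 'B', 'C'])
for line in txt.split('\n'):
    if line.strip():
        df.loc[len(df)] = [int(x) for x in line.split()]
print(df)

# Row-wise boolean condition columns.
df['a_greater_than_b'] = df['A'] > df['B']
df['b_greater_than_c'] = df['B'] > df['C']
df['c_greater_than_a'] = df['C'] > df['A']

# A condition is True for an id only if it holds in every row of that id.
grouped = df.groupby('id').agg({'a_greater_than_b': 'all', 'b_greater_than_c': 'all', 'c_greater_than_a': 'all'})
print(grouped)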
Output:
    a_greater_than_b  b_greater_than_c  c_greater_than_a
id
1              False             False             False
2              False              True             False
3              False              True             False
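As an alternative to filling the frame line by line, here is a sketch that parses the same text with pd.read_csv and io.StringIO, which infers integer dtypes directly. This is a swapped-in loading technique, not what the answer above does. The assign() call and the dict comprehension are likewise just one possible way to write the remaining steps, and cond_cols is a name introduced here for illustration.

```python
import io

import pandas as pd

txt = """
1 1 1 1
1 5 7 2
2 6 9 3
3 1 5 4
3 4 6 2
"""

# Parse whitespace-separated rows; blank lines are skipped by default.
df = pd.read_csv(io.StringIO(txt), sep=r'\s+', header=None,
                 names=['id', 'A', 'B', 'C'])

# Add the three boolean condition columns in a single assign() call.
df = df.assign(
    a_greater_than_b=df['A'] > df['B'],
    b_greater_than_c=df['B'] > df['C'],
    c_greater_than_a=df['C'] > df['A'],
)

# cond_cols is an illustrative name, not taken from the answers.
cond_cols = ['a_greater_than_b', 'b_greater_than_c', 'c_greater_than_a']
grouped = df.groupby('id').agg({col: 'all' for col in cond_cols})
print(grouped)
```

Because the columns are parsed as integers, the comparisons stay numeric even when values have more than one digit.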