小贝子编程

Create a new column based on the cumulative occurrences of a specific value in another column in pandas

Updated: 2023-09-22

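I want to count how many times a specific value (a string) occurs in one column and record the running count in another column.

For example, the cumulative count of Y values: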

col_1  new_col
Y        1
Y        2
N        2
Y        3
N        3

I wrote this code, but it gives me the final total rather than the cumulative frequency.

df['new_col'] = 0
df['new_col'] = df.loc[df.col_1 == 'Y'].count()
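A minimal sketch of why this attempt cannot produce a running count, assuming a small sample frame built only for illustration: filtering and then calling count() is an aggregation, so it yields a single total per column instead of one value per row.

```python
import pandas as pd

# Sample data assumed for illustration; it mirrors the example in the question.
df = pd.DataFrame({"col_1": ["Y", "Y", "N", "Y", "N"]})

# count() aggregates the filtered rows: the result is a Series indexed by
# column name holding one total per column, not a per-row running count.
totals = df.loc[df.col_1 == "Y"].count()
print(totals)
```

A running count needs a cumulative operation (one result per row) rather than an aggregation.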

To count both values cumulatively (a running count per value, followed by a running maximum), you can use:

df['new_col'] = (df
    .groupby('col_1')
    .cumcount().add(1)
    .cummax()
)
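The chain above can be traced step by step. The sketch below assumes the same sample data as the question, plus a second hypothetical frame (`other`) to show that the result is the running maximum over all values, which matches the count of "Y" only when "Y" stays ahead.

```python
import pandas as pd

# Sample data assumed for illustration; it mirrors the example in the question.
df = pd.DataFrame({"col_1": ["Y", "Y", "N", "Y", "N"]})

# Step 1: running count of each value within its own group (1-based).
per_value = df.groupby("col_1").cumcount().add(1)
print(per_value.tolist())    # [1, 2, 1, 3, 2]

# Step 2: running maximum across all rows.
running_max = per_value.cummax()
print(running_max.tolist())  # [1, 2, 2, 3, 3]

# Hypothetical counter-example: when "N" leads, the result tracks "N".
other = pd.DataFrame({"col_1": ["N", "N", "Y"]})
other_max = other.groupby("col_1").cumcount().add(1).cummax()
print(other_max.tolist())    # [1, 2, 2]
```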

If you only want to track "Y":

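df['new_col'] = (df
    .groupby('col_1')
    .cumcount().add(1)              # running count within each value
    .where(df['col_1'].eq('Y'))     # keep the counts on 'Y' rows, NaN elsewhere
    .ffill()                        # carry the last 'Y' count onto the other rows
    .fillna(0)                      # rows before the first 'Y' count as 0
    .astype(int)                    # back to integers after the NaN round trip
)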

Output:

  col_1  new_col
0     Y        1
1     Y        2
2     N        2
3     Y        3
4     N        3
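As a self-contained sketch of the "Y"-only approach, the example below uses a hypothetical column that starts with "N", so the role of the final fill with 0 is visible. The sample data and the variable name `new_col` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data that starts with "N", so the first row has no "Y" yet.
df = pd.DataFrame({"col_1": ["N", "Y", "N", "Y"]})

new_col = (
    df.groupby("col_1").cumcount().add(1)  # per-value running count: 1, 1, 2, 2
    .where(df["col_1"].eq("Y"))            # NaN, 1, NaN, 2
    .ffill()                               # NaN, 1, 1, 2
    .fillna(0)                             # 0, 1, 1, 2
    .astype(int)                           # convert the floats back to integers
)
print(new_col.tolist())  # [0, 1, 1, 2]
```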

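A simpler alternative is to compare the column with "Y" and take the cumulative sum of the resulting booleans. Note that assign returns a new DataFrame rather than modifying df in place:

df.assign(new_col=df.col_1.eq("Y").cumsum())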

Output:

  col_1  new_col
0     Y        1
1     Y        2
2     N        2
3     Y        3
4     N        3
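The boolean cumulative sum generalizes to any target value. The following is a minimal sketch; the helper `running_count` and the column names `y_count` and `n_count` are hypothetical names introduced here, not part of pandas or of the original answers.

```python
import pandas as pd


def running_count(series: pd.Series, value) -> pd.Series:
    """Return, for each row, how many times `value` has appeared so far."""
    # eq() gives True/False per row; cumsum() treats True as 1 and False as 0.
    return series.eq(value).cumsum()


# Sample data assumed for illustration; it mirrors the example in the question.
df = pd.DataFrame({"col_1": ["Y", "Y", "N", "Y", "N"]})
df = df.assign(
    y_count=running_count(df["col_1"], "Y"),
    n_count=running_count(df["col_1"], "N"),
)
print(df)
```

Because the comparison is made against one value at a time, rows before the first match get 0 without any extra filling step.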
