Python pandas: group by user and count dates within the previous 7 days



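Starting from a pandas DataFrame, I am looking for an example of how to count, for each user, the rows whose date falls within the previous 7 days.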

UserID  Date (Y/M/D)
100     2021/02/15
100     2021/02/10
100     2021/02/8
101     2021/02/10
102     2021/02/15
103     2021/02/10

Use a custom lambda function:

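import pandas as pd
# convert the date column to datetimes
df['Date (Y/M/D)'] = pd.to_datetime(df['Date (Y/M/D)'])
# 7-day timedelta
t = pd.Timedelta(7, unit='d')
# for each date in a group, count the group's dates between 7 days earlier and that date (inclusive)
f = lambda x: x.apply(lambda y: x.between(y - t, y).sum())
# group_keys=False keeps the original row index so the result aligns with df
df['new'] = df.groupby('UserID', group_keys=False)['Date (Y/M/D)'].apply(f)
print(df)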
   UserID Date (Y/M/D)  new
0     100   2021-02-15    3
1     100   2021-02-10    2
2     100   2021-02-08    1
3     101   2021-02-10    1
4     102   2021-02-15    1
5     103   2021-02-10    1
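
The nested lambda compares every date in a group against every other date in that group. As a minimal sketch of the same inclusive 7-day count written with NumPy broadcasting, the following builds the sample frame and applies a helper per user. The helper name count_in_window is hypothetical, and the dates are zero-padded here for an explicit parse format; neither comes from the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question (dates zero-padded for an explicit format).
df = pd.DataFrame({
    "UserID": [100, 100, 100, 101, 102, 103],
    "Date (Y/M/D)": ["2021/02/15", "2021/02/10", "2021/02/08",
                     "2021/02/10", "2021/02/15", "2021/02/10"],
})
df["Date (Y/M/D)"] = pd.to_datetime(df["Date (Y/M/D)"], format="%Y/%m/%d")


def count_in_window(dates: pd.Series, days: int = 7) -> pd.Series:
    """Hypothetical helper: for each date, count the dates in the same
    group that lie between `days` days earlier and that date (inclusive)."""
    values = dates.to_numpy()
    # diff[i, j] = values[i] - values[j], as a timedelta64 matrix
    diff = values[:, None] - values[None, :]
    in_window = (diff >= np.timedelta64(0, "D")) & (diff <= np.timedelta64(days, "D"))
    return pd.Series(in_window.sum(axis=1), index=dates.index)


# group_keys=False keeps the original row index so the assignment aligns.
df["new"] = (
    df.groupby("UserID", group_keys=False)["Date (Y/M/D)"].apply(count_in_window)
)
print(df)
```

Both versions do work proportional to the square of the group size, so this is a readability alternative rather than a guaranteed speed-up.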

First, convert the date column from strings to datetimes (if you have not done so already):

df['Date (Y/M/D)'] = pd.to_datetime(df['Date (Y/M/D)'])

Then keep only the rows from the last 7 days:

df[df['Date (Y/M/D)'] >= pd.Timestamp.today().normalize() - pd.offsets.Day(7)]
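
Because the sample dates are from February 2021, a filter anchored to today's date would match none of them. The sketch below assumes instead that the window should end at the latest date present in the data; that anchoring choice and the names reference, cutoff and recent are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data from the question (dates zero-padded for an explicit format).
df = pd.DataFrame({
    "UserID": [100, 100, 100, 101, 102, 103],
    "Date (Y/M/D)": ["2021/02/15", "2021/02/10", "2021/02/08",
                     "2021/02/10", "2021/02/15", "2021/02/10"],
})
df["Date (Y/M/D)"] = pd.to_datetime(df["Date (Y/M/D)"], format="%Y/%m/%d")

# Assumption: anchor the window to the newest date in the data, not to today.
reference = df["Date (Y/M/D)"].max()
cutoff = reference - pd.Timedelta(days=7)

# Keep rows dated on or after the cutoff; assign the result to keep it.
recent = df[df["Date (Y/M/D)"] >= cutoff]
print(recent)
```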

To generate the Count column, run:

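import numpy as np
# per user, count down from the group size to 1 (rows assumed sorted newest first)
df['Count'] = df.groupby('UserID', group_keys=False).apply(
    lambda x: pd.Series(len(x) - np.arange(len(x)), x.index))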

The result is:

   UserID Date (Y/M/D)  Count
0     100   2021-02-15      3
1     100   2021-02-10      2
2     100   2021-02-08      1
3     101   2021-02-10      1
4     102   2021-02-15      1
5     103   2021-02-10      1
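
The expression len(x) - np.arange(len(x)) is a per-user countdown from the group size to 1. As a sketch, the same countdown can probably be written more directly with GroupBy.cumcount; this is an alternative spelling, not the answer's own code, and like the original it depends on row order within each user rather than on the actual gap between dates.

```python
import pandas as pd

# Sample data from the question (dates zero-padded for an explicit format).
df = pd.DataFrame({
    "UserID": [100, 100, 100, 101, 102, 103],
    "Date (Y/M/D)": ["2021/02/15", "2021/02/10", "2021/02/08",
                     "2021/02/10", "2021/02/15", "2021/02/10"],
})
df["Date (Y/M/D)"] = pd.to_datetime(df["Date (Y/M/D)"], format="%Y/%m/%d")

# cumcount(ascending=False) numbers each group's rows from size-1 down to 0;
# adding 1 turns that into a countdown from the group size to 1.
df["Count"] = df.groupby("UserID").cumcount(ascending=False) + 1
print(df)
```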
