I have a DataFrame with three date fields: purchaseDate, releaseDate, and ceaseDate. A sample of the DataFrame is shown below.
Product purchaseDate releaseDate ceaseDate
ABC 20/12/2020 01/01/2021 02/01/2022
ZXC 15/01/2021 05/01/2021 02/01/2022
QWE 29/03/2021 06/01/2021 02/01/2022
ASD 13/04/2021 07/01/2021 02/01/2022
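For anyone who wants to reproduce this, the sample can be built by hand. This is only a sketch: it assumes the dates are stored as day-first strings, which the table suggests but does not confirm.

```python
import pandas as pd

# Sample rows from the table above; the dates are kept as
# day-first strings (an assumption about how they are stored).
df = pd.DataFrame({
    "Product": ["ABC", "ZXC", "QWE", "ASD"],
    "purchaseDate": ["20/12/2020", "15/01/2021", "29/03/2021", "13/04/2021"],
    "releaseDate": ["01/01/2021", "05/01/2021", "06/01/2021", "07/01/2021"],
    "ceaseDate": ["02/01/2022"] * 4,
})
print(df)
```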
If purchaseDate falls between releaseDate and ceaseDate, the new column status should be filled with Active. If purchaseDate falls outside those two dates, it should show Inactive. The desired output is shown below.
Product purchaseDate releaseDate ceaseDate status
ABC 20/12/2020 01/01/2021 02/01/2022 Inactive
ZXC 04/01/2021 05/01/2021 02/01/2022 Inactive
QWE 29/03/2021 06/01/2021 02/01/2022 Active
ASD 13/04/2021 07/01/2021 02/01/2022 Active
Any help would be greatly appreciated.
Convert the date columns to the datetime type and use the between function:
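import pandas as pd
# Parse every column whose name contains "Date" (day-first strings such as 20/12/2020)
date_columns = df.filter(regex='Date').columns
df[date_columns] = df[date_columns].apply(pd.to_datetime, format='%d/%m/%Y')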
Use np.where to fill in values based on the condition:
in_between = df.purchaseDate.between(df.releaseDate, df.ceaseDate)
df['status'] = np.where(in_between, 'Active', 'Inactive')
print(df)
Output:
Product purchaseDate releaseDate ceaseDate status
0 ABC 2020-12-20 2021-01-01 2022-01-02 Inactive
1 ZXC 2021-01-15 2021-01-05 2022-01-02 Active
2 QWE 2021-03-29 2021-01-06 2022-01-02 Active
3 ASD 2021-04-13 2021-01-07 2022-01-02 Active
Note: don't forget to import numpy as np.
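Putting the steps above together, here is a self-contained sketch. The DataFrame construction is an assumption (the question only shows the printed table), and the dates are assumed to be day-first strings.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the sample data from the question.
df = pd.DataFrame({
    "Product": ["ABC", "ZXC", "QWE", "ASD"],
    "purchaseDate": ["20/12/2020", "15/01/2021", "29/03/2021", "13/04/2021"],
    "releaseDate": ["01/01/2021", "05/01/2021", "06/01/2021", "07/01/2021"],
    "ceaseDate": ["02/01/2022"] * 4,
})

# Convert every column whose name contains "Date" to datetime64.
date_columns = df.filter(regex="Date").columns
df[date_columns] = df[date_columns].apply(pd.to_datetime, format="%d/%m/%Y")

# between() compares row by row and includes both endpoints by default.
in_between = df["purchaseDate"].between(df["releaseDate"], df["ceaseDate"])
df["status"] = np.where(in_between, "Active", "Inactive")

print(df["status"].tolist())  # ['Inactive', 'Active', 'Active', 'Active']
```

Passing an explicit format avoids pandas guessing whether 05/01/2021 means 5 January or 1 May.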
Convert to datetime:
df = (df.assign(**df.filter(like='Date')
.transform(pd.to_datetime, format="%d/%m/%Y"))
)
Product purchaseDate releaseDate ceaseDate
0 ABC 2020-12-20 2021-01-01 2022-01-02
1 ZXC 2021-01-15 2021-01-05 2022-01-02
2 QWE 2021-03-29 2021-01-06 2022-01-02
3 ASD 2021-04-13 2021-01-07 2022-01-02
Use the between function and map the Boolean output to Active and Inactive:
(df.assign(status = df.purchaseDate.between(df.releaseDate, df.ceaseDate)
.map({True:"Active", False:"Inactive"}))
)
Product purchaseDate releaseDate ceaseDate status
0 ABC 2020-12-20 2021-01-01 2022-01-02 Inactive
1 ZXC 2021-01-15 2021-01-05 2022-01-02 Active
2 QWE 2021-03-29 2021-01-06 2022-01-02 Active
3 ASD 2021-04-13 2021-01-07 2022-01-02 Active
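One detail worth checking is how boundary dates are treated: between includes both endpoints by default. The sketch below uses made-up rows (not from the question) whose purchaseDate sits exactly on releaseDate or ceaseDate, and assumes pandas 1.3 or later, where the inclusive parameter accepts the strings "both", "neither", "left" and "right".

```python
import pandas as pd

# Hypothetical rows: the first two purchase dates fall exactly on a boundary.
df = pd.DataFrame({
    "purchaseDate": pd.to_datetime(
        ["01/01/2021", "02/01/2022", "15/06/2021"], format="%d/%m/%Y"),
    "releaseDate": pd.to_datetime(["01/01/2021"] * 3, format="%d/%m/%Y"),
    "ceaseDate": pd.to_datetime(["02/01/2022"] * 3, format="%d/%m/%Y"),
})

labels = {True: "Active", False: "Inactive"}

# Default behaviour: both endpoints count as "in between".
inclusive_status = (df.purchaseDate
                      .between(df.releaseDate, df.ceaseDate)
                      .map(labels))

# Strict comparison: a purchase on the release or cease date is excluded.
exclusive_status = (df.purchaseDate
                      .between(df.releaseDate, df.ceaseDate, inclusive="neither")
                      .map(labels))

print(inclusive_status.tolist())  # ['Active', 'Active', 'Active']
print(exclusive_status.tolist())  # ['Inactive', 'Inactive', 'Active']
```

Which variant is right depends on whether a purchase made on the release or cease date should count as Active, which the question does not specify.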