Conditional date filter



I am trying to split a series of timestamps into several groups:

Variable definitions:

Very old = Date < '20190101'
Current = Today's date as %Y-%m (Year-Month)

Conditions

1. timestamp < very old
2. Very old < timestamp < current
3. timestamp = current
4. timestamp > current

The series, isolated from the original dataframe:

timestamp_dict = 
{0: Timestamp('2019-05-01 00:00:00'),
 1: Timestamp('2019-05-01 00:00:00'),
 2: Timestamp('2018-12-01 00:00:00'),
 3: Timestamp('2019-05-01 00:00:00'),
 4: Timestamp('2019-05-01 00:00:00'),
 5: Timestamp('2019-05-01 00:00:00'),
 6: Timestamp('2019-04-01 00:00:00'),
 7: Timestamp('2019-08-01 00:00:00')}

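The datetimes are stored as datetime64[ns]. I have a feeling that converting the current timestamp to str is wrong, but I am not sure how to extract the current timestamp in the %Y-%m format.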

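I had the idea of reading the current date as month and year integers and then concatenating them, but then I could run into zero-padding problems: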

_month = dt.datetime.today().month
_year = dt.datetime.today().year
# Would run into zero padding for months 1-9:
current = str(_year) + str(_month)  

Here I try to use np.select to generate a new dataframe column and specify the required conditions.

import datetime as dt
import numpy as np

current = dt.datetime.today().strftime('%Y-%m')
veryold = '20190101'
conditions = [
    df.Delivery < veryold,
    (df.Delivery >= veryold) | (df.Delivery < current),
    df.Delivery == current,
    df.Delivery > current
]
outcome = [
    'Very old',
    'Old',
    'Current',
    'Future'
]
df['New'] = np.select(conditions, outcome)
df.New

My expected output is an additional column in my dataframe holding the labelled results.

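The idea is to create month periods with Series.dt.to_period so that the comparison happens at YYYY-MM granularity:

import numpy as np
import pandas as pd

# Build the frame first so the conditions below can refer to df.Delivery
df = pd.Series(timestamp_dict).to_frame('Delivery')

current = pd.Timestamp.today().to_period('M')
veryold = pd.Timestamp('20190101')
conditions = [
    df.Delivery < veryold,
    # '&' rather than '|': "Old" has to satisfy both bounds
    (df.Delivery >= veryold) & (df.Delivery.dt.to_period('M') < current),
    df.Delivery.dt.to_period('M') == current,
    df.Delivery.dt.to_period('M') > current]
outcome = [
    'Very old',
    'Old',
    'Current',
    'Future'
]
# A string default avoids mixing the labels with np.select's integer default of 0
df['New'] = np.select(conditions, outcome, default='Unknown')
# The labels depend on the run date; any run after August 2019 gives the output below
print(df)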
    Delivery       New
0 2019-05-01       Old
1 2019-05-01       Old
2 2018-12-01  Very old
3 2019-05-01       Old
4 2019-05-01       Old
5 2019-05-01       Old
6 2019-04-01       Old
7 2019-08-01       Old
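
To check all four labels without depending on today's date, here is a minimal sketch with the reference month pinned. The helper name `label_deliveries`, the pinned month 2019-05, and the `'Unknown'` fallback are assumptions made for illustration only. The two bounds for "Old" are combined with `&`, so a row must be both on or after the cutoff and before the reference month.

```python
import numpy as np
import pandas as pd


def label_deliveries(delivery, current, very_old):
    """Label each timestamp relative to a cutoff date and a reference month.

    Hypothetical helper: `current` is a monthly pd.Period, `very_old` a pd.Timestamp.
    """
    month = delivery.dt.to_period("M")  # compare at YYYY-MM granularity
    conditions = [
        delivery < very_old,
        (delivery >= very_old) & (month < current),
        month == current,
        month > current,
    ]
    labels = ["Very old", "Old", "Current", "Future"]
    # The default is a string so that it shares a dtype with the labels.
    return np.select(conditions, labels, default="Unknown")


df = pd.DataFrame({"Delivery": pd.to_datetime(
    ["2019-05-01", "2018-12-01", "2019-04-01", "2019-08-01"])})

# Assumption: pretend "today" falls in May 2019 so that every label appears.
df["New"] = label_deliveries(
    df["Delivery"],
    current=pd.Period("2019-05", freq="M"),
    very_old=pd.Timestamp("2019-01-01"),
)
print(df)
```

With the month pinned this way, the four rows come out as Current, Very old, Old and Future respectively. In real use, `pd.Timestamp.today().to_period('M')` would replace the pinned period.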
