Filter rows in a pandas DataFrame from the start of the current quarter to the end of the current month



I am trying to filter a DataFrame to find the quarter-to-date (QTD) rows. In the data below, my year starts in February, so when I say QTD, I mean:

Quarter Months
1       Feb, Mar, Apr
2       May, Jun, Jul
3       Aug, Sep, Oct
4       Nov, Dec, Jan

Sample DataFrame:
Quarter Month   Data    Value
1       1       A       100             
1       2       B       134             
1       3       C       145             
2       4       D       156             
2       5       E       167             
2       6       F       178             
3       7       G       123             
3       8       H       112             
3       9       I       187             
4       10      J       132             
4       11      K       109             
4       12      L       121             

For example, if the current month is September, the filtered data should contain only the rows for August and September.

I can identify the quarter with the function below, but it assumes the year starts in January:

def current_quarter(dt):
    # each entry is (quarter, year offset), indexed by the calendar quarter of dt (0-3)
    prev_quarter_map = ((4, -1), (1, 0), (2, 0), (3, 0))
    quarter, yd = prev_quarter_map[(dt.month - 1) // 3]
    return quarter

Is there a way to keep only the rows from the start of the current quarter through the end of the current month?

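The idea is to build a dictionary of quarters starting from February, map the months to quarters with Series.map, and then filter with boolean indexing against now, which is the current datetime's month converted to your quarter through the dictionary dq: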

import pandas as pd

q = [[2,3,4],[5,6,7],[8,9,10],[11,12,1]]
dq = {x: k for k, v in enumerate(q, 1) for x in v}
print (dq)
{2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 2, 8: 3, 9: 3, 10: 3, 11: 4, 12: 4, 1: 4}
now = dq[pd.to_datetime('now').month]
print (now)
3
df1 = df[df['Month'].map(dq) == now]
print (df1)
   Quarter  Month Data  Value
7        3      8    H    112
8        3      9    I    187
9        4     10    J    132
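The filter above keeps the whole quarter, so October is included even though the question asks for August through September only. A minimal sketch of one way to cut the quarter off at the current month follows. The helper name qtd_months is hypothetical, and the sketch assumes the Month column holds calendar month numbers and that each quarter's months are listed in fiscal order.

```python
import pandas as pd

# Sample data from the question; Month is treated as a calendar month number.
df = pd.DataFrame({
    "Quarter": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "Month": list(range(1, 13)),
    "Data": list("ABCDEFGHIJKL"),
    "Value": [100, 134, 145, 156, 167, 178, 123, 112, 187, 132, 109, 121],
})

# Fiscal quarters starting in February, months listed in fiscal order.
q = [[2, 3, 4], [5, 6, 7], [8, 9, 10], [11, 12, 1]]


def qtd_months(current_month):
    """Hypothetical helper: months from the quarter start through current_month."""
    for months in q:
        if current_month in months:
            # Keep the months up to and including the current one.
            return months[: months.index(current_month) + 1]
    raise ValueError(f"invalid month: {current_month}")


current_month = 9  # September, as in the question
df_qtd = df[df["Month"].isin(qtd_months(current_month))]
print(df_qtd)
```

Because the months are sliced in fiscal order, a January date yields November, December, and January rather than January alone. Like the dictionary approach, this does not distinguish years.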

If you need to filter by a different datetime:

import datetime

date = datetime.date(2015, 1, 13)
now = dq[date.month]
print (now)
4
df1 = df[df['Month'].map(dq) == now]
print (df1)
    Quarter  Month Data  Value
0         1      1    A    100
10        4     11    K    109
11        4     12    L    121
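The month-to-quarter dictionary can also be derived arithmetically rather than from a hand-written list. This is only a sketch: fiscal_quarter and its start_month parameter are hypothetical names, not part of pandas.

```python
def fiscal_quarter(month, start_month=2):
    """Quarter number (1-4) of a calendar month in a year that begins at start_month."""
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month: {month}")
    # Shift so that start_month becomes position 0, then group in threes.
    return ((month - start_month) % 12) // 3 + 1


# Rebuild the same mapping as the hand-written dictionary.
dq = {m: fiscal_quarter(m) for m in range(1, 13)}
print(dq)
```

The function can be passed directly to Series.map, for example df['Month'].map(fiscal_quarter), in place of the dictionary.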

EDIT: The solution above does not distinguish between years, so here is a new solution using tseries.offsets.QuarterBegin:

# add a Year column
df['Year'] = 2020
print (df)
    Quarter  Month Data  Value  Year
0         1      1    A    100  2020
1         1      2    B    134  2020
2         1      3    C    145  2020
3         2      4    D    156  2020
4         2      5    E    167  2020
5         2      6    F    178  2020
6         3      7    G    123  2020
7         3      8    H    112  2020
8         3      9    I    187  2020
9         4     10    J    132  2020
10        4     11    K    109  2020
11        4     12    L    121  2020
# build a datetime for the first day of each month, then roll it forward to a quarter start
df['Q'] = (pd.to_datetime(df[['Month','Year']].assign(Day=1)) +
           pd.offsets.QuarterBegin(0, startingMonth=2))
print (df)
    Quarter  Month Data  Value  Year          Q
0         1      1    A    100  2020 2020-02-01
1         1      2    B    134  2020 2020-02-01
2         1      3    C    145  2020 2020-05-01
3         2      4    D    156  2020 2020-05-01
4         2      5    E    167  2020 2020-05-01
5         2      6    F    178  2020 2020-08-01
6         3      7    G    123  2020 2020-08-01
7         3      8    H    112  2020 2020-08-01
8         3      9    I    187  2020 2020-11-01
9         4     10    J    132  2020 2020-11-01
10        4     11    K    109  2020 2020-11-01
11        4     12    L    121  2020 2021-02-01
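Note that QuarterBegin(0, startingMonth=2) rolls each date forward to the next quarter start, so in the output above March and April share a label with May instead of February. If the goal is to label each row with the quarter that contains it, one possible alternative is a fiscal-quarter Period. The sketch below assumes the 'Q-JAN' frequency (a fiscal year ending in January) matches the February-based quarters in the question; the FiscalQuarter and QStart column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Month": range(1, 13), "Year": 2020})

# First day of each month as a datetime.
dates = pd.to_datetime(df[["Year", "Month"]].assign(Day=1))

# 'Q-JAN' means the fiscal year ends in January, so quarters are
# Feb-Apr, May-Jul, Aug-Oct, and Nov-Jan.
periods = dates.dt.to_period("Q-JAN")

df["FiscalQuarter"] = periods.dt.quarter   # 1-4
df["QStart"] = periods.dt.start_time       # first day of the containing quarter
print(df)
```

With this labelling, January 2020 belongs to the quarter that started on 2019-11-01, which keeps quarters from different years apart.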

Then apply the same QuarterBegin offset to the datetime of interest and filter on it:

date = datetime.date(2020, 1, 13)
custom_q = (date + pd.offsets.QuarterBegin(0, startingMonth=2))
print (custom_q)
2020-02-01 00:00:00

df1 = df[df['Q'] == custom_q]
print (df1)
   Quarter  Month Data  Value  Year          Q
0        1      1    A    100  2020 2020-02-01
1        1      2    B    134  2020 2020-02-01
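To get the quarter-to-date rows the question asks for (quarter start through the end of the current month) while also respecting the year, one option is to compare real dates against a window. This is a rough sketch under the same assumptions as before: filter_qtd and the Date column are hypothetical names, and 'Q-JAN' is assumed to describe the February-based fiscal year.

```python
import datetime

import pandas as pd

df = pd.DataFrame({
    "Month": list(range(1, 13)),
    "Year": [2020] * 12,
    "Value": [100, 134, 145, 156, 167, 178, 123, 112, 187, 132, 109, 121],
})
# First day of each month as a datetime, used for the range comparison.
df["Date"] = pd.to_datetime(df[["Year", "Month"]].assign(Day=1))


def filter_qtd(frame, today):
    """Hypothetical helper: rows from the fiscal quarter start to the end of today's month."""
    today = pd.Timestamp(today)
    q_start = today.to_period("Q-JAN").start_time   # start of the containing quarter
    month_end = today.to_period("M").end_time       # last instant of the current month
    return frame[frame["Date"].between(q_start, month_end)]


df_qtd = filter_qtd(df, datetime.date(2020, 9, 15))
print(df_qtd)
```

For a date in September 2020 this keeps the August and September rows only. For a date in January 2020 it keeps only the January row from this sample, because November and December 2019 are not in the data.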
