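My company uses a 5-4-4 calendar for reporting. Each month (also called a period) is 4 weeks long, except the first month of each quarter, which is 5 weeks long.
Pandas seems to have good support for custom calendar periods, and I also found this documentation.
I am using the following code to create the Period column from the week and year: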
dff['Period'] = pd.to_datetime(dff["Week ID"].astype("str") + "-0", format="%Y%U-%w").dt.month
dff['Period'] = 'Period ' + dff['Period'].astype(str).str.zfill(2)
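However, my company uses the 5-4-4 format for these periods, i.e.
- The first 5 weeks [Wk01 to Wk05] should be treated as Period 1
- The next 4 weeks [Wk06 to Wk09] should be treated as Period 2
- The next 4 weeks [Wk10 to Wk13] should be treated as Period 3
Then again:
- The next 5 weeks [Wk14 to Wk18] should be treated as Period 4
- The next 4 weeks [Wk19 to Wk22] should be treated as Period 5
- The next 4 weeks [Wk23 to Wk26] should be treated as Period 6
and so on.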
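The rule in the list above can be written as plain arithmetic. The following is a minimal sketch, assuming a 52-week year in which every block of 13 weeks follows the 5-4-4 split; the function name `week_to_period` is only illustrative and does not come from pandas.

```python
def week_to_period(week: int) -> int:
    """Map a 1-based week number to its 5-4-4 period number."""
    # Each quarter is 13 weeks long (5 + 4 + 4) and holds 3 periods.
    quarter, offset = divmod(week - 1, 13)
    if offset < 5:        # weeks 1-5 of the quarter
        within = 1
    elif offset < 9:      # weeks 6-9 of the quarter
        within = 2
    else:                 # weeks 10-13 of the quarter
        within = 3
    return quarter * 3 + within


print([week_to_period(w) for w in (1, 5, 6, 9, 10, 13, 14, 18, 19)])
# [1, 1, 2, 2, 3, 3, 4, 4, 5]
```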
Current output
Year  Period     Week  Week ID  Site
2020  Period 01  Wk01  202001   ABC
2020  Period 01  Wk02  202002   ABC
2020  Period 01  Wk03  202003   ABC
2020  Period 01  Wk04  202004   ABC
2020  Period 02  Wk05  202005   ABC
2020  Period 02  Wk06  202006   ABC
2020  Period 02  Wk07  202007   ABC
2020  Period 02  Wk08  202008   ABC
2020  Period 03  Wk09  202009   ABC
2020  Period 03  Wk10  202010   ABC
2020  Period 03  Wk11  202011   ABC
2020  Period 03  Wk12  202012   ABC
2020  Period 03  Wk13  202013   ABC
2020  Period 04  Wk14  202014   ABC
2020  Period 04  Wk15  202015   ABC
Expected output
Year  Period     Week  Week ID  Site
2020  Period 01  Wk01  202001   ABC
2020  Period 01  Wk02  202002   ABC
2020  Period 01  Wk03  202003   ABC
2020  Period 01  Wk04  202004   ABC
2020  Period 01  Wk05  202005   ABC
2020  Period 02  Wk06  202006   ABC
2020  Period 02  Wk07  202007   ABC
2020  Period 02  Wk08  202008   ABC
2020  Period 02  Wk09  202009   ABC
2020  Period 03  Wk10  202010   ABC
2020  Period 03  Wk11  202011   ABC
2020  Period 03  Wk12  202012   ABC
2020  Period 03  Wk13  202013   ABC
2020  Period 04  Wk14  202014   ABC
2020  Period 04  Wk15  202015   ABC
You can use a bit of math to compute a mapper:
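import numpy as np
import pandas as pd
# define week pattern (5-4-4, repeating every 13 weeks)
pat = np.array([5, 4, 4])
# weeks
weeks = np.arange(52) + 1
# get quotient/remainder relative to one full pattern cycle (= divmod by 13 here)
d, r = np.divmod(weeks - 1, pat.sum())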
# determine periods
a = (r==pat.cumsum()[:,None]).cumsum(axis=1).sum(axis=0)+d+1
# define mapper
week_to_period = dict(enumerate(a, start=1))
# {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2, 9: 2, 10: 3, 11: 3,...}
# map week numbers to custom periods
s = (pd.to_datetime(dff["Week ID"].astype("str") + "-0", format="%Y%U-%w")
       .dt.isocalendar().week  # `week` is a column of the result, not a method
    )
dff['Period'] = 'Period '+s.map(week_to_period).astype(str).str.zfill(2)
Output:
    Year  Period     Week  Week ID  Site
0   2020  Period 01  Wk01  202001   ABC
1   2020  Period 01  Wk02  202002   ABC
2   2020  Period 01  Wk03  202003   ABC
3   2020  Period 01  Wk04  202004   ABC
4   2020  Period 01  Wk05  202005   ABC
5   2020  Period 02  Wk06  202006   ABC
6   2020  Period 02  Wk07  202007   ABC
7   2020  Period 02  Wk08  202008   ABC
8   2020  Period 02  Wk09  202009   ABC
9   2020  Period 03  Wk10  202010   ABC
10  2020  Period 03  Wk11  202011   ABC
11  2020  Period 03  Wk12  202012   ABC
12  2020  Period 03  Wk13  202013   ABC
13  2020  Period 04  Wk14  202014   ABC
14  2020  Period 04  Wk15  202015   ABC
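A possible variation, sketched here under the assumption that `Week ID` always has the form YYYYWW, is to read the week number straight from its last two digits rather than converting to a date and back. The round trip through `%U` and ISO weeks can shift the number in some years: in 2021, for example, the first Sunday is January 3, which still belongs to ISO week 53 of 2020. The lookup below swaps in `np.searchsorted` over the cumulative period ends; the sample frame and the name `period_ends` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; only the "Week ID" column (YYYYWW) is needed.
dff = pd.DataFrame({"Week ID": [202001, 202005, 202006, 202009, 202010,
                                202013, 202014, 202018, 202019, 202052]})

# Last week of each period in a 52-week year: 5, 9, 13, 18, 22, 26, ...
period_ends = np.cumsum([5, 4, 4] * 4)

# The week number is the last two digits of the Week ID.
week = dff["Week ID"].astype(int) % 100

# side="left" returns the index of the first period whose last week is
# >= the given week; adding 1 makes the period number 1-based.
period = np.searchsorted(period_ends, week.to_numpy(), side="left") + 1

dff["Period"] = ("Period "
                 + pd.Series(period, index=dff.index).astype(str).str.zfill(2))
print(dff)
```

This sketch does not handle a 53rd week; such a row would receive period 13 and would need a rule of its own.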
import numpy as np
import pandas as pd
# Define range of months
x = list(range(1,13))
# Define week pattern
y = [5, 4, 4]*4
period1 = list(np.repeat(x, y))
# Zero-pad the week labels so they match the values in dff['Week'] (e.g. 'Wk01')
wk = [f'Wk{i:02d}' for i in range(1, 53)]
di_wk = dict(zip(wk, period1))
'''{'Wk01': 1, 'Wk02': 1, 'Wk03': 1, 'Wk04': 1, 'Wk05': 1, 'Wk06': 2, 'Wk07': 2, 'Wk08': 2,
'Wk09': 2, 'Wk10': 3, 'Wk11': 3, 'Wk12': 3, 'Wk13': 3, 'Wk14': 4, 'Wk15': 4,......}'''
dff['Period'] = 'Period ' + dff['Week'].map(di_wk).astype(str).str.zfill(2)
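To try the `np.repeat` approach end to end, here is a self-contained sketch. The sample frame is hypothetical and only mimics the shape of the question's table (weeks 1 to 15 of 2020), and it assumes the `Week` labels are zero-padded as shown there.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data shaped like the question's table.
weeks = range(1, 16)
dff = pd.DataFrame({
    "Year": 2020,
    "Week": [f"Wk{w:02d}" for w in weeks],
    "Week ID": [2020 * 100 + w for w in weeks],
    "Site": "ABC",
})

# 12 periods per year, each repeated per the 5-4-4 pattern (52 weeks in total).
periods = np.repeat(np.arange(1, 13), [5, 4, 4] * 4)

# Keys are zero-padded so that they match labels such as "Wk01".
di_wk = {f"Wk{w:02d}": int(p) for w, p in enumerate(periods, start=1)}

dff["Period"] = "Period " + dff["Week"].map(di_wk).astype(str).str.zfill(2)
print(dff[["Year", "Period", "Week", "Week ID", "Site"]].to_string(index=False))
```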