In pandas, given a DatetimeIndex containing only business-day rows, how do I determine whether a row is the start or the end of the week?



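I have a set of stock data with the datetime set as the index. The stock market is only open on business days, so all of my rows are business days, which is fine. I want to determine whether a row is the start of the week or the end of the week; because of holidays, that will not always be Monday/Friday. A better idea would be to check whether the DataFrame has a row for the next/previous day (since my data is guaranteed to exist only on business days), but I don't know how to compute that. Here is a sample of my data: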

date    day_of_week day_of_month    day_of_year month_of_year
5/1/2017    0   1   121 5
5/2/2017    1   2   122 5
5/3/2017    2   3   123 5
5/4/2017    3   4   124 5
5/8/2017    0   8   128 5
5/9/2017    1   9   129 5
5/10/2017   2   10  130 5
5/11/2017   3   11  131 5
5/12/2017   4   12  132 5
5/15/2017   0   15  135 5
5/16/2017   1   16  136 5
5/17/2017   2   17  137 5
5/18/2017   3   18  138 5
5/19/2017   4   19  139 5
5/23/2017   1   23  143 5
5/24/2017   2   24  144 5
5/25/2017   3   25  145 5
5/26/2017   4   26  146 5
5/30/2017   1   30  150 5

Here is my current code:

# Date fields
def DateFields(df_input):
    dates = df_input.index.to_series()
    df_input['day_of_week'] = dates.dt.dayofweek
    df_input['day_of_month'] = dates.dt.day
    df_input['day_of_year'] = dates.dt.dayofyear
    df_input['month_of_year'] = dates.dt.month
    df_input['isWeekStart'] = "No" #<--- Need help here
    df_input['isWeekEnd'] = "No" #<--- Need help here
    df_input['date'] = dates.dt.strftime('%Y-%m-%d')
    return df_input

How can I compute whether a row is the start or the end of the week?

An example of what I am looking for:

date    day_of_week day_of_month    day_of_year month_of_year isWeekStart isWeekEnd
5/1/2017    0   1   121 5   1   0
5/2/2017    1   2   122 5   0   0
5/3/2017    2   3   123 5   0   0
5/4/2017    3   4   124 5   0   1 # short week, Thursday is last work day
5/8/2017    0   8   128 5   1   0
5/9/2017    1   9   129 5   0   0
5/10/2017   2   10  130 5   0   0
5/11/2017   3   11  131 5   0   0
5/12/2017   4   12  132 5   0   1
5/15/2017   0   15  135 5   1   0
5/16/2017   1   16  136 5   0   0
5/17/2017   2   17  137 5   0   0
5/18/2017   3   18  138 5   0   0
5/19/2017   4   19  139 5   0   1
5/23/2017   1   23  143 5   1   0 # short week, Tuesday is first work day
5/24/2017   2   24  144 5   0   0
5/25/2017   3   25  145 5   0   0
5/26/2017   4   26  146 5   0   1
5/30/2017   1   30  150 5   1   0

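Edit: I forgot that some holidays fall in the middle of the week. In that case it would be nice if the days on either side were treated as separate "weeks", with the day before and the day after flagged accordingly. If it isn't that smart, though, handling long weekends would be a good start.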

Here is an idea using BusinessDay:

prev_working_day = df['date'] - pd.tseries.offsets.BusinessDay(1)
df['isFirstWeekDay'] = (df['date'].dt.isocalendar().week !=
                        prev_working_day.dt.isocalendar().week)

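The same works for the last business day of the week. Note that plain BusinessDay only skips Saturdays and Sundays and has no holiday calendar, so weeks shortened by a Monday holiday (2017-05-23 and 2017-05-30 in the output) are not flagged. A holiday calendar has to be supplied separately, for example through CustomBusinessDay; see this other post for different calendars.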

Output:

date  day_of_week  day_of_month  day_of_year  month_of_year  isFirstWeekDay
0  2017-05-01            0             1          121              5            True
1  2017-05-02            1             2          122              5           False
2  2017-05-03            2             3          123              5           False
3  2017-05-04            3             4          124              5           False
4  2017-05-08            0             8          128              5            True
5  2017-05-09            1             9          129              5           False
6  2017-05-10            2            10          130              5           False
7  2017-05-11            3            11          131              5           False
8  2017-05-12            4            12          132              5           False
9  2017-05-15            0            15          135              5            True
10 2017-05-16            1            16          136              5           False
11 2017-05-17            2            17          137              5           False
12 2017-05-18            3            18          138              5           False
13 2017-05-19            4            19          139              5           False
14 2017-05-23            1            23          143              5           False
15 2017-05-24            2            24          144              5           False
16 2017-05-25            3            25          145              5           False
17 2017-05-26            4            26          146              5           False
18 2017-05-30            1            30          150              5           False
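
In this output 2017-05-23 and 2017-05-30 are not flagged, because the previous weekday (a Monday) falls in the same calendar week even though the data has no row for it. A minimal sketch of a variant that lets the rows themselves define which days exist, assuming the frame is sorted by date: compare each row's ISO year/week with that of the neighbouring rows via shift. The `key` variable and the small sample frame are illustrative only, not part of the answer above.

```python
import pandas as pd

# Small sample taken from the question's dates (2017-05-22 and 2017-05-29 are absent).
df = pd.DataFrame({"date": pd.to_datetime([
    "2017-05-18", "2017-05-19", "2017-05-23", "2017-05-24",
    "2017-05-25", "2017-05-26", "2017-05-30",
])})

# Combine ISO year and ISO week into one plain integer per row,
# so that week 1 of one year never collides with week 1 of another.
iso = df["date"].dt.isocalendar()
key = (iso["year"] * 100 + iso["week"]).astype("int64")

# A row starts a week if the previous row belongs to a different week,
# and ends a week if the next row does. shift() yields NaN at the edges,
# which compares as "different", so the first and last rows are flagged.
df["isWeekStart"] = (key != key.shift(1)).astype(int)
df["isWeekEnd"] = (key != key.shift(-1)).astype(int)

print(df)
```

With this approach the first row of the frame is always marked as a week start and the last row as a week end, simply because there is no neighbouring row to compare against; whether that is desirable depends on the data.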

Here is an approach using weekly grouping.

df['date'] = pd.to_datetime(df['date'])
business_days = df.assign(date_copy = df['date']).groupby(pd.Grouper(key='date_copy', freq='W'))['date'].apply(list).to_frame()
business_days['isWeekStart'] = business_days['date'].apply(lambda x: [1 if i == min(x) else 0 for i in x])
business_days['isWeekEnd'] = business_days['date'].apply(lambda x: [1 if i == max(x) else 0 for i in x])
business_days = business_days.apply(pd.Series.explode)
pd.merge(df, business_days, left_on='date', right_on='date')

Output:

date  day_of_week day_of_month    day_of_year month_of_year   isWeekStart isWeekEnd
0   2017-05-01            0            1            121             5             1         0
1   2017-05-02            1            2            122             5             0         0
2   2017-05-03            2            3            123             5             0         0
3   2017-05-04            3            4            124             5             0         1
4   2017-05-08            0            8            128             5             1         0
5   2017-05-09            1            9            129             5             0         0
6   2017-05-10            2           10            130             5             0         0
7   2017-05-11            3           11            131             5             0         0
8   2017-05-12            4           12            132             5             0         1
9   2017-05-15            0           15            135             5             1         0
10  2017-05-16            1           16            136             5             0         0
11  2017-05-17            2           17            137             5             0         0
12  2017-05-18            3           18            138             5             0         0
13  2017-05-19            4           19            139             5             0         1
14  2017-05-23            1           23            143             5             1         0
15  2017-05-24            2           24            144             5             0         0
16  2017-05-25            3           25            145             5             0         0
17  2017-05-26            4           26            146             5             0         1
18  2017-05-30            1           30            150             5             1         1

Note that 2017-05-30 is flagged as both WeekStart and WeekEnd because it is the only date in that week.
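
For the mid-week holiday case raised in the question's edit, calendar-week grouping will not split the week at the holiday. A rough sketch of one way to do that, assuming the rows are sorted by date and that any gap of more than one calendar day should count as a boundary: look at the distance to the previous and next row. The sample below drops 2017-05-10 as a hypothetical mid-week holiday; it is not in the question's data.

```python
import pandas as pd

# Hypothetical sample: 2017-05-10 (a Wednesday) is removed to simulate a mid-week holiday.
df = pd.DataFrame({"date": pd.to_datetime([
    "2017-05-01", "2017-05-02", "2017-05-03", "2017-05-04",
    "2017-05-08", "2017-05-09", "2017-05-11", "2017-05-12",
])})

one_day = pd.Timedelta(days=1)

# Distance to the neighbouring rows; NaT at the first / last row.
prev_gap = df["date"] - df["date"].shift(1)
next_gap = df["date"].shift(-1) - df["date"]

# A row opens a run of consecutive days if nothing precedes it or the
# previous row is more than one day back; it closes a run if nothing
# follows it or the next row is more than one day ahead.
df["isWeekStart"] = (prev_gap.isna() | (prev_gap > one_day)).astype(int)
df["isWeekEnd"] = (next_gap.isna() | (next_gap > one_day)).astype(int)

print(df)
```

Here 2017-05-09 is marked as an end and 2017-05-11 as a start, so the days around the missing Wednesday form their own short "weeks". Weekends are handled by the same rule, since Friday to Monday is also a gap of more than one day.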
