使用切割.日期和开始星期六在R



我目前正以4周的时间间隔聚合数据,但我需要从周六开始这几周。我在下面的可复制示例中使用下面的cut.Date行来创建一个列,指示每个4周的窗口,但我很难找到一个资源(在dplyrlubridate中(,以便从周六开始以正好4周的间隔进行聚合。

这是我所拥有的:

library(tidyverse)
# Example
example <- structure(list(
yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016, 
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016
), 
date = structure(
c(17054, 17055, 17056, 17057, 17058, 17059, 
17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068, 
17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077, 
17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086, 
17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"), 
day_of_week = structure(
c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
3L, 4L), 
.Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", 
"Sat"), class = c("ordered", "factor"))), 
row.names = c(NA, -40L), 
class = c("tbl_df", "tbl", "data.frame"))

# Using this in a mutate to generate the week
example %>% 
mutate(wk = cut.Date(date, breaks = "4 week", labels = FALSE)) %>%
print()

我希望它看起来如下:

# Ideal output
output <- structure(list(
yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016, 
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016), 
date = structure(
c(17054, 17055, 17056, 17057, 17058, 17059, 
17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068, 
17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077, 
17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086, 
17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"), 
day_of_week = structure(
c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
3L, 4L),  
.Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"), 
class = c("ordered", "factor")), 
wk = c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L)), 
row.names = c(NA, -40L),  
class = c("tbl_df", "tbl", "data.frame"))

同样,第一个4周的时间段是从2016-09-10到2016-10-08,当我使用周六时,这是我需要的。如果您有任何问题,请联系我,并提前感谢!

我们将使用整数除法(%/%(来确定";今天的日期";以及数据中的第一个CCD_ 5。

在前面,你的output都在周六开始和结束,这似乎并不一致。我假设你想在周六开始每个4周的周期,所以它将在周五结束。

基数R

example$wk <- with(example, as.numeric(date - date[which.max(day_of_week == "Sat")]) %/% 28 + 1L)
print(example, n = 40)
# # A tibble: 40 x 4
#       yr date       day_of_week    wk
#    <dbl> <date>     <ord>       <dbl>
#  1  2016 2016-09-10 Sat             1
#  2  2016 2016-09-11 Sun             1
#  3  2016 2016-09-12 Mon             1
#  4  2016 2016-09-13 Tue             1
#  5  2016 2016-09-14 Wed             1
#  6  2016 2016-09-15 Thu             1
#  7  2016 2016-09-16 Fri             1
#  8  2016 2016-09-17 Sat             1
#  9  2016 2016-09-18 Sun             1
# 10  2016 2016-09-19 Mon             1
# 11  2016 2016-09-20 Tue             1
# 12  2016 2016-09-21 Wed             1
# 13  2016 2016-09-22 Thu             1
# 14  2016 2016-09-23 Fri             1
# 15  2016 2016-09-24 Sat             1
# 16  2016 2016-09-25 Sun             1
# 17  2016 2016-09-26 Mon             1
# 18  2016 2016-09-27 Tue             1
# 19  2016 2016-09-28 Wed             1
# 20  2016 2016-09-29 Thu             1
# 21  2016 2016-09-30 Fri             1
# 22  2016 2016-10-01 Sat             1
# 23  2016 2016-10-02 Sun             1
# 24  2016 2016-10-03 Mon             1
# 25  2016 2016-10-04 Tue             1
# 26  2016 2016-10-05 Wed             1
# 27  2016 2016-10-06 Thu             1
# 28  2016 2016-10-07 Fri             1
# 29  2016 2016-10-08 Sat             2
# 30  2016 2016-10-09 Sun             2
# 31  2016 2016-10-10 Mon             2
# 32  2016 2016-10-11 Tue             2
# 33  2016 2016-10-12 Wed             2
# 34  2016 2016-10-13 Thu             2
# 35  2016 2016-10-14 Fri             2
# 36  2016 2016-10-15 Sat             2
# 37  2016 2016-10-16 Sun             2
# 38  2016 2016-10-17 Mon             2
# 39  2016 2016-10-18 Tue             2
# 40  2016 2016-10-19 Wed             2

dplyr

library(dplyr)
example %>%
mutate(
wk = as.numeric(date - date[which.max(day_of_week == "Sat")]) %/% 28 + 1L
) %>%
print(n=99)
# # A tibble: 40 x 4
#       yr date       day_of_week    wk
#    <dbl> <date>     <ord>       <dbl>
#  1  2016 2016-09-10 Sat             1
#  2  2016 2016-09-11 Sun             1
#  3  2016 2016-09-12 Mon             1
#  4  2016 2016-09-13 Tue             1
#  5  2016 2016-09-14 Wed             1
#  6  2016 2016-09-15 Thu             1
#  7  2016 2016-09-16 Fri             1
#  8  2016 2016-09-17 Sat             1
#  9  2016 2016-09-18 Sun             1
# 10  2016 2016-09-19 Mon             1
# 11  2016 2016-09-20 Tue             1
# 12  2016 2016-09-21 Wed             1
# 13  2016 2016-09-22 Thu             1
# 14  2016 2016-09-23 Fri             1
# 15  2016 2016-09-24 Sat             1
# 16  2016 2016-09-25 Sun             1
# 17  2016 2016-09-26 Mon             1
# 18  2016 2016-09-27 Tue             1
# 19  2016 2016-09-28 Wed             1
# 20  2016 2016-09-29 Thu             1
# 21  2016 2016-09-30 Fri             1
# 22  2016 2016-10-01 Sat             1
# 23  2016 2016-10-02 Sun             1
# 24  2016 2016-10-03 Mon             1
# 25  2016 2016-10-04 Tue             1
# 26  2016 2016-10-05 Wed             1
# 27  2016 2016-10-06 Thu             1
# 28  2016 2016-10-07 Fri             1
# 29  2016 2016-10-08 Sat             2
# 30  2016 2016-10-09 Sun             2
# 31  2016 2016-10-10 Mon             2
# 32  2016 2016-10-11 Tue             2
# 33  2016 2016-10-12 Wed             2
# 34  2016 2016-10-13 Thu             2
# 35  2016 2016-10-14 Fri             2
# 36  2016 2016-10-15 Sat             2
# 37  2016 2016-10-16 Sun             2
# 38  2016 2016-10-17 Mon             2
# 39  2016 2016-10-18 Tue             2
# 40  2016 2016-10-19 Wed             2

最新更新