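In the df below there are ids and dates. For each id, I want to compute the max date minus the min date within each month, and then average those ranges per month.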
df
structure(list(user_id = c("6897bea62278", "6897bea62278", "13d51bc5b108",
"8012f20570b5", "5bc61ba43a08", "13d51bc5b108", "13d51bc5b108",
"6897bea62278", "8012f20570b5", "13d51bc5b108", "13d51bc5b108",
"6897bea62278", "13d51bc5b108", "13d51bc5b108", "8012f20570b5"
), date = structure(c(18687, 18687, 18687, 18687, 18687, 18687,
18687, 18687, 18687, 18687, 18687, 18687, 18687, 18687, 18687
), class = "Date")), row.names = c(2L, 4L, 8L, 16L, 18L, 20L,
23L, 27L, 39L, 40L, 41L, 43L, 45L, 51L, 55L), class = "data.frame")
Expected output
| Month | Avg days |
| -------- | -------------- |
| March | 23 |
| April | 15 |
It's hard to test with only a single day of data, but is this what you want?
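library(dplyr)

df %>%
  mutate(month = months(date)) %>%  # month name (locale-dependent)
  group_by(user_id, month) %>%
  # one row per user and month: days between the latest and earliest date
  summarise(xx1 = as.numeric(max(date) - min(date), units = "days"), .groups = "drop") %>%
  group_by(month) %>%
  summarise(Avg_Days = mean(xx1))  # average the per-user ranges within each month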
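
Since the sample data covers a single day, every range comes out as 0, which makes the logic hard to check. Below is a small sketch with made-up dates; the `toy` data frame and its values are hypothetical and not taken from the question. It also keys on year and month with `format(date, "%Y-%m")`, on the assumption that the same month in different years should not be pooled together.

```r
library(dplyr)

# Hypothetical data: two users with several dates in March and April 2021
toy <- data.frame(
  user_id = c("a", "a", "a", "b", "b", "a", "a", "b", "b"),
  date = as.Date(c("2021-03-01", "2021-03-11", "2021-03-21",
                   "2021-03-05", "2021-03-15",
                   "2021-04-02", "2021-04-12",
                   "2021-04-10", "2021-04-20"))
)

result <- toy %>%
  mutate(month = format(date, "%Y-%m")) %>%   # year-month key, e.g. "2021-03"
  group_by(user_id, month) %>%
  # one row per user and month: days between latest and earliest date
  summarise(range_days = as.numeric(max(date) - min(date), units = "days"),
            .groups = "drop") %>%
  group_by(month) %>%
  summarise(Avg_Days = mean(range_days))      # average across users

print(result)
```

With these made-up values, user "a" spans 20 days in March and 10 in April, and user "b" spans 10 days in both months, so the averages should be 15 for March and 10 for April.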