I have a data frame containing 9 observations of two different dates, for example:
df <- data.frame(
  date1 = c("2018-11-01", "2018-10-28", "2019-01-22", "2019-03-22", "2018-10-03",
            "2018-09-10", "2020-07-01", "2018-03-02", "2018-11-09"),
  date2 = c("2018-12-31", "2018-12-31", "2018-12-31", "2019-12-31", "2018-12-31",
            "2018-12-31", "2020-12-31", "2018-12-31", "2018-12-31")
)
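For each pair of dates, I want to extract the sequence of months between them and write it to a new data frame. For a single pair of observations I simply use: seq(month(date1), month(date2))
This works fine for one pair, but not when date1 and date2 are vectors of length > 1. I tried commands such as rowwise, and I tried looping over the original data frame, but neither worked.
I tried: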
df %>%
rowwise() %>%
as.data.frame(df[i,])
Or something like:
for(i in 1:nrow(df)){
as.data.frame(df[i,])
i = i + 1
}
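What I need is a separate data frame for each pair of dates that holds its monthly sequence, such as df1, df2, df3, and so on. Any help or ideas would be greatly appreciated. Thank you.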
Since you are already using lubridate and dplyr, here is one approach that uses those together with the (experimental) group_split:
library(dplyr)
library(lubridate)
library(tidyr)  # provides unnest_longer()

df |>
  mutate(across(everything(), ymd)) |>  # parse both columns as dates
  group_by(date1, date2) |>
  mutate(new = list(seq(month(date1), month(date2)))) |>  # month numbers only
  unnest_longer(new) |>
  group_split(.keep = FALSE)
Output:
[[1]]
# A tibble: 10 × 1
new
<int>
1 3
2 4
3 5
4 6
5 7
6 8
7 9
8 10
9 11
10 12
[[2]]
# A tibble: 4 × 1
new
<int>
1 9
2 10
3 11
4 12
[[3]]
# A tibble: 3 × 1
new
<int>
1 10
2 11
3 12
[[4]]
# A tibble: 3 × 1
new
<int>
1 10
2 11
3 12
[[5]]
# A tibble: 2 × 1
new
<int>
1 11
2 12
[[6]]
# A tibble: 2 × 1
new
<int>
1 11
2 12
[[7]]
# A tibble: 12 × 1
new
<int>
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
12 12
[[8]]
# A tibble: 10 × 1
new
<int>
1 3
2 4
3 5
4 6
5 7
6 8
7 9
8 10
9 11
10 12
[[9]]
# A tibble: 6 × 1
new
<int>
1 7
2 8
3 9
4 10
5 11
6 12
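One thing worth noting about the output above: seq(month(date1), month(date2)) compares month numbers only, so the year is ignored. The pair 2019-01-22 / 2018-12-31, for instance, yields 1 through 12 even though date1 falls after date2. If the year matters, a base R sketch along the following lines might help. The helper month_seq and the names month_start and month_dfs are hypothetical, not part of the answer above, and returning an empty result for reversed pairs is an assumption about the desired behaviour.

```r
# Hypothetical helper (not from the answer above): first-of-month dates
# from the month of `start` through the month of `end`, inclusive.
month_seq <- function(start, end) {
  start <- as.Date(start)
  end <- as.Date(end)
  # Snap both dates to the first day of their month.
  first <- as.Date(format(start, "%Y-%m-01"))
  last <- as.Date(format(end, "%Y-%m-01"))
  # Assumption: a pair whose start month is after its end month yields
  # an empty Date vector instead of an error.
  if (first > last) return(first[0])
  seq(first, last, by = "month")
}

# A small subset of the question's data, for illustration only.
df <- data.frame(
  date1 = c("2018-11-01", "2019-01-22", "2019-03-22"),
  date2 = c("2018-12-31", "2018-12-31", "2019-12-31")
)

# One data frame per row, kept together in a list.
month_dfs <- lapply(seq_len(nrow(df)), function(i) {
  data.frame(month_start = month_seq(df$date1[i], df$date2[i]))
})
```

Each element of month_dfs then holds real dates rather than bare month numbers, so sequences that span a year boundary stay in calendar order.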
Update: although I would recommend keeping the data frames in a list, you can use list2env to create them as separate objects:
df |>
mutate(across(everything(), ymd)) |>
group_by(date1, date2) |>
mutate(new = list(seq(month(date1), month(date2)))) |>
unnest_longer(new) |>
group_split(.keep = FALSE) -> listdf
names(listdf) <- paste0("monthdf", seq(length(listdf)))
list2env(listdf, .GlobalEnv)
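To illustrate why keeping the data frames in a list is usually enough, here is a minimal base R sketch. The two small data frames are stand-ins for the result of group_split, and their values are illustrative only. The name n_months is hypothetical.

```r
# Stand-in for the list produced by group_split() (values are illustrative).
listdf <- list(data.frame(new = 11:12), data.frame(new = 10:12))
names(listdf) <- paste0("monthdf", seq_along(listdf))

# Look up a single data frame by name, without creating global objects.
second <- listdf[["monthdf2"]]

# Apply the same operation to every data frame in one call.
n_months <- vapply(listdf, nrow, integer(1))
```

With a named list, each data frame is still reachable by name, and operations that apply to all of them need no loop over separately named global variables.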
You can use purrr::pmap instead of rowwise to iterate over each row:
library(dplyr)
library(purrr)

df %>%
  mutate(across(everything(), as.Date)) %>%
  pmap(~ as.Date(..1:..2, origin = "1970-01-01"))  # one daily sequence per row
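This returns a list, because the sequences have different lengths. If every row produced the same number of dates, you could use pmap_dfr or pmap_dfc to build a data frame instead.
Result: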
[[1]]
[1] "2018-11-01" "2018-11-02" "2018-11-03" "2018-11-04" "2018-11-05" ...
[[2]]
[1] "2018-10-28" "2018-10-29" "2018-10-30" "2018-10-31" "2018-11-01" ...
[[3]]
[1] "2019-01-22" "2019-01-21" "2019-01-20" "2019-01-19" "2019-01-18" ...
[[4]]
[1] "2019-03-22" "2019-03-23" "2019-03-24" "2019-03-25" "2019-03-26" ...
[[5]]
[1] "2018-10-03" "2018-10-04" "2018-10-05" "2018-10-06" "2018-10-07" ...
[[6]]
[1] "2018-09-10" "2018-09-11" "2018-09-12" "2018-09-13" "2018-09-14" ...
[[7]]
[1] "2020-07-01" "2020-07-02" "2020-07-03" "2020-07-04" "2020-07-05" ...
[[8]]
[1] "2018-03-02" "2018-03-03" "2018-03-04" "2018-03-05" "2018-03-06" ...
[[9]]
[1] "2018-11-09" "2018-11-10" "2018-11-11" "2018-11-12" "2018-11-13" ...
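If a single data frame is preferred even though the sequences differ in length, one option is a long format with an identifier column. The following is a base R sketch rather than the pmap approach shown above. The column names pair and date and the objects pieces and long_df are hypothetical, the two rows of data are made up for illustration, and the code assumes date1 is not later than date2 in every row (seq with by = "day" raises an error otherwise).

```r
# Illustrative data only; assumes date1 <= date2 in every row.
df <- data.frame(
  date1 = as.Date(c("2018-11-28", "2018-12-30")),
  date2 = as.Date(c("2018-12-01", "2018-12-31"))
)

# Build one small data frame per row, tagging each with its row number.
pieces <- lapply(seq_len(nrow(df)), function(i) {
  data.frame(pair = i, date = seq(df$date1[i], df$date2[i], by = "day"))
})

# Stack the pieces into a single long data frame.
long_df <- do.call(rbind, pieces)
```

The pair column records which row of the original data frame each date came from, so the per-pair sequences can be recovered later with split(long_df, long_df$pair).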