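I want to automatically create multiple data frames based on date intervals taken from another data frame. Suppose I have this example: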
df <- data.frame(Date = as.Date(c("2022-01-01", "2022-01-01",
"2022-01-02", "2022-01-02", "2022-01-02",
"2022-01-03",
"2022-01-04", "2022-01-04",
"2022-01-05", "2022-01-05", "2022-01-05")),
Name = c(LETTERS[1:11]),
Value = c(1:11))
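My goal is to create 3 new data frames. df1 should contain the data from 2022-01-01 to 2022-01-04, df2 the data from 2022-01-02 to 2022-01-05, and df3 the data from 2022-01-03 to 2022-01-06. This is the desired output, with every object a data frame: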
df1 <- data.frame(Date = as.Date(c("2022-01-01", "2022-01-01",
"2022-01-02", "2022-01-02", "2022-01-02",
"2022-01-03")),
Name = c(LETTERS[1:6]),
Value = c(1:6))
df2 <- data.frame(Date = as.Date(c("2022-01-02", "2022-01-02", "2022-01-02",
"2022-01-03",
"2022-01-04", "2022-01-04")),
Name = c(LETTERS[3:8]),
Value = c(3:8))
df3 <- data.frame(Date = as.Date(c("2022-01-03",
"2022-01-04", "2022-01-04",
"2022-01-05", "2022-01-05", "2022-01-05")),
Name = c(LETTERS[6:11]),
Value = c(6:11))
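Note that the number of observations differs from date to date. My real data frame is much larger than this example and grows every day, so I need this process to be automatic. Any suggestions?
Here is an alternative: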
dates <- seq(df$Date[1], df$Date[1]+3, by = "day")
dates
# [1] "2022-01-01" "2022-01-02" "2022-01-03" "2022-01-04"
Map(function(a, b) dplyr::filter(df, dplyr::between(Date, a, b)), dates, dates + 3)
# [[1]]
# Date Name Value
# 1 2022-01-01 A 1
# 2 2022-01-01 B 2
# 3 2022-01-02 C 3
# 4 2022-01-02 D 4
# 5 2022-01-02 E 5
# 6 2022-01-03 F 6
# 7 2022-01-04 G 7
# 8 2022-01-04 H 8
# [[2]]
# Date Name Value
# 1 2022-01-02 C 3
# 2 2022-01-02 D 4
# 3 2022-01-02 E 5
# 4 2022-01-03 F 6
# 5 2022-01-04 G 7
# 6 2022-01-04 H 8
# 7 2022-01-05 I 9
# 8 2022-01-05 J 10
# 9 2022-01-05 K 11
# [[3]]
# Date Name Value
# 1 2022-01-03 F 6
# 2 2022-01-04 G 7
# 3 2022-01-04 H 8
# 4 2022-01-05 I 9
# 5 2022-01-05 J 10
# 6 2022-01-05 K 11
# [[4]]
# Date Name Value
# 1 2022-01-04 G 7
# 2 2022-01-04 H 8
# 3 2022-01-05 I 9
# 4 2022-01-05 J 10
# 5 2022-01-05 K 11
Of course, this produces four frames instead of three, but that is easily controlled by what you assign to dates.
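As a rough illustration of controlling the start dates, here is a base R sketch that derives them from the data itself, so the number of windows follows the data as it grows. It assumes each window covers three calendar days including the start day, which is what the desired df1, df2 and df3 in the question actually contain; the names width, starts and windows are made up for this example.

```r
# Example data from the question
df <- data.frame(
  Date  = as.Date(c("2022-01-01", "2022-01-01",
                    "2022-01-02", "2022-01-02", "2022-01-02",
                    "2022-01-03",
                    "2022-01-04", "2022-01-04",
                    "2022-01-05", "2022-01-05", "2022-01-05")),
  Name  = LETTERS[1:11],
  Value = 1:11
)

# Assumption: window length in calendar days, start day included
width <- 3

# One start date per window; the last window ends on the latest date in df
starts <- seq(min(df$Date), max(df$Date) - (width - 1), by = "day")

# Subset df once per window, keeping rows whose Date lies in [a, b]
windows <- Map(
  function(a, b) df[df$Date >= a & df$Date <= b, , drop = FALSE],
  starts, starts + (width - 1)
)

# Label the list elements df1, df2, ... for convenient access
names(windows) <- paste0("df", seq_along(windows))
```

With the example data this gives three frames, reachable as windows$df1, windows$df2 and windows$df3. Their row names are carried over from df rather than restarting at 1; if that matters, rownames(x) <- NULL resets them.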
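This produces a list of frames rather than three separate frames. I think you will find that when you have several frames with the same structure (column names and intent), it is best to keep them in a list, so that when you want to do something to each of them you can easily use lapply. For more discussion of this, see https://stackoverflow.com/a/24376207/3358227.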
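To make the list-of-frames point concrete, here is a small sketch of what working with such a list might look like. The list frames below is a made-up stand-in for the result of the Map() call, and the summaries are only examples.

```r
# Assumption: a small named list standing in for the Map() result
frames <- list(
  df1 = data.frame(Name = c("A", "B"), Value = c(1, 2)),
  df2 = data.frame(Name = c("C", "D", "E"), Value = c(3, 4, 5))
)

# Apply the same operation to every frame in one call
n_rows <- lapply(frames, nrow)

# vapply does the same but returns a named numeric vector
totals <- vapply(frames, function(x) sum(x$Value), numeric(1))

# If separate objects really are required, list2env() creates one
# variable per list element (df1 and df2 here) in the given environment
list2env(frames, envir = globalenv())
```

Keeping the list as the primary object and calling list2env() only at the very end, if at all, avoids having to know in advance how many frames there will be.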