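I have a dataset with multiple dates and conditions. I want to extract the row where Place == "A" together with every row whose date falls within the 7 days after it. For example: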
Date Place Value1 Value2
2018-10-27 C 20 8
2018-10-29 A 10 5
2018-10-31 B 15 6
2018-11-4 C 17 9
2018-11-8 D 18 5
And I want:
Date Place Value1 Value2
2018-10-29 A 10 5
2018-10-31 B 15 6
2018-11-4 C 17 9
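As you can see, it must extract the first row (Place == "A") and all rows in the 7 days after it. The Place values after that first day don't matter, e.g. "B" or "C". The window just has to start with "A". It skips 2018-11-8 because that date is more than 7 days after 2018-10-29.
I tried something like this question: R: Extract data based on date, "if date lesser than", but I can't figure out how to extract the 7 days.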
We can use match to find the first row where Place is "A", take the corresponding Date value, and select all rows within 7 days of it.
library(dplyr)
df %>%
  mutate(Date = as.Date(Date)) %>%
  filter({
    tmp <- Date[match('A', Place)]
    between(Date, tmp, tmp + 7)
  })
# Date Place Value Value.1
#1 2018-10-29 A 10 5
#2 2018-10-31 B 15 6
#3 2018-11-04 C 17 9
dplyr lets us do this without creating a temporary variable in the global environment. The same solution can be written in base R as:
df$Date <- as.Date(df$Date)
date_val <- df$Date[match('A', df$Place)]
subset(df, Date >= date_val & Date <= date_val + 7)
df <- structure(list(Date = structure(c(17831, 17833, 17835, 17839,
17843), class = "Date"), Place = c("C", "A", "B", "C", "D"),
Value = c(20L, 10L, 15L, 17L, 18L), Value.1 = c(8L, 5L, 6L,
9L, 5L)), row.names = c(NA, -5L), class = "data.frame")
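To see what match contributes in the answer above, here is a small step-by-step sketch. It rebuilds a reduced copy of the question's data (only Date and Place), and the names idx, start and window are illustrative, not part of the original answer.

```r
# Reduced copy of the question's data: only the columns needed here
df <- data.frame(
  Date  = as.Date(c("2018-10-27", "2018-10-29", "2018-10-31",
                    "2018-11-04", "2018-11-08")),
  Place = c("C", "A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# match() returns the position of the FIRST element equal to "A"
idx <- match("A", df$Place)
idx
# [1] 2

# The date on that row is the start of the 7-day window
start <- df$Date[idx]

# Keep rows dated from `start` up to and including `start + 7`
window <- df[df$Date >= start & df$Date <= start + 7, ]
window$Place
# [1] "A" "B" "C"

# If the value never occurs, match() returns NA rather than failing
match("Z", df$Place)
# [1] NA
```

Because match only reports the first hit, later "A" rows do not move the start of the window.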
One option in base R is:
# Find the difference in days
tmp1 <- df$Date - df[df$Place == "A", "Date"]
# Time differences in days
# [1] -2 0 2 6 10
# And then just subset your df
df[df$Place == "A" | (tmp1 <= 7 & tmp1 > 0), ]
# Date Place Value Value.1
# 2 2018-10-29 A 10 5
# 3 2018-10-31 B 15 6
# 4 2018-11-04 C 17 9
df <- read.table( text = "Date Place Value Value
2018-10-27 C 20 8
2018-10-29 A 10 5
2018-10-31 B 15 6
2018-11-4 C 17 9
2018-11-8 D 18 5 ", header = TRUE)
df[, 1] <- as.Date(df[, 1])
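Subtracting two Date vectors, as in the answer above, yields a difftime object. The sketch below is one possible variant that converts the gap to plain numbers with explicit units before comparing. It assumes exactly one row has Place "A" (with several, the subtraction would recycle), and the vector names dates, place, gap and keep are made up for illustration.

```r
dates <- as.Date(c("2018-10-27", "2018-10-29", "2018-10-31",
                   "2018-11-04", "2018-11-08"))
place <- c("C", "A", "B", "C", "D")

# Assumption: exactly one "A", so `start` has length 1
start <- dates[place == "A"]

# difftime() with explicit units, converted to a plain numeric vector
gap <- as.numeric(difftime(dates, start, units = "days"))
gap
# [1] -2  0  2  6 10

# Day 0 is the "A" row itself, so a single range test covers it
keep <- gap >= 0 & gap <= 7
place[keep]
# [1] "A" "B" "C"
```

Using gap >= 0 instead of a separate Place == "A" test keeps only rows inside the window, which differs from the answer's condition only if further "A" rows exist outside it.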
This works as well. It is almost the same as Ronak's answer, but it doesn't need to create the tmp variable.
#dput
dat <- structure(list(Date = c("2018-10-27", "2018-10-29", "2018-10-31",
"2018-11-04", "2018-11-08"), Place = c("C", "A", "B", "C", "D"
), Value1 = c(20L, 10L, 15L, 17L, 18L), Value2 = c(8, 5, 6, 9,
5)), class = "data.frame", row.names = c(NA, -5L))
#code
library(dplyr)
dat %>% mutate(Date = as.Date(Date)) %>%
filter(between(Date, Date[Place == "A"], Date[Place == "A"] + 7))
Date Place Value1 Value2
1 2018-10-29 A 10 5
2 2018-10-31 B 15 6
3 2018-11-04 C 17 9
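The answers above rely on the data containing a single "A" row. As a rough sketch of how the same idea could be wrapped up for data with several "A" rows or none, the hypothetical base R helper below (extract_window is not from any package, and the extra 2018-11-20 row is invented for the demonstration) anchors the window on the first "A" in row order.

```r
# Hypothetical helper: rows dated from the first `place` row
# through `days` days later; zero rows if `place` never occurs
extract_window <- function(data, place = "A", days = 7) {
  dates <- as.Date(data$Date)
  start <- dates[match(place, data$Place)]  # first match in row order, NA if absent
  if (is.na(start)) {
    return(data[0, , drop = FALSE])
  }
  data[dates >= start & dates <= start + days, , drop = FALSE]
}

# Question's data plus an invented second "A" row
dat <- data.frame(
  Date  = c("2018-10-27", "2018-10-29", "2018-10-31",
            "2018-11-04", "2018-11-08", "2018-11-20"),
  Place = c("C", "A", "B", "C", "D", "A"),
  stringsAsFactors = FALSE
)

res <- extract_window(dat)
res$Date
# [1] "2018-10-29" "2018-10-31" "2018-11-04"

# No matching place: an empty data frame with the same columns
nrow(extract_window(dat, place = "Z"))
# [1] 0
```

Anchoring on the first match is a design choice. If the earliest "A" date is wanted instead and the rows are not sorted, the data would need to be ordered by Date first.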