Conditional subsetting between dates in R



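I am matching precipitation isotope values to the dates of precipitation events. Samples are collected every 7 to 10 days, and I want to identify the samples that represent a single day of precipitation. My goal is to create a new data frame containing the date, the precipitation amount, and the isotope value.
Here is some example data. The data frame shows the structure of what I have gathered from several repositories.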

# example dates over three week period
start <- as.Date('2017/01/01')
len <- 21
dates <- seq(start, by = "day", length.out = len)
# example precip events in total mm accumulation 
prcp <- c(0, 1.0, 2.0, 0, 1.0, 0, 0,  # week 1
0, 0, 0, 0, 0, 1.75, 2.0,   # week 2
0, 0, 0, 0, 0, 0, 0)        # week 3
# sample measurements (numeric)
samp <- c(NA, NA, NA, NA, -15.0, NA, NA,
NA, NA, NA, NA, NA, -12.0, NA,
NA, NA, NA, NA, NA, -20, NA) 
# df showing dates, the recorded precip, and the sample measurements
# notice that sample values are associated with collection date
raw <- data.frame(dates, prcp, samp)

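In this example there are three sample measurements.

  1. The first (-15) corresponds to three days of precipitation in the first week and should be discarded.

  2. The second sample value (-12) corresponds to a single recorded precipitation day and should be kept. The sample was collected on 2017-01-13, and the rain fell into the collector on 2017-01-13. Samples are usually collected in the late afternoon, so I assume they capture that day's precipitation.

  3. The third sample (-20) corresponds to the precipitation that fell on 2017-01-14. It was collected on 2017-01-20, and there were no other rain events between 2017-01-13 (sample #2) and 2017-01-20 (sample #3). It should also be kept.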
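
The three cases above can be checked by counting, for each sample, the precipitation days that fall between the previous collection date (exclusive) and its own collection date (inclusive). This is a minimal base R sketch, not a full solution. The names `coll`, `prev` and `wet_days` are introduced here for illustration, and the first window is assumed to start at the beginning of the record.

```r
# Sample data as defined in the question
dates <- seq(as.Date("2017-01-01"), by = "day", length.out = 21)
prcp <- c(0, 1.0, 2.0, 0, 1.0, 0, 0,
          0, 0, 0, 0, 0, 1.75, 2.0,
          0, 0, 0, 0, 0, 0, 0)
samp <- c(NA, NA, NA, NA, -15.0, NA, NA,
          NA, NA, NA, NA, NA, -12.0, NA,
          NA, NA, NA, NA, NA, -20, NA)

coll <- dates[!is.na(samp)]               # collection dates
# previous collection date for each sample; for the first sample this is
# the day before the record starts (an assumption made for this sketch)
prev <- c(dates[1] - 1, head(coll, -1))

# number of days with precipitation in each window (prev, coll]
wet_days <- vapply(seq_along(coll), function(i) {
  in_window <- dates > prev[i] & dates <= coll[i]
  sum(prcp[in_window] > 0)
}, integer(1))

data.frame(collected = coll, samp = samp[!is.na(samp)], wet_days)
```

Samples whose `wet_days` equals 1 are the ones that represent a single precipitation day.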

The new data frame I am trying to produce looks like the example below.

# dates when a single precip day occurs between sample collection dates
dates_out <- c('2017-01-13', '2017-01-14')
# example precip events in total mm accumulation 
prcp_out <- c(1.75, 2.0)
# sample measurements (numeric)
samp_out <- c( -12.0, -20) 
# df showing dates, the recorded precip, and the sample measurements
final <- data.frame(dates_out, prcp_out, samp_out)

Thanks for any help with my approach, or for alternative methods and suggestions!

If I understand you correctly (see my comment), here is one option:

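library(dplyr)
library(tidyr)      # fill() comes from tidyr, not dplyr
library(lubridate)

raw %>%
  mutate(week = week(dates)) %>%           # week number of each date
  group_by(week) %>%
  filter(sum(prcp > 0) == 1) %>%           # keep weeks with exactly one precipitation day
  fill(samp, .direction = "downup") %>%    # spread the week's sample value over its rows
  slice_max(prcp) %>%                      # keep the row of the precipitation day
  ungroup()
# Output as posted with the answer. It does not correspond to `raw` as defined
# in the question, where 2017-01-13 and 2017-01-14 fall in the same week.
## A tibble: 1 x 4
#  dates       prcp  samp  week
#  <date>     <dbl> <dbl> <dbl>
#1 2017-01-09  1.75   -12     2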

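Explanation: determine the week number of each date, group by week, and keep only the weeks with a single day of precipitation. Replace all NAs in samp with the entry recorded when the precipitation was collected. Keep the (single) row per week that has non-zero precipitation, then ungroup.

If you don't need the sample ID, you can skip the fill step. If you don't want the week column, drop it at the end with select(-week).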
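
To see which weeks would pass the `sum(prcp > 0) == 1` filter for a given data set, it can help to tabulate the precipitation days per week first. The sketch below uses base R only. The week number is computed with an expression assumed to mirror `lubridate::week()` (documented as the number of complete seven-day periods since January 1, plus one), and `wet_per_week` is a name introduced here for illustration.

```r
# Sample data as defined in the question
dates <- seq(as.Date("2017-01-01"), by = "day", length.out = 21)
prcp <- c(0, 1.0, 2.0, 0, 1.0, 0, 0,
          0, 0, 0, 0, 0, 1.75, 2.0,
          0, 0, 0, 0, 0, 0, 0)

# complete 7-day periods since January 1, plus one
# (POSIXlt yday is zero-based, so January 1 has yday 0)
week <- as.POSIXlt(dates)$yday %/% 7 + 1

# number of precipitation days in each week
wet_per_week <- tapply(prcp > 0, week, sum)
wet_per_week
```

With the sample data exactly as posted in the question, the counts are 3, 2 and 0 for weeks 1 to 3, because 2017-01-13 and 2017-01-14 fall in the same week. Calendar weeks also do not necessarily line up with the 7 to 10 day collection intervals, so grouping by collection interval may be the safer choice.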

library(dplyr)

# key labels each collection interval: it increases by one on the row
# after every sample, so a sample closes the interval it belongs to
key <- rep(0, nrow(raw))
for (i in 1:(nrow(raw) - 1)) {
  if (!is.na(raw$samp[i])) {
    key[i + 1] <- key[i] + 1
  } else {
    key[i + 1] <- key[i]
  }
}

# count the precipitation days that fall in each interval
raw2 <- raw %>%
  mutate(key = key) %>%
  group_by(key) %>%
  mutate(count = sum(prcp > 0)) %>%
  ungroup()

# keep intervals with exactly one precipitation day (the original count < 2
# also kept intervals without any precipitation), then collapse each interval
# to one row: date and amount of that day plus the interval's sample value
Clean <- raw2 %>%
  filter(count == 1) %>%
  group_by(key) %>%
  summarize(dates = dates[which.max(prcp)],
            prcp = max(prcp),
            samp = samp[!is.na(samp)][1])
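
The interval key built by the loop can also be computed without a loop. The following is a minimal base R sketch of that idea under the same grouping rule (a sample closes the interval it belongs to). It is not the answerer's code, and the helper names `count`, `keep` and `samp_by_key` are introduced here for illustration.

```r
# Sample data as defined in the question
dates <- seq(as.Date("2017-01-01"), by = "day", length.out = 21)
prcp <- c(0, 1.0, 2.0, 0, 1.0, 0, 0,
          0, 0, 0, 0, 0, 1.75, 2.0,
          0, 0, 0, 0, 0, 0, 0)
samp <- c(NA, NA, NA, NA, -15.0, NA, NA,
          NA, NA, NA, NA, NA, -12.0, NA,
          NA, NA, NA, NA, NA, -20, NA)
raw <- data.frame(dates, prcp, samp)

# interval key: increases by one on the row after each collection
key <- cumsum(c(FALSE, head(!is.na(raw$samp), -1)))

# precipitation days per interval, repeated on every row of the interval
count <- ave(as.numeric(raw$prcp > 0), key, FUN = sum)

# the single precipitation day of every interval that has exactly one
keep <- count == 1 & raw$prcp > 0

# sample value of each interval (NA if the interval has no sample)
samp_by_key <- tapply(raw$samp, key, function(s) s[!is.na(s)][1])

final <- data.frame(dates = raw$dates[keep],
                    prcp  = raw$prcp[keep],
                    samp  = as.vector(samp_by_key[as.character(key[keep])]))
final <- final[!is.na(final$samp), ]   # drop intervals without a sample
final
```

For the example data, `final` holds the two rows the question asks for: 2017-01-13 with 1.75 mm and -12, and 2017-01-14 with 2.0 mm and -20.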
