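I hope someone can help me with the following event-detection problem. The input is a regular time series containing "Time", "water level" and "runoff". The goal is to detect events above a threshold and, for each event, extract the start time, the end time, the duration (in minutes), and the maximum and sum of the values during the event. By definition, every event must be cut when the date changes. By contrast, an NA should cut an event only when the missing values last longer than an hour.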
library(data.table)
library(dplyr)
library(xts)
## data
dWL <- structure(list(Time = structure(c(1463951500, 1463951800, 1463952100, 1463952400, 1463952700, 1463953000, 1463953300, 1463953600, 1463953900, 1463954200, 1463954500, 1463954800, 1463955100, 1463955400, 1463955700, 1463956000),class = c("POSIXct", "POSIXt"), tzone = ""), WL = c(0.2, 2.5, 2.4, 2.1, 0.9, 2.8, 2.9, 1.9, 2.4, NA, 2.3, 2.6, 2.8, 2.1, 2.0, 1.9), Q = c(0.0, 255.5, 232.4, 150.1, 0.0, 345.8, 382.9, 0.0, 214.4, NA, 201.3, 312.6, 362.8, 80.1, 20.0, 0.0)), row.names = c(NA, -16L), class = "data.frame")
## threshold value
vth <-2
na.omit(dWL) %>% ## ?? how to drop NAs only when the NA duration is longer than an hour ??
  mutate(tmp_WL = WL >= vth, id = rleid(tmp_WL)) %>%
  filter(tmp_WL) %>%
  group_by(id) %>% ## ?? how to additionally separate events at a change of date ??
  summarise(start_time = first(Time), end_time = last(Time),
            event_duration = difftime(last(Time), first(Time)),
            max_Q = max(Q), sum_Q = sum(Q))
I know that the heatwaveR package has a very useful exceedance function, but I have not managed to make it work for sub-daily time series.
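Since you tagged this with data.table, let's use it. We can track the events with run-length encoding via rleid(). Once every event has its own ID, a simple group-by is enough to compute the summaries. Finally, we drop the RLE column by setting it to NULL and use [] to print the result.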
library(data.table)
setDT(dWL)[!is.na(WL), event := WL > vth][
  , RLE := rleidv(event)][
  event == TRUE, .(start = min(Time),
                   end = max(Time),
                   max.WL = max(WL),
                   duration = difftime(max(Time), min(Time)),
                   runoff = sum(Q)),
  by = RLE][, RLE := NULL][]
#                 start                 end max.WL duration runoff
#1: 2016-05-22 17:16:40 2016-05-22 17:26:40    2.5  10 mins  638.0
#2: 2016-05-22 17:36:40 2016-05-22 17:41:40    2.9   5 mins  728.7
#3: 2016-05-22 17:51:40 2016-05-22 17:51:40    2.4   0 mins  214.4
#4: 2016-05-22 18:01:40 2016-05-22 18:16:40    2.8  15 mins  956.8
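To see why the grouping works, here is a small illustration of rleid() on a made-up logical vector (the vector is hypothetical, not taken from the data). Each run of identical values gets its own ID, and an NA counts as a run of its own, which is why the single missing reading separates events 3 and 4 in the output above.

```r
library(data.table)

# Hypothetical flags: TRUE = above threshold, NA = missing reading
event <- c(FALSE, TRUE, TRUE, FALSE, TRUE, NA, TRUE)

# Consecutive equal values share an ID; the NA forms a run of its own
ids <- rleid(event)
print(ids)
# [1] 1 2 2 3 4 5 6
```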
Data
dWL <- structure(list(Time = structure(c(1463951500, 1463951800, 1463952100, 1463952400, 1463952700, 1463953000, 1463953300, 1463953600, 1463953900, 1463954200, 1463954500, 1463954800, 1463955100, 1463955400, 1463955700, 1463956000),class = c("POSIXct", "POSIXt"), tzone = ""), WL = c(0.2, 2.5, 2.4, 2.1, 0.9, 2.8, 2.9, 1.9, 2.4, NA, 2.3, 2.6, 2.8, 2.1, 2.0, 1.9), Q = c(0.0, 255.5, 232.4, 150.1, 0.0, 345.8, 382.9, 0.0, 214.4, NA, 201.3, 312.6, 362.8, 80.1, 20.0, 0.0)), row.names = c(NA, -16L), class = "data.frame")
vth <- 2
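The code above cuts an event at every NA and does not split at a date change, so it covers only part of what the question asks. Below is a minimal sketch of one way to add both rules; it is not a tested general solution. The helper detect_events(), its arguments max_gap and tz, and the rule "cut when two neighbouring valid observations are more than max_gap seconds apart" are assumptions of mine, not part of data.table. The sample data is rebuilt here with an explicit UTC time zone so the block runs on its own.

```r
library(data.table)

# Same values as the sample data, rebuilt with a fixed time zone (assumption: UTC)
dWL <- data.frame(
  Time = as.POSIXct(1463951500 + 300 * (0:15), origin = "1970-01-01", tz = "UTC"),
  WL = c(0.2, 2.5, 2.4, 2.1, 0.9, 2.8, 2.9, 1.9, 2.4, NA, 2.3, 2.6, 2.8, 2.1, 2.0, 1.9),
  Q  = c(0.0, 255.5, 232.4, 150.1, 0.0, 345.8, 382.9, 0.0, 214.4, NA,
         201.3, 312.6, 362.8, 80.1, 20.0, 0.0)
)

# Hypothetical helper: expects columns Time (POSIXct), WL and Q
detect_events <- function(x, vth, max_gap = 3600, tz = "UTC") {
  dt <- as.data.table(x)[!is.na(WL)]         # drop missing rows; their time gap remains
  dt[, event := WL > vth]                    # strict threshold as in the answer; use >= to match the question
  dt[, day := as.Date(Time, tz = tz)]        # calendar day in the chosen time zone
  dt[, gap := c(0, diff(as.numeric(Time)))]  # seconds since the previous valid observation
  # A new run starts when the flag flips, the day changes, or a gap exceeds max_gap
  dt[, run := rleid(event, day, cumsum(gap > max_gap))]
  dt[event == TRUE,
     .(start = min(Time),
       end = max(Time),
       duration = difftime(max(Time), min(Time), units = "mins"),
       max_WL = max(WL),
       max_Q = max(Q),
       sum_Q = sum(Q)),
     by = run][, run := NULL][]
}

events <- detect_events(dWL, vth = 2)
print(events)
```

With the sample data this should give three events rather than four, because the single missing reading leaves only a 10-minute gap and is therefore bridged. Note that sum_Q becomes NA if Q is missing while WL is not; that case is not handled here.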