How to remove data based on a timestamp



I have a data frame that looks like this:

head(data1)
# A tibble: 6 × 10
Date       Time     Axis1 Axis2 Axis3    VM Standing Stepping Cycling
<date>     <chr>    <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>   <dbl>
1 2022-03-17 11:29:00     0     0     0     0        0        0       0
2 2022-03-17 11:29:00     0     0     0     0        0        0       0
3 2022-03-17 11:29:00     0     0     0     0        0        0       0
4 2022-03-17 11:29:00     0     0     0     0        0        0       0
5 2022-03-17 11:29:00     0     0     0     0        0        0       0
6 2022-03-17 11:29:00     0     0     0     0        0        0       0
# … with 1 more variable: New_Sitting <dbl>

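It has one data point per second, every day, for a week. Is it possible to remove the data points that fall outside working hours (for example, keep only the data between 7 am and 5 pm on weekdays)?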

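You can use weekdays() together with data.table::between(). Note that weekdays() returns day names in the current locale, so the names below assume an English locale.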

# Keep weekdays only, and times from 07:00:00 to 17:00:00 (compared as the integer HHMMSS)
res <- dat[!weekdays(dat$time) %in% c("Saturday", "Sunday") &
             data.table::between(as.integer(gsub('\\D', '', substring(dat$time, 12))), 7e4, 17e4), ]

names(table(substr(res$time, 12, 13)))
# [1] "07" "08" "09" "10" "11" "12" "13" "14" "15" "16" "17"
names(table(weekdays(res$time)))
# [1] "Friday"    "Monday"    "Thursday"  "Tuesday"  
# [5] "Wednesday"

Data:

dat <- data.frame(time=seq.POSIXt(as.POSIXct('2022-01-01'), as.POSIXct('2022-01-14'), 'hour'))
dat$X <- rnorm(nrow(dat))
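
For comparison, here is a minimal base-R sketch of the same filter that does not slice the printed timestamp string. It uses format() with "%u" (ISO weekday number, 1 = Monday) and "%H%M%S" instead of weekdays() and substring(), so it does not depend on the locale. The example data and the names wd and hms are only for illustration, and the time zone is assumed to be UTC.

```r
# Hypothetical example data: one row per hour over two weeks (UTC assumed)
dat <- data.frame(time = seq(as.POSIXct("2022-01-01", tz = "UTC"),
                             as.POSIXct("2022-01-14", tz = "UTC"),
                             by = "hour"))

# ISO weekday number: 1 = Monday, ..., 7 = Sunday (independent of locale)
wd <- as.integer(format(dat$time, "%u"))

# Time of day as the integer HHMMSS, e.g. 07:00:00 -> 70000
hms <- as.integer(format(dat$time, "%H%M%S"))

# Keep Monday to Friday, from 07:00:00 to 17:00:00 inclusive
res <- dat[wd %in% 1:5 & hms >= 70000 & hms <= 170000, , drop = FALSE]
```

Both bounds are inclusive here, which matches the behaviour of data.table::between() with its default settings.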

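Use the following. Note that we first create a datetime object and then filter so that the weekday is in 1:5 (Monday to Friday) and the time is between 07 (7 am) and 17 (5 pm).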

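library(tidyverse)
library(lubridate)

# df is the data frame from the question (called data1 there)
df %>%
  # combine the Date and Time columns into a single datetime
  mutate(date_time = ymd_hms(paste(Date, Time))) %>%
  # "%u" is the ISO weekday number: 1 = Monday, ..., 7 = Sunday
  filter(format(date_time, "%u") %in% 1:5,
         date_time >= ymd_h(paste(Date, "07")),
         date_time <= ymd_h(paste(Date, "17"))) %>%
  select(-date_time)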
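
The filter above keeps the 17:00:00 data point itself, because the upper bound is inclusive. If the intended window is 07:00:00 up to but not including 17:00:00, a variation along these lines might work. This is only a sketch: the data frame below is a hypothetical stand-in for the question's data (hourly instead of per second, with a single Axis1 column), and wday() and hour() from lubridate replace the format() and ymd_h() comparisons.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the question's data: one row per hour,
# Monday 2022-03-14 through Sunday 2022-03-20
stamps <- seq(ymd_hms("2022-03-14 00:00:00"),
              ymd_hms("2022-03-20 23:00:00"),
              by = "hour")
df <- tibble(Date = as_date(stamps),
             Time = format(stamps, "%H:%M:%S"),
             Axis1 = 0)

kept <- df %>%
  mutate(date_time = ymd_hms(paste(Date, Time))) %>%
  # week_start = 1 numbers the days 1 = Monday, ..., 7 = Sunday
  filter(wday(date_time, week_start = 1) <= 5,
         hour(date_time) >= 7,
         hour(date_time) < 17) %>%   # excludes 17:00:00 and later
  select(-date_time)
```

With per-second data, this keeps 07:00:00 through 16:59:59 on each weekday.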
