Grouping data and assigning group IDs based on time intervals in R



I am trying to figure out how to assign group IDs based on time intervals in R.

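More context: I have merged GPS data (latitude/longitude points, recorded at irregular intervals) with acceleration data (ACC "bursts" of 82 data points, recorded at the start of each minute; all 82 data points in a burst share the same timestamp).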

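Since the GPS points and the ACC bursts were collected at the same time, I now want to group each GPS point with its associated ACC burst: all GPS and ACC data that fall within the same minute should be assigned one unique group ID.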

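Edit: here is some sample data. I want to group the GPS point in row 8 with the ACC data from the same minute (in this case, the rows above the GPS point).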

structure(list(X.1 = 1:11, timestamp = c("2019-01-26T16:25:00Z", "2019-01-26T16:25:00Z", "2019-01-26T16:25:00Z", "2019-01-26T16:25:00Z", "2019-01-26T16:25:00Z", "2019-01-26T16:25:00Z", "2019-01-26T16:25:00Z", "2019-01-26T16:25:47Z", "2019-01-26T16:26:00Z", "2019-01-26T16:26:00Z", "2019-01-26T16:26:00Z"), sensor.type = c("acceleration", "acceleration", "acceleration", "acceleration", "acceleration", "acceleration", "acceleration", "gps", "acceleration", "acceleration", "acceleration"), location.long = c(NA, NA, NA, NA, NA, NA, NA, 44.4777343, NA, NA, NA), location.lat = c(NA, NA, NA, NA, NA, NA, NA, -12.2839707, NA, NA, NA), annotation = c("Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing", "Moving/Climbing"), X = c(2219L, 1694L, 1976L, 1744L, 2014L, 2202L, 2269L, NA, 1874L, 2024L, 1990L), Y = c(1416L, 1581L, 1524L, 1620L, 1409L, 1545L, 1771L, NA, 1687L, 1773L, 1813L), Z = c(2189L, 2209L, 2121L, 2278L, 2003L, 2034L, 2060L, NA, 2431L, 2504L, 2428L)), class = "data.frame", row.names = c(NA, -11L))
   X.1            timestamp  sensor.type location.long location.lat      annotation    X    Y    Z
1    1 2019-01-26T16:25:00Z acceleration            NA           NA Moving/Climbing 2219 1416 2189
2    2 2019-01-26T16:25:00Z acceleration            NA           NA Moving/Climbing 1694 1581 2209
3    3 2019-01-26T16:25:00Z acceleration            NA           NA Moving/Climbing 1976 1524 2121
4    4 2019-01-26T16:25:00Z acceleration            NA           NA Moving/Climbing 1744 1620 2278
5    5 2019-01-26T16:25:00Z acceleration            NA           NA Moving/Climbing 2014 1409 2003
6    6 2019-01-26T16:25:00Z acceleration            NA           NA Moving/Climbing 2202 1545 2034
7    7 2019-01-26T16:25:00Z acceleration            NA           NA Moving/Climbing 2269 1771 2060
8    8 2019-01-26T16:25:47Z          gps      44.47773    -12.28397 Moving/Climbing   NA   NA   NA
9    9 2019-01-26T16:26:00Z acceleration            NA           NA Moving/Climbing 1874 1687 2431
10  10 2019-01-26T16:26:00Z acceleration            NA           NA Moving/Climbing 2024 1773 2504
11  11 2019-01-26T16:26:00Z acceleration            NA           NA Moving/Climbing 1990 1813 2428


Does this make sense? I know lubridate can aggregate data by time interval, but how do I add a new group ID (as a variable) based on the timestamp?

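Here is a solution using dplyr and lubridate. We convert your timestamp column to a proper date-time class, add a new column that rounds each timestamp down to the start of its minute, and then create an ID from the rounded timestamp: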

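library(dplyr)
library(lubridate)

# dat is the sample data frame from the question
dat %>%
  mutate(
    timestamp = ymd_hms(timestamp),                     # parse the ISO 8601 strings to date-times
    minute    = floor_date(timestamp, unit = "minute"), # round down to the start of the minute
    group_id  = as.integer(factor(minute))              # one integer ID per distinct minute
  )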

#    X.1           timestamp  sensor.type location.long location.lat      annotation    X    Y    Z
# 1    1 2019-01-26 16:25:00 acceleration            NA           NA Moving/Climbing 2219 1416 2189
# 2    2 2019-01-26 16:25:00 acceleration            NA           NA Moving/Climbing 1694 1581 2209
# 3    3 2019-01-26 16:25:00 acceleration            NA           NA Moving/Climbing 1976 1524 2121
# 4    4 2019-01-26 16:25:00 acceleration            NA           NA Moving/Climbing 1744 1620 2278
# 5    5 2019-01-26 16:25:00 acceleration            NA           NA Moving/Climbing 2014 1409 2003
# 6    6 2019-01-26 16:25:00 acceleration            NA           NA Moving/Climbing 2202 1545 2034
# 7    7 2019-01-26 16:25:00 acceleration            NA           NA Moving/Climbing 2269 1771 2060
# 8    8 2019-01-26 16:25:47          gps      44.47773    -12.28397 Moving/Climbing   NA   NA   NA
# 9    9 2019-01-26 16:26:00 acceleration            NA           NA Moving/Climbing 1874 1687 2431
# 10  10 2019-01-26 16:26:00 acceleration            NA           NA Moving/Climbing 2024 1773 2504
# 11  11 2019-01-26 16:26:00 acceleration            NA           NA Moving/Climbing 1990 1813 2428
#                 minute group_id
# 1  2019-01-26 16:25:00        1
# 2  2019-01-26 16:25:00        1
# 3  2019-01-26 16:25:00        1
# 4  2019-01-26 16:25:00        1
# 5  2019-01-26 16:25:00        1
# 6  2019-01-26 16:25:00        1
# 7  2019-01-26 16:25:00        1
# 8  2019-01-26 16:25:00        1
# 9  2019-01-26 16:26:00        2
# 10 2019-01-26 16:26:00        2
# 11 2019-01-26 16:26:00        2
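
If you would rather avoid the extra packages, the same idea can be sketched in base R. This is only a minimal illustration, not part of the solution above: the small dat below is a hypothetical stand-in reduced from the question's sample, and minute_key is a helper column name chosen here.

```r
# Hypothetical stand-in for the question's data: only the columns needed here.
dat <- data.frame(
  timestamp = c("2019-01-26T16:25:00Z", "2019-01-26T16:25:00Z",
                "2019-01-26T16:25:47Z", "2019-01-26T16:26:00Z"),
  sensor.type = c("acceleration", "acceleration", "gps", "acceleration"),
  stringsAsFactors = FALSE
)

# Parse the ISO 8601 strings as UTC date-times (the trailing "Z" is matched literally).
ts <- as.POSIXct(dat$timestamp, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Dropping the seconds gives a per-minute key, i.e. the timestamp floored to the minute.
dat$minute_key <- format(ts, "%Y-%m-%d %H:%M")

# Number the distinct minutes in order of first appearance.
dat$group_id <- match(dat$minute_key, unique(dat$minute_key))

print(dat)
```

Note that match() numbers the minutes in order of first appearance, whereas as.integer(factor(minute)) numbers them in sorted order. For data that is already sorted by time, as in the sample, the two give the same IDs.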
