在一个阈值内将连续的时间组合在一起,以获得r中的TimeGroup输出



我有一个时间的列或向量。数据最初以2分钟的间隔均匀分布,但我对数据进行了细分,以查看感兴趣的记录。现在,我想把所有连续间隔2分钟的时间记录放在一起。结果将是一个名为'TimeGroup'的列/向量,用于标识哪些记录被分组在一起。

DateTime<-as.POSIXct( strptime(c("2013-05-27 12:41:52 UTC", "2013-05-27 13:25:52 UTC", "2013-05-27 13:27:52 UTC",
"2013-05-27 13:29:52 UTC" ,"2013-05-27 13:31:52 UTC" ,"2013-05-27 13:33:52 UTC",
"2013-05-27 13:35:52 UTC" ,"2013-05-27 13:37:52 UTC" ,"2013-05-27 13:39:52 UTC",
"2013-05-27 13:41:52 UTC", "2013-05-27 13:43:52 UTC", "2013-05-27 13:45:52 UTC",
"2013-05-27 13:47:52 UTC" ,"2013-05-27 13:49:52 UTC" ,"2013-05-27 13:51:52 UTC",
"2013-05-27 13:53:52 UTC", "2013-05-27 13:55:52 UTC", "2013-05-27 13:57:52 UTC",
"2013-05-27 13:59:52 UTC", "2013-05-27 14:01:52 UTC" ,"2013-05-27 14:03:52 UTC",
"2013-05-27 14:05:52 UTC", "2013-05-27 14:07:52 UTC", "2013-05-27 14:41:52 UTC",
"2013-05-27 14:43:52 UTC" ,"2013-05-27 14:45:52 UTC", "2013-05-27 14:47:52 UTC",
"2013-05-27 15:11:52 UTC" ,"2013-05-27 15:13:52 UTC", "2013-05-27 15:15:52 UTC",
"2013-05-27 15:17:52 UTC" ,"2013-05-27 15:21:52 UTC", "2013-05-27 15:23:52 UTC",
"2013-05-27 15:25:52 UTC" ,"2013-05-27 15:27:52 UTC", "2013-05-27 15:29:52 UTC",
"2013-05-27 15:33:52 UTC" ,"2013-05-27 15:35:52 UTC" ,"2013-05-27 15:37:52 UTC",
"2013-05-27 15:39:52 UTC", "2013-05-27 15:41:52 UTC" ,"2013-05-27 15:43:52 UTC",
"2013-05-27 15:45:52 UTC" ,"2013-05-27 15:47:52 UTC" ,"2013-05-27 16:03:52 UTC",
"2013-05-27 16:05:52 UTC", "2013-05-27 16:07:52 UTC", "2013-05-27 16:09:52 UTC",
"2013-05-27 16:11:52 UTC", "2013-05-27 16:13:52 UTC", "2013-05-27 16:15:52 UTC",
"2013-05-27 16:17:52 UTC" ,"2013-05-27 16:19:52 UTC" ,"2013-05-27 16:21:52 UTC",
"2013-05-27 16:23:52 UTC", "2013-05-27 16:25:52 UTC" ,"2013-05-27 16:27:52 UTC",
"2013-05-27 16:29:52 UTC" ,"2013-05-27 16:31:52 UTC" ,"2013-05-27 16:33:52 UTC",
"2013-05-27 16:39:52 UTC" ,"2013-05-27 16:41:52 UTC" ,"2013-05-27 16:43:52 UTC",
"2013-05-27 16:45:52 UTC" ,"2013-05-27 16:47:52 UTC" ,"2013-05-27 16:49:52 UTC",
"2013-05-27 16:51:52 UTC" ,"2013-05-27 16:53:52 UTC" ,"2013-05-27 16:55:52 UTC",
"2013-05-27 16:57:52 UTC" ,"2013-05-27 16:59:52 UTC" ,"2013-05-27 17:01:52 UTC",
"2013-05-27 17:03:52 UTC" ,"2013-05-27 17:05:52 UTC" ,"2013-05-27 17:07:52 UTC",
"2013-05-27 17:09:52 UTC", "2013-05-27 17:11:52 UTC" ,"2013-05-27 17:13:52 UTC",
"2013-05-27 17:15:52 UTC", "2013-05-27 17:17:52 UTC" ,"2013-05-27 17:19:52 UTC"), format="%Y-%m-%d %H:%M:%S"),1970-01-01, tz="UTC")

diff(DateTime)
Time differences in mins
[1] 44  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2 34  2  2  2 24  2  2  2  4  2  2  2
[35]  2  4  2  2  2  2  2  2  2 16  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  6  2  2  2  2  2  2  2  2
[69]  2  2  2  2  2  2  2  2  2  2  2  2

从这个示例数据中,将有7个timegroup,分别称为1、2、3、4、5、6、7。不属于时间组的记录为NA。

试试下面的

library( data.table )
DT <- data.table( DateTime = DateTime )
DT[, group := c(0,cumsum( diff(DateTime) > 2)) ]

相关内容

  • 没有找到相关文章

最新更新