I have the following data:
cmpny_name Hr_Min_Sec Price Hr_min
A 09:15:41 7610 09:15
A 09:15:42 7632 09:15
A 09:15:43 7654 09:15
A 09:16:21 7655 09:16
A 09:16:59 7854 09:16
A 09:17:32 7453 09:17
A 09:17:42 7467 09:17
A 09:17:58 7557 09:17
A 09:18:03 7567 09:18
A 09:18:58 7659 09:18
A 09:18:59 7810 09:18
Here, I want to find max(Hr_Min_Sec) within each Hr_min, and the result must be displayed as:
cmpny_name Hr_Min_Sec Price Hr_min
A 09:15:43 7654 09:15
A 09:16:59 7854 09:16
A 09:17:58 7557 09:17
A 09:18:59 7810 09:18
# Keep the last row of each Hr_min group (relies on the rows already being sorted by time)
df1[!duplicated(df1$Hr_min, fromLast = TRUE), ]
cmpny_name Hr_Min_Sec Price Hr_min
3 A 9:15:43 7654 9:15
5 A 9:16:59 7854 9:16
8 A 9:17:58 7557 9:17
11 A 9:18:59 7810 9:18
Modified from https://stackoverflow.com/a/23461294/3242130
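The duplicated() approach returns the last row of each group, which equals the maximum only when the rows are already in time order. If that ordering cannot be guaranteed, a base R sketch along the following lines may help. It assumes Hr_Min_Sec is stored as zero-padded "HH:MM:SS" text, so that string comparison agrees with chronological order; the names latest and res are illustrative and not from the original post.

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  cmpny_name = "A",
  Hr_Min_Sec = c("09:15:41", "09:15:42", "09:15:43", "09:16:21",
                 "09:16:59", "09:17:32", "09:17:42", "09:17:58",
                 "09:18:03", "09:18:58", "09:18:59"),
  Price = c(7610, 7632, 7654, 7655, 7854, 7453, 7467, 7557,
            7567, 7659, 7810),
  stringsAsFactors = FALSE
)
df1$Hr_min <- substr(df1$Hr_Min_Sec, 1, 5)

# For every row, ave() returns the largest Hr_Min_Sec within its Hr_min group.
# Assumption: zero-padded time strings, so max() on text is chronological.
latest <- ave(df1$Hr_Min_Sec, df1$Hr_min, FUN = max)

# Keep only the rows whose time equals their group's maximum.
res <- df1[df1$Hr_Min_Sec == latest, ]
print(res)
```

Unlike the duplicated() version, this does not depend on row order, but it returns every tied row if two rows in a group share the same maximum time.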
Here is a data.table solution:
library("data.table")
dt <- fread(
'cmpny_name Hr_Min_Sec Price Hr_min
A 09:15:41 7610 09:15
A 09:15:42 7632 09:15
A 09:15:43 7654 09:15
A 09:16:21 7655 09:16
A 09:16:59 7854 09:16
A 09:17:32 7453 09:17
A 09:17:42 7467 09:17
A 09:17:58 7557 09:17
A 09:18:03 7567 09:18
A 09:18:58 7659 09:18
A 09:18:59 7810 09:18')
dt[, .SD[max(Hr_Min_Sec)==Hr_Min_Sec,], by=Hr_min] # or
dt[, .SD[.N], by=Hr_min] # the last row in the group
# Hr_min cmpny_name Hr_Min_Sec Price
# 1: 09:15 A 09:15:43 7654
# 2: 09:16 A 09:16:59 7854
# 3: 09:17 A 09:17:58 7557
# 4: 09:18 A 09:18:59 7810