>Suppose I have a data frame in the following format:
Group Setting Runtime Memory SomeOtherColumns
A X 102 105 ...
A X 107 80 ...
A Y 100 104 ...
A Y 101 82 ...
B X 10 50 ...
B X 11 51 ...
B X 8 52 ...
B Y 13 60 ...
B Y 14 61 ...
B Y 15 62 ...
C X 5 6 ...
C Y 6 7 ...
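I would like to extract one row per Group+Setting combination, i.e. one row each for A+X, A+Y, B+X, B+Y, C+X, and C+Y. The extracted row should be the one with the lowest Runtime value within the given group.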
The expected result is as follows:
Group Setting Runtime Memory SomeOtherColumns ...
A X 102 105 ...
A Y 100 104 ...
B X 8 52 ...
B Y 13 60 ...
C X 5 6 ...
C Y 6 7 ...
Using dplyr, this would be:
library(dplyr)
df %>% group_by(Group, Setting) %>% slice(which.min(Runtime))
# # A tibble: 6 x 5
# # Groups: Group, Setting [6]
# Group Setting Runtime Memory SomeOtherColumns
# <fct> <fct> <int> <int> <fct>
# 1 A X 102 105 ...
# 2 A Y 100 104 ...
# 3 B X 8 52 ...
# 4 B Y 13 60 ...
# 5 C X 5 6 ...
# 6 C Y 6 7 ...
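As a possible alternative, recent dplyr versions (1.0.0 and later) provide slice_min(), which expresses the same intent directly. The sketch below rebuilds the sample data from the question by hand (an assumption: SomeOtherColumns is left out, and the name res_dplyr is only illustrative) and keeps a single row per group even when the minimum Runtime is tied.

```r
library(dplyr)

# Sample data copied from the question; SomeOtherColumns is omitted for brevity
df <- data.frame(
  Group   = c("A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "C", "C"),
  Setting = c("X", "X", "Y", "Y", "X", "X", "X", "Y", "Y", "Y", "X", "Y"),
  Runtime = c(102, 107, 100, 101, 10, 11, 8, 13, 14, 15, 5, 6),
  Memory  = c(105, 80, 104, 82, 50, 51, 52, 60, 61, 62, 6, 7)
)

# Keep the lowest-Runtime row per Group + Setting;
# with_ties = FALSE returns exactly one row per group even if the minimum is tied
res_dplyr <- df %>%
  group_by(Group, Setting) %>%
  slice_min(Runtime, n = 1, with_ties = FALSE) %>%
  ungroup()

print(res_dplyr)
```

Without with_ties = FALSE, slice_min() would return every row that shares the minimum, which differs from the which.min() behaviour used above.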
Similarly, in data.table terms:
library(data.table)
setDT(df)
df[, .SD[which.min(Runtime)], by = .(Group, Setting)]
Or using order:
unique(df[order(Runtime)], by = c("Group", "Setting"))
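To check that the two data.table variants pick the same rows, here is a minimal self-contained sketch. It assumes the sample data from the question with SomeOtherColumns left out; the names dt, by_sd and by_order are only illustrative. Note that the order-based result comes back sorted by Runtime rather than by group, so both results are sorted before comparing.

```r
library(data.table)

# Sample data copied from the question; SomeOtherColumns is omitted for brevity
dt <- data.table(
  Group   = c("A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "C", "C"),
  Setting = c("X", "X", "Y", "Y", "X", "X", "X", "Y", "Y", "Y", "X", "Y"),
  Runtime = c(102, 107, 100, 101, 10, 11, 8, 13, 14, 15, 5, 6),
  Memory  = c(105, 80, 104, 82, 50, 51, 52, 60, 61, 62, 6, 7)
)

# Variant 1: within each group, subset .SD to the row with the minimum Runtime
by_sd <- dt[, .SD[which.min(Runtime)], by = .(Group, Setting)]

# Variant 2: sort by Runtime, then keep the first row seen for each group
by_order <- unique(dt[order(Runtime)], by = c("Group", "Setting"))

# Bring both results into the same row order before comparing them
setorder(by_sd, Group, Setting)
setorder(by_order, Group, Setting)

print(by_sd)
print(all(by_sd$Runtime == by_order$Runtime))
```

On larger tables the order-based variant is often reported to be faster than subsetting .SD per group, but that is worth measuring on your own data.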