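I may be having a brain fart, but all I want to do is filter for the products that are common to all class groups. I should note that, given my real data, I cannot hard-code the filter. That is, something like filter(product %in% c("apple", "lemon")) will not work, because I have thousands of products.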
Sample data, where apple and lemon are common to all classes:
df <- data.frame(
  class = c("A", "A", "A", "A", "A",
            "B", "B", "B", "B", "B",
            "C", "C", "C", "C", "C"),
  product = c("apple", "lemon", "banana", "orange", "papaya",
              "apple", "lemon", "lime", "blueberry", "raspberry",
              "apple", "lemon", "mango", "strawberry", "pear")
)
df
   class    product
1      A      apple
2      A      lemon
3      A     banana
4      A     orange
5      A     papaya
6      B      apple
7      B      lemon
8      B       lime
9      B  blueberry
10     B  raspberry
11     C      apple
12     C      lemon
13     C      mango
14     C strawberry
15     C       pear
Desired output:
  class product
1     A   apple
2     A   lemon
3     B   apple
4     B   lemon
5     C   apple
6     C   lemon
If there are no duplicates within a group, see whether the following code works for you:
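library(dplyr)
# cur_group_id() (dplyr >= 1.0.0) replaces the deprecated bare group_indices()
df %>%
  group_by(product) %>% mutate(cnt = n()) %>%          # rows per product
  group_by(class) %>% mutate(g = cur_group_id()) %>%   # number the classes 1..k
  ungroup() %>%
  filter(cnt >= max(g)) %>%                            # product appears in all k classes
  select(-cnt)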
# A tibble: 6 x 3
  class product     g
  <chr> <chr>   <int>
1 A     apple       1
2 A     lemon       1
3 B     apple       2
4 B     lemon       2
5 C     apple       3
6 C     lemon       3
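
If a product can appear more than once within a class, the row count above can overstate how many classes contain it. A possible variant, offered only as a sketch, counts distinct classes per product instead of rows. The names n_classes, result, common and base_result are made up for this illustration, and the data is the sample df from the question with the stray space in " orange" removed.

```r
library(dplyr)

# Sample data from the question (assumption: " orange" was meant to be "orange")
df <- data.frame(
  class = rep(c("A", "B", "C"), each = 5),
  product = c("apple", "lemon", "banana", "orange", "papaya",
              "apple", "lemon", "lime", "blueberry", "raspberry",
              "apple", "lemon", "mango", "strawberry", "pear"),
  stringsAsFactors = FALSE
)

# Total number of distinct classes in the data (hypothetical helper name)
n_classes <- n_distinct(df$class)

# Keep a product only if it occurs in every class;
# counting distinct classes makes duplicate rows harmless
result <- df %>%
  group_by(product) %>%
  filter(n_distinct(class) == n_classes) %>%
  ungroup()

# Base R cross-check: intersect the per-class product sets
common <- Reduce(intersect, split(df$product, df$class))
base_result <- df[df$product %in% common, ]
```

Both result and base_result keep the original row order, so no hard-coded product list is needed even with thousands of products. Whether the dplyr or the base R form is faster on large data is not something this sketch measures.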