I have a tibble similar to the following:
data<-tibble(ref=c("ABC", "ABC", "XYZ", "XYZ", "FGH", "FGH", "FGH"),
type=c("A", "B", "A", "A", "A", "A", "B"))
ref type
1 ABC A
2 ABC B
3 XYZ A
4 XYZ A
5 FGH A
6 FGH A
7 FGH B
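I need to group by ref and, within each group, return the row with type B if one exists; otherwise return any one row of type A (but only one row).
Expected output: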
ref type
1 ABC B
2 XYZ A
3 FGH B
For large data sets, it is better to sort before grouping.
tidyverse
library(tidyverse)
df<-tibble(ref=c("ABC", "ABC", "XYZ", "XYZ", "FGH", "FGH", "FGH"),
type=c("A", "B", "A", "A", "A", "A", "B"))
distinct(df) %>%
arrange(ref, desc(type)) %>%
group_by(ref) %>%
slice_head(n = 1) %>%
ungroup()
#> # A tibble: 3 × 2
#> ref type
#> <chr> <chr>
#> 1 ABC B
#> 2 FGH B
#> 3 XYZ A
Created on 2022-04-27 by the reprex package (v2.0.1)
data.table
df<-data.frame(ref=c("ABC", "ABC", "XYZ", "XYZ", "FGH", "FGH", "FGH"),
type=c("A", "B", "A", "A", "A", "A", "B"))
library(data.table)
setDT(df)[order(ref, -type), .SD[1], by = ref]
#> ref type
#> 1: ABC B
#> 2: FGH B
#> 3: XYZ A
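The sort-then-take-first idea used in the two versions above can also be sketched in base R without any packages. This is only an illustration, not part of the original answer: the names `sorted` and `result` are made up for the example, and it assumes that any row other than B is an acceptable fallback.

```r
df <- data.frame(ref  = c("ABC", "ABC", "XYZ", "XYZ", "FGH", "FGH", "FGH"),
                 type = c("A", "B", "A", "A", "A", "A", "B"))

# Sort by ref, putting "B" rows first within each ref
# (type != "B" is FALSE for "B" rows, and FALSE sorts before TRUE).
sorted <- df[order(df$ref, df$type != "B"), ]

# Keep only the first row seen for each ref.
result <- sorted[!duplicated(sorted$ref), ]
result
```

Because the data is sorted once up front, no per-group work is needed afterwards; `duplicated()` simply drops every row after the first for each ref.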
If you only have A and B, you can sort and simply take the first row, i.e.:
library(dplyr)
data %>%
group_by(ref) %>%
filter(type %in% c('A', 'B')) %>% #If other types exist
arrange(desc(type)) %>%
slice(1L)
# A tibble: 3 x 2
# Groups: ref [3]
ref type
<chr> <chr>
1 ABC B
2 FGH B
3 XYZ A
We can use which.max on a logical vector to extract the required row:
data %>%
group_by(ref) %>%
slice(which.max(type == "B")) %>%
ungroup()
# A tibble: 3 x 2
ref type
<chr> <chr>
1 ABC B
2 FGH B
3 XYZ A
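Why this works may not be obvious, so here is a small base R sketch of how which.max behaves on a logical vector. The variable names are only illustrative. TRUE counts as 1 and FALSE as 0, and which.max returns the position of the first maximum.

```r
# Group that contains a "B": the first TRUE is at position 2.
pos_with_b <- which.max(c("A", "B") == "B")
pos_with_b

# Group without a "B": every value is FALSE, so the maximum is FALSE
# and the first position is returned.
pos_without_b <- which.max(c("A", "A") == "B")
pos_without_b
```

So slice() picks the first B row when the group has one and falls back to the first row of the group otherwise, which gives exactly one row per ref.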