R - Group tibble rows by an unordered combination of columns



Given the following tibble:

library(tibble)
dat <- tibble(sample = 1:6,
              string = c("ABC", "ABC", "CBA", "FED", "DEF", "DEF"),
              x = c("a", "a", "b", "e", "d", "d"),
              y = c("b", "b", "a", "d", "e", "e"))
dat
# A tibble: 6 × 4
sample string x     y    
<int> <chr>  <chr> <chr>
1      1 ABC    a     b    
2      2 ABC    a     b    
3      3 CBA    b     a    
4      4 FED    e     d    
5      5 DEF    d     e    
6      6 DEF    d     e  

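I want to group the rows by the unordered combination of columns x and y. Then, wherever x and y are reversed relative to the first row of the group, I want to swap x and y and reverse string. Desired output: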

# A tibble: 6 × 5
sample string x     y     group
<int> <chr>  <chr> <chr> <dbl>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2
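library(dplyr)
# Sort the characters within each string so that "ABC" and "CBA" share one key
strSort <- function(x) sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")
dat %>%
  # rleid() numbers consecutive runs of identical keys
  group_by(group = data.table::rleid(strSort(string))) %>%
  # Overwrite string, x and y with the values from the first row of the group
  mutate(across(string:y, first))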
# A tibble: 6 x 5
# Groups:   group [2]
sample string x     y     group
<int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2
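
The grouping key can be checked in isolation. The sketch below uses base R only and reuses the strSort helper from the answer. One thing to keep in mind is that rleid numbers consecutive runs, so the answer relies on rows with the same key being adjacent, as they are in the example data. The match(key, unique(key)) idiom shown here is an alternative (not part of the original answer) that assigns the same id to equal keys even when they are not adjacent; the interleaved vector is made up for illustration.

```r
# Same helper as in the answer: sort the characters inside each string
strSort <- function(x) sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")

string <- c("ABC", "ABC", "CBA", "FED", "DEF", "DEF")
key <- strSort(string)
print(key)
# "ABC" "ABC" "ABC" "DEF" "DEF" "DEF"

# Group id by first appearance of each key; does not require adjacent rows
group <- match(key, unique(key))
print(group)
# 1 1 1 2 2 2

# Hypothetical interleaved input: equal keys are not adjacent here
interleaved <- c("ABC", "FED", "CBA")
key2 <- strSort(interleaved)
group2 <- match(key2, unique(key2))
print(group2)
# 1 2 1
```

With data.table::rleid the interleaved input would get three separate run ids, so sorting the rows by key first, or using the match idiom, is the safer choice when equal keys may be scattered.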

Previous answer

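Here is an approach that combines tidyverse functions with apply. First, sort the x and y columns within each row, then group_by x and y, create a group id with cur_group_id, and apply stri_reverse to string where necessary.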

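library(tidyverse)
library(stringi)
# Sort x and y within each row so that reversed pairs become identical
dat[, c("x", "y")] <- t(apply(dat[, c("x", "y")], 1, sort))
dat %>%
  group_by(x, y) %>%
  mutate(group = cur_group_id(),
         # Reverse string when its first letter does not match the sorted x
         string = ifelse(str_sub(string, 1, 1) == toupper(x), string, stri_reverse(string)))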
# A tibble: 6 x 5
# Groups:   x, y [2]
sample string x     y     group
<int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 DEF    d     e         2
5      5 DEF    d     e         2
6      6 DEF    d     e         2
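
Note that this output puts every pair in sorted order, so group 2 shows DEF, d, e, whereas the desired output in the question keeps the orientation of the first row of each group (FED, e, d). A minimal base R sketch of that variant follows. It is an illustration rather than the answerer's method: the names key, first_row, flipped and rev_string are made up here, and it assumes every row in a group is either in the same orientation as the first row or exactly reversed.

```r
dat <- data.frame(
  sample = 1:6,
  string = c("ABC", "ABC", "CBA", "FED", "DEF", "DEF"),
  x = c("a", "a", "b", "e", "d", "d"),
  y = c("b", "b", "a", "d", "e", "e"),
  stringsAsFactors = FALSE
)

# Order-independent key for each (x, y) pair
key <- paste(pmin(dat$x, dat$y), pmax(dat$x, dat$y))
dat$group <- match(key, unique(key))

# For every row, the index of the first row of its group
first_row <- match(dat$group, dat$group)

# A row is flipped when its x differs from the x of its group's first row
flipped <- dat$x != dat$x[first_row]

# Character-wise reversal (base R stand-in for stringi::stri_reverse)
rev_string <- function(s) {
  vapply(strsplit(s, NULL), function(ch) paste(rev(ch), collapse = ""), character(1))
}

# Reverse string and swap x and y on the flipped rows only
dat$string[flipped] <- rev_string(dat$string[flipped])
tmp <- dat$x[flipped]
dat$x[flipped] <- dat$y[flipped]
dat$y[flipped] <- tmp

print(dat)
```

On the example data this leaves rows 1, 2 and 4 untouched and rewrites rows 3, 5 and 6, which gives ABC, a, b for group 1 and FED, e, d for group 2.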
