R - How to detect the position of a character in a string using dplyr



I have the following data frame:

library(tidyverse)
df <- tribble(
~sname, ~seq,
"foo", "TTAW",
"bar", "ACTN",
"qux", "AAAA"
)

df

I want a third column that gives the position(s) of "T" in each row's seq. The final result should look like this:

sname seq  t_pos
foo   TTAW 1,2
bar   ACTN 3
qux   AAAA 0

How can I do this? I tried the following, but it does not work:

df %>%
  dplyr::mutate(t_pos = paste0(list(which(strsplit(seq, "")[[1]] == "T"))))

What is the right way to do it?

You can add rowwise() to your attempt, so that the expression is evaluated one row at a time:

library(dplyr)
df %>%
  rowwise() %>%
  mutate(t_pos = toString(which(strsplit(seq, "")[[1]] == "T")))
#  sname seq   t_pos 
#  <chr> <chr> <chr> 
#1 foo   TTAW  "1, 2"
#2 bar   ACTN  "3"   
#3 qux   AAAA  ""    
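
To see why the original attempt needs rowwise(), it may help to look at what strsplit returns for a whole column. The following is a minimal base R sketch; the names seqs, chars, first_only and per_string are introduced here for illustration only. Inside a plain mutate(), seq refers to the entire column, so `[[1]]` picks out only the letters of the first string.

```r
seqs <- c("TTAW", "ACTN", "AAAA")

# strsplit() is vectorized: it returns a list with one character
# vector per input string.
chars <- strsplit(seqs, "")

# `[[1]]` keeps only the first list element, i.e. the letters of "TTAW",
# so the positions found here describe the first string only.
first_only <- which(chars[[1]] == "T")

# Visiting each list element in turn gives positions per string.
per_string <- lapply(chars, function(x) which(x == "T"))

print(first_only)
print(per_string)
```

With rowwise(), seq holds a single string per evaluation, which is why `[[1]]` then selects the right letters.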

With map_chr:

df %>%
  mutate(t_pos = purrr::map_chr(strsplit(seq, ""), ~toString(which(.x == "T"))))

Or in base R with sapply:

df$t_pos <- sapply(strsplit(df$seq, ''), function(x) toString(which(x == 'T')))
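
The answers above return "1, 2" and an empty string, while the desired output in the question shows "1,2" and "0" when there is no match. If that exact format matters, one possible sketch is shown below. The helper name char_positions and its ch argument are hypothetical, not part of dplyr or of any answer above.

```r
# Hypothetical helper: for each string in `x`, return the positions of
# the single character `ch` joined by commas, or "0" if it is absent.
char_positions <- function(x, ch = "T") {
  vapply(strsplit(x, ""), function(chars) {
    pos <- which(chars == ch)
    if (length(pos) == 0) "0" else paste(pos, collapse = ",")
  }, character(1))
}

seqs <- c("TTAW", "ACTN", "AAAA")
t_pos <- char_positions(seqs)
print(t_pos)
```

Because the helper is vectorized over its input, it should also work directly inside mutate() without rowwise(), for example df %>% mutate(t_pos = char_positions(seq)).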
