R - If a column value matches an item in one vector, take the value from a second vector



I have the following table:

library( tidyverse )
data = read.table(text="gene1
gene2
gene3", sep="\t", col.names = c("Protein"))

and the following two vectors:

genes = c("gene1", "gene3")
genes_names = c("name1", "name3")

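Each item in genes_names corresponds to the item in genes at the same index.

Now I want to create a new column in data called ToLabel that holds the matching item from genes_names whenever the value in data$Protein matches an item in genes, and "no" otherwise.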

data %>% mutate( ToLabel = ifelse( Protein %in% genes, genes_names, "no" ) )

This does not work as expected. My expected result:

Protein ToLabel
gene1   name1
gene2   no
gene3   name3

Using recode:

data %>%
mutate(Protein = str_squish(Protein),
ToLabel = recode(Protein, !!!set_names(genes_names, genes), .default = 'no'))
Protein ToLabel
1   gene1   name1
2   gene2      no
3   gene3   name3

If you want to replace multiple values by matching, use a join:
library(dplyr)
data %>%
mutate(Protein = trimws(Protein)) %>% 
left_join(tibble(Protein = genes, ToLabel = genes_names)) %>%
mutate(ToLabel = coalesce(ToLabel, "no"))

Output:

Protein ToLabel
1   gene1   name1
2   gene2      no
3   gene3   name3

You can use your own code with a few modifications:

library( tidyverse )
data |> rowwise() |> mutate(Protein = trimws(c_across()) ,
ToLabel = ifelse( c_across() %in% genes, genes_names[which(c_across() == genes)],
"no" ) ) |> ungroup()
Output:
# A tibble: 3 × 2
Protein ToLabel
<chr>   <chr>  
1 gene1   name1  
2 gene2   no     
3 gene3   name3  
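
For context on why the original ifelse() call goes wrong: ifelse() recycles its yes argument to the length of the test and then picks elements by row position, not by which gene matched. The following is a minimal base R sketch of that behaviour, assuming the Protein values carry no stray whitespace; the names protein, naive, idx and fixed are hypothetical and used only for illustration.

```r
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")
protein <- c("gene1", "gene2", "gene3")

# genes_names is recycled to length 3: c("name1", "name3", "name1").
# ifelse() then takes the elements at the positions where the test is TRUE
# (rows 1 and 3), so row 3 receives the recycled "name1", not "name3".
naive <- ifelse(protein %in% genes, genes_names, "no")
print(naive)
# "name1" "no" "name1"

# Indexing genes_names by the matched position selects the right name.
idx <- match(protein, genes)
fixed <- ifelse(is.na(idx), "no", genes_names[idx])
print(fixed)
# "name1" "no" "name3"
```

With only two genes the problem is easy to miss, because the first row happens to line up correctly.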

A base R option using merge + replace:

transform(
merge(
transform(data, Protein = trimws(Protein)),
data.frame(
genes = c("gene1", "gene3"),
genes_names = c("name1", "name3")
),
by.x = "Protein",
by.y = "genes",
all.x = TRUE
),
genes_names = replace(genes_names, is.na(genes_names), "no")
)

Protein genes_names
1   gene1       name1
2   gene2          no
3   gene3       name3

You can use match():

ToLabel <- genes_names[match(trimws(data$Protein), genes)]
ToLabel[is.na(ToLabel)] <- "no"
data$ToLabel <- ToLabel
data
#>            Protein ToLabel
#> 1            gene1   name1
#> 2            gene2      no
#> 3            gene3   name3
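
A closely related variant of the match() idea is to build a named lookup vector and index it by the trimmed Protein values. This is only a sketch under the assumption that every entry in genes is unique; lookup, key, protein and to_label are hypothetical names, and the leading spaces in protein are there to show why trimws() is needed.

```r
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")
protein <- c("gene1", " gene2", " gene3")  # some values carry leading spaces

# Named vector: names are the genes, values are the labels
lookup <- setNames(genes_names, genes)

# Strip whitespace, then look each key up by name; unknown keys give NA
key <- trimws(protein)
to_label <- unname(lookup[key])

# Replace the misses with the default label
to_label[is.na(to_label)] <- "no"
print(to_label)
# "name1" "no" "name3"
```

Assigning the result with data$ToLabel <- to_label would then add the column, as in the match() answer.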
