I have the following table:
library(tidyverse)
data = read.table(text="gene1
gene2
gene3", sep="\t", col.names = c("Protein"))
and the following two vectors:
genes = c("gene1", "gene3")
genes_names = c("name1", "name3")
Each item in genes_names corresponds to the item in genes at the same index.
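Now I want to create a new column in data called ToLabel. Where the value in data$Protein matches an entry in genes, the column should hold the corresponding item from genes_names; otherwise it should hold "no".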
data %>% mutate( ToLabel = ifelse( Protein %in% genes, genes_names, "no" ) )
This does not work as expected. My expected result:
Protein ToLabel
gene1 name1
gene2 no
gene3 name3
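For context, here is a minimal sketch of why the ifelse() attempt misaligns the labels. It uses a plain character vector, protein, as a stand-in for data$Protein (an assumption, with no stray whitespace in the values). ifelse() recycles its `yes` argument to the length of the test and then picks elements by position, so the third row receives the third recycled element rather than the name that belongs to gene3.

```r
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")

# Stand-in for data$Protein (assumed clean, no surrounding whitespace)
protein <- c("gene1", "gene2", "gene3")

# What ifelse() effectively uses as `yes`: genes_names recycled to length 3
recycled_yes <- rep(genes_names, length.out = length(protein))
print(recycled_yes)

# Elements are chosen by row position, not by which gene matched
attempt <- ifelse(protein %in% genes, genes_names, "no")
print(attempt)
```

The fix in every approach below is the same idea: look up the label by the matched gene, not by the row position.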
Using recode:
data %>%
mutate(Protein = str_squish(Protein),
ToLabel = recode(Protein, !!!set_names(genes_names, genes), .default = 'no'))
Protein ToLabel
1 gene1 name1
2 gene2 no
3 gene3 name3
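The set_names(genes_names, genes) call above builds a named vector that maps each gene to its label. As a rough illustration of that lookup idea, the sketch below does the same thing with base R indexing instead of recode (a swapped-in technique, not what the answer itself runs). The protein vector is a hypothetical stand-in for the trimmed data$Protein column.

```r
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")

# Named vector: names are the genes, values are the labels
lookup <- setNames(genes_names, genes)

# Stand-in for the cleaned data$Protein column (assumption)
protein <- c("gene1", "gene2", "gene3")

# Indexing by name returns NA for genes that are not in the lookup
labels <- unname(lookup[protein])

# Replace the misses with the default, like .default = 'no' in recode
labels[is.na(labels)] <- "no"
print(labels)
```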
If you want to replace multiple values by matching, use a join:
library(dplyr)
data %>%
mutate(Protein = trimws(Protein)) %>%
left_join(tibble(Protein = genes, ToLabel = genes_names)) %>%
mutate(ToLabel = coalesce(ToLabel, "no"))
which produces
Protein ToLabel
1 gene1 name1
2 gene2 no
3 gene3 name3
You can use your own code with some modifications:
library(tidyverse)
data |> rowwise() |> mutate(Protein = trimws(c_across()) ,
ToLabel = ifelse( c_across() %in% genes, genes_names[which(c_across() == genes)],
"no" ) ) |> ungroup()
Output:
# A tibble: 3 × 2
Protein ToLabel
<chr> <chr>
1 gene1 name1
2 gene2 no
3 gene3 name3
A base R option using merge + replace:
transform(
merge(
transform(data, Protein = trimws(Protein)),
data.frame(
genes = c("gene1", "gene3"),
genes_names = c("name1", "name3")
),
by.x = "Protein",
by.y = "genes",
all.x = TRUE
),
genes_names = replace(genes_names, is.na(genes_names), "no")
)
gives
Protein genes_names
1 gene1 name1
2 gene2 no
3 gene3 name3
You can use match():
ToLabel <- genes_names[match(trimws(data$Protein), genes)]
ToLabel[is.na(ToLabel)] <- "no"
data$ToLabel <- ToLabel
data
#> Protein ToLabel
#> 1 gene1 name1
#> 2 gene2 no
#> 3 gene3 name3
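To make the intermediate step visible, this small sketch prints the index that match() returns before it is used to subset genes_names. It assumes a plain vector, protein, in place of trimws(data$Protein); the variable names idx and labels are illustrative only.

```r
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")

# Stand-in for trimws(data$Protein) (assumption)
protein <- c("gene1", "gene2", "gene3")

# For each protein: its position in genes, or NA if it is absent
idx <- match(protein, genes)
print(idx)

# Subsetting with NA yields NA, which is then replaced by the default
labels <- genes_names[idx]
labels[is.na(labels)] <- "no"
print(labels)
```

Because match() is vectorised over its first argument, this avoids both the row-wise loop and the join, at the cost of handling the NA replacement by hand.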