How to convert two columns into 0/1 indicator columns

I have a data frame with the following columns:

Patient   Gene
1         A
1         B
2         A
2         B
2         C
3         A
3         C
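
For reproducibility, the table above can be built as follows. This is a minimal sketch that assumes Patient is stored as an integer and Gene as a character column; the name df matches what the answers use.

```r
# Rebuild the example data shown in the question.
# Assumption: Patient is integer, Gene is character.
df <- data.frame(
  Patient = c(1L, 1L, 2L, 2L, 2L, 3L, 3L),
  Gene = c("A", "B", "A", "B", "C", "A", "C"),
  stringsAsFactors = FALSE
)

# Inspect the structure: 7 rows, 2 columns
str(df)
```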

I want to reshape it to wide format, so that each patient is a row and each gene is a column, like this:

          GeneA   GeneB   GeneC
Patient1      1       1       0
Patient2      1       1       1
Patient3      1       0       1

Using pivot_wider:

library(tidyr)
library(dplyr)

df %>% 
  mutate(value = 1) %>%                       # mark every Patient/Gene pair as present
  pivot_wider(names_from = Gene, values_from = value,
              values_fill = 0, names_prefix = "Gene")  # absent pairs are filled with 0

Output:

# A tibble: 3 × 4
  Patient GeneA GeneB GeneC
    <int> <dbl> <dbl> <dbl>
1       1     1     1     0
2       2     1     1     1
3       3     1     0     1

Another option with fastDummies::dummy_cols:

library(fastDummies)

df %>% 
  dummy_cols("Gene", remove_selected_columns = TRUE) %>%  # one 0/1 column per gene
  group_by(Patient) %>% 
  summarise(across(everything(), max))                    # collapse to one row per patient

Another option, using data.table::dcast:

library(data.table)

df <- data.frame(
  stringsAsFactors = FALSE,
  Patient = c(1L, 1L, 2L, 2L, 2L, 3L, 3L),
  Gene = c("A", "B", "A", "B", "C", "A", "C")
)
df
#>   Patient Gene
#> 1       1    A
#> 2       1    B
#> 3       2    A
#> 4       2    B
#> 5       2    C
#> 6       3    A
#> 7       3    C
setDT(df)  # convert to a data.table by reference
dcast(
  data = df,
  formula = Patient ~ paste("Gene", Gene),
  fun.aggregate = function(x) sum(!is.na(x))  # count non-missing entries per Patient/Gene cell
)
#> Using 'Gene' as value column. Use 'value.var' to override
#>    Patient Gene A Gene B Gene C
#> 1:       1      1      1      0
#> 2:       2      1      1      1
#> 3:       3      1      0      1

Created on 2022-10-06 with reprex v2.0.2
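
For comparison, here is a base R sketch that needs no extra packages. It is not taken from the answers above: it swaps in table() to cross-tabulate Patient against Gene and then thresholds the counts, so a repeated Patient/Gene pair would still yield 1. The object names tab and wide are only illustrative.

```r
# Example data, rebuilt here so the block runs on its own
df <- data.frame(
  Patient = c(1L, 1L, 2L, 2L, 2L, 3L, 3L),
  Gene = c("A", "B", "A", "B", "C", "A", "C"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are patients, columns are genes, cells are counts
tab <- table(df$Patient, df$Gene)

# Turn counts into 0/1 indicators and convert the table to a data frame
wide <- as.data.frame.matrix((tab > 0) * 1L)

# Label rows and columns as in the desired output
rownames(wide) <- paste0("Patient", rownames(wide))
colnames(wide) <- paste0("Gene", colnames(wide))

wide
```

Unlike the pivot_wider and dcast results, the patient identifier ends up in the row names rather than in a Patient column.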
