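I have an input dataset, and I want to generate the output dataset by recoding 1 as the name of the column and 0 as NA. I managed to do this by hand (see the manual solution further down), but my real dataset has hundreds of columns, so I am looking for a way to automate the process.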
library(tibble)
library(dplyr)
Input:
input <- tibble(a = c(1, 0, 0, 1, 0),
                b = c(0, 0, 0, 1, 1),
                c = c(1, 1, 1, 1, 1),
                d = c(0, 0, 0, 0, 0))
# # A tibble: 5 × 4
#       a     b     c     d
#   <dbl> <dbl> <dbl> <dbl>
# 1     1     0     1     0
# 2     0     0     1     0
# 3     0     0     1     0
# 4     1     1     1     0
# 5     0     1     1     0
Desired output:
output <- tibble(a = c("a", NA, NA, "a", NA),
                 b = c(NA, NA, NA, "b", "b"),
                 c = c("c", "c", "c", "c", "c"),
                 d = c(NA, NA, NA, NA, NA))
# # A tibble: 5 × 4
#   a     b     c     d
#   <chr> <chr> <chr> <lgl>
# 1 a     NA    c     NA
# 2 NA    NA    c     NA
# 3 NA    NA    c     NA
# 4 a     b     c     NA
# 5 NA    b     c     NA
Non-optimal solution (manual):
# One case_when() per column: works, but does not scale to hundreds of columns
input %>%
  mutate(a = case_when(a == 1 ~ "a",
                       TRUE ~ NA_character_),
         b = case_when(b == 1 ~ "b",
                       TRUE ~ NA_character_),
         c = case_when(c == 1 ~ "c",
                       TRUE ~ NA_character_),
         d = case_when(d == 1 ~ "d",
                       TRUE ~ NA_character_))
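We can use across() together with ifelse(). Inside across(), cur_column() returns the name of the column currently being processed, so each 1 is replaced by its column's name and everything else becomes NA: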
library(dplyr)
input %>%
mutate(across(everything(), ~ifelse(. == 1, cur_column(), NA)))
  a     b     c     d
  <chr> <chr> <chr> <lgl>
1 a     NA    c     NA
2 NA    NA    c     NA
3 NA    NA    c     NA
4 a     b     c     NA
5 NA    b     c     NA
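
Note that d comes back as <lgl> because it contains no 1s, so ifelse() only ever returns a bare NA for it. If every recoded column should be character regardless of its contents, one possible variation (a sketch, not part of the original answer) is to use dplyr's if_else() with NA_character_. The names result and partial below are just illustrative, and the second call assumes you only want to recode a subset of columns (here a and b):

```r
library(tibble)
library(dplyr)

input <- tibble(a = c(1, 0, 0, 1, 0),
                b = c(0, 0, 0, 1, 1),
                c = c(1, 1, 1, 1, 1),
                d = c(0, 0, 0, 0, 0))

# if_else() is strict about types, so every column is returned as <chr>,
# including d, which has no 1s at all.
result <- input %>%
  mutate(across(everything(),
                ~ if_else(.x == 1, cur_column(), NA_character_)))

# Illustrative variation: recode only a and b, leaving c and d numeric.
partial <- input %>%
  mutate(across(c(a, b),
                ~ if_else(.x == 1, cur_column(), NA_character_)))

result
partial
```

With hundreds of columns, everything() can be swapped for any tidyselect helper, such as where(is.numeric) or starts_with(), to target only the columns that hold 0/1 flags.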