R - Mapping a mutate function explodes the number of rows in the output tibble



Suppose I have data like this:

d <- tibble::tribble(
~sit_comfy_sofa_1, ~sit_comfy_sofa_2, ~sit_comfy_sofa_3, ~sit_comfy_sofa_4, ~sit_comfy_couch_1, ~sit_comfy_couch_2, ~sit_comfy_couch_3, ~sit_comfy_couch_4, ~sit_comfy_settee_1, ~sit_comfy_settee_2, ~sit_comfy_settee_3, ~sit_comfy_settee_4,
1L,                0L,                0L,                0L,                 0L,                 1L,                 0L,                 0L,                  0L,                  0L,                  1L,                  0L,
0L,                0L,                0L,                1L,                 0L,                 0L,                 0L,                 1L,                  0L,                  1L,                  0L,                  0L,
0L,                1L,                0L,                0L,                 1L,                 0L,                 0L,                 0L,                  1L,                  0L,                  0L,                  0L,
0L,                0L,                1L,                0L,                 0L,                 0L,                 1L,                 0L,                  0L,                  0L,                  0L,                  1L
)

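This tibble has three column "categories": one for _sofa_, one for _couch_, and one for _settee_. I am trying to look across each category and build a new variable whose value depends on which column within that category == 1.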

I wrote this function to try to do that:

cleaning_fcn <- function(.df, .x){
  .df %>%
    mutate(!!sym(paste0("explain_", .x)) := case_when(
      !!sym(paste0("sit_comfy_", .x, "_1")) == 1 ~ "Just better",
      !!sym(paste0("sit_comfy_", .x, "_2")) == 1 ~ "Nice shape",
      !!sym(paste0("sit_comfy_", .x, "_3")) == 1 ~ "Like the color",
      !!sym(paste0("sit_comfy_", .x, "_4")) == 1 ~ "Nice material"),
      !!sym(paste0("explain_", .x)) := factor(!!sym(paste0("explain_", .x)),
                                              levels = c("Just better", "Nice shape",
                                                         "Like the color", "Nice material")))
}

However, when I call it, I end up with a tibble that has three times as many rows as the original.

require(tidyverse)
purrr::map_dfr(
.x = tidyselect::all_of(c("sofa", "couch", "settee")),
.f = ~ cleaning_fcn(.df = d, .x))

Can anyone see where I am going wrong?

Essentially, I want to achieve the same thing as the code below, but ideally as a function (and generally with less repetition):

d <- d %>%
  mutate(explain_sofa = case_when(
    sit_comfy_sofa_1 == 1 ~ "Just better",
    sit_comfy_sofa_2 == 1 ~ "Nice shape",
    sit_comfy_sofa_3 == 1 ~ "Like the color",
    sit_comfy_sofa_4 == 1 ~ "Nice material"),
    explain_sofa = factor(explain_sofa, levels = c("Just better", "Nice shape",
                                                   "Like the color", "Nice material")))
d <- d %>%
  mutate(explain_couch = case_when(
    sit_comfy_couch_1 == 1 ~ "Just better",
    sit_comfy_couch_2 == 1 ~ "Nice shape",
    sit_comfy_couch_3 == 1 ~ "Like the color",
    sit_comfy_couch_4 == 1 ~ "Nice material"),
    explain_couch = factor(explain_couch, levels = c("Just better", "Nice shape",
                                                     "Like the color", "Nice material")))
d <- d %>%
  mutate(explain_settee = case_when(
    sit_comfy_settee_1 == 1 ~ "Just better",
    sit_comfy_settee_2 == 1 ~ "Nice shape",
    sit_comfy_settee_3 == 1 ~ "Like the color",
    sit_comfy_settee_4 == 1 ~ "Nice material"),
    explain_settee = factor(explain_settee, levels = c("Just better", "Nice shape",
                                                       "Like the color", "Nice material")))

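With map_dfr you create a list of data frames, one per category, each built from the original d, and then bind them by row. That is why the result has three times as many rows as the original. One option is to use purrr::reduce, which passes the result of each call on to the next one, so the new columns accumulate on a single data frame: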

library(tidyverse)
purrr::reduce(.x = c("sofa", "couch", "settee"), .f = cleaning_fcn, .init = d)
#> # A tibble: 4 × 15
#>   sit_comfy_sofa_1 sit_comfy_sofa_2 sit_comfy_sofa_3 sit_comfy_sofa_4
#>              <int>            <int>            <int>            <int>
#> 1                1                0                0                0
#> 2                0                0                0                1
#> 3                0                1                0                0
#> 4                0                0                1                0
#> # ℹ 11 more variables: sit_comfy_couch_1 <int>, sit_comfy_couch_2 <int>,
#> #   sit_comfy_couch_3 <int>, sit_comfy_couch_4 <int>, sit_comfy_settee_1 <int>,
#> #   sit_comfy_settee_2 <int>, sit_comfy_settee_3 <int>,
#> #   sit_comfy_settee_4 <int>, explain_sofa <fct>, explain_couch <fct>,
#> #   explain_settee <fct>
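
To make the difference concrete, here is a minimal sketch on a cut-down data set. The names d_small and add_explain are hypothetical stand-ins (not from the question), and the function only handles two columns per category; it is meant to illustrate the row counts, not to replace cleaning_fcn.

```r
library(dplyr)
library(purrr)

# Hypothetical cut-down data: two categories, two columns each, two rows.
d_small <- tibble::tribble(
  ~sit_comfy_sofa_1, ~sit_comfy_sofa_2, ~sit_comfy_couch_1, ~sit_comfy_couch_2,
  1L,                0L,                0L,                 1L,
  0L,                1L,                1L,                 0L
)

# Simplified stand-in for cleaning_fcn: adds one explain_<category> column.
add_explain <- function(.df, .x) {
  .df %>%
    mutate(!!rlang::sym(paste0("explain_", .x)) := case_when(
      !!rlang::sym(paste0("sit_comfy_", .x, "_1")) == 1 ~ "Just better",
      !!rlang::sym(paste0("sit_comfy_", .x, "_2")) == 1 ~ "Nice shape"
    ))
}

cats <- c("sofa", "couch")

# map() calls the function on the ORIGINAL data once per category,
# so each list element carries only its own explain_* column.
per_category <- map(cats, ~ add_explain(d_small, .x))

# Binding by row is the step map_dfr() performs after mapping:
# 2 categories x 2 rows = 4 rows, with NA where a column was absent.
stacked <- bind_rows(per_category)
nrow(stacked)
#> [1] 4

# reduce() feeds each result into the next call, i.e.
# add_explain(add_explain(d_small, "sofa"), "couch"),
# so the row count stays the same and the columns accumulate.
chained <- reduce(cats, add_explain, .init = d_small)
dim(chained)
#> [1] 2 6
```

An ordinary for loop that reassigns the data frame on each iteration would give the same result as reduce here; reduce is just the more compact way to write that accumulation.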
