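I need to create a recipe with the recipes package from tidymodels. In one of the steps, I need to convert ordered factors to their ordinal scores, but there seems to be no function for selecting all ordered factors.
I know there is a function called all_nominal(), but it matches every column that is a factor, whether ordered or unordered. I also tried has_type("ordered"), but that doesn't work either.
Currently, I have to type the column names by hand. Is there an easier way?
Here is an example of what I want to do: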
library(recipes)
library(mlbench)
data("BreastCancer")
rec <- recipe(Class ~ ., BreastCancer) %>%
  # Here, I want to select all ordered nominals instead of
  # listing them by name. Should there be a function
  # all_ordinal in recipes? Or is there another way
  # to accomplish this?
  step_ordinalscore(Cl.thickness,
                    Cell.size,
                    Cell.shape,
                    Marg.adhesion,
                    Epith.c.size)
Any help is welcome, thanks.
There is no special selector function for ordered factors, but you can find them yourself and then pass the vector of names to step_ordinalscore().
library(recipes)
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
#>
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#>
#> step
library(mlbench)
data("BreastCancer")
## find all the ordered factors
ordered_names <- BreastCancer %>%
  select(where(is.ordered)) %>%
  names()
ordered_names
#> [1] "Cl.thickness" "Cell.size" "Cell.shape" "Marg.adhesion"
#> [5] "Epith.c.size"
## convert all the ordered factors to scores
rec <- recipe(Class ~ ., BreastCancer) %>%
  step_ordinalscore(all_of(ordered_names))
rec
#> Data Recipe
#>
#> Inputs:
#>
#> role #variables
#> outcome 1
#> predictor 10
#>
#> Operations:
#>
#> Scoring for all_of(ordered_names)
prep(rec) %>% bake(new_data = NULL)
#> # A tibble: 699 x 11
#> Id Cl.thickness Cell.size Cell.shape Marg.adhesion Epith.c.size
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1000… 5 1 1 1 2
#> 2 1002… 5 4 4 5 7
#> 3 1015… 3 1 1 1 2
#> 4 1016… 6 8 8 1 3
#> 5 1017… 4 1 1 3 2
#> 6 1017… 8 10 10 8 7
#> 7 1018… 1 1 1 1 2
#> 8 1018… 2 1 2 1 2
#> 9 1033… 2 1 1 1 2
#> 10 1033… 4 2 1 1 2
#> # … with 689 more rows, and 5 more variables: Bare.nuclei <fct>,
#> # Bl.cromatin <fct>, Normal.nucleoli <fct>, Mitoses <fct>, Class <fct>
Created on 2020-10-19 by the reprex package (v0.3.0.9001)
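If you would rather not depend on dplyr for the lookup, the same vector of names can be built in base R. The following is only a minimal sketch on a made-up data frame (`df` and its columns are hypothetical, not taken from BreastCancer). It also shows the scores you would expect under the assumption that step_ordinalscore() keeps its default conversion of as.numeric(), which maps each ordered level to its position.

```r
# Hypothetical toy data: two ordered factors, one unordered factor, one integer
df <- data.frame(
  id    = 1:3,
  size  = factor(c("S", "M", "L"), levels = c("S", "M", "L"), ordered = TRUE),
  grade = factor(c("low", "high", "low"), levels = c("low", "high"), ordered = TRUE),
  color = factor(c("red", "blue", "red"))
)

# vapply() runs is.ordered() on every column and returns a named logical vector
is_ord <- vapply(df, is.ordered, logical(1))

# Keep only the names of the ordered-factor columns
ordered_names <- names(df)[is_ord]
ordered_names

# as.numeric() on an ordered factor returns the position of each level,
# which is what the ordinal scores look like under the default conversion
scores <- lapply(df[ordered_names], as.numeric)
scores
```

The resulting `ordered_names` can be passed to step_ordinalscore(all_of(ordered_names)) exactly as in the answer above.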