r - How to rename mlr3 task feature values in a pipeline



I have an mlr3 task:

df <- data.frame(v1 = c("a", "b", "a"),
v2 = c(1, 2, 2),
data = c(3.15, 4.11, 3.56))
library(mlr3)
task <- TaskRegr$new("bmsp", df, target = "data")

How can I rename the value "a" of feature "v1" to the value "c" inside a pipeline?

The code:

library(mlr3)
library(mlr3pipelines)
df <- data.frame(v1 = c("a", "b", "a"),
v2 = c(1, 2, 2),
data = c(3.15, 4.11, 3.56))
task <- TaskRegr$new("bmsp", df, target = "data")

pop <- po("colapply",
applicator =  function(x) ifelse(x == "a", "c", x))

pop$param_set$values$affect_columns = selector_name("v1")
pop$train(list(task))[[1]]$data()

gives this output (see column v1, row 2):

data v1 v2
1 3.15 c  1 
2 4.11 2  2 
3 3.56 c  2 

but the required output is:

data v1 v2
1 3.15 c  1 
2 4.11 b  2 
3 3.56 c  2 

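This is quite straightforward with PipeOpColApply.

We need to define a function (the applicator) that takes the supplied input and performs the requested operation.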

library(mlr3)
library(mlr3pipelines)
pop <- po("colapply",
applicator =  function(x) ifelse(x == "a", "c", x))
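If v1 might be stored as a factor rather than a character vector, a type-aware applicator is one option. The following is a minimal base-R sketch, not part of mlr3pipelines; `rename_value` and its `from`/`to` arguments are hypothetical names chosen for illustration.

```r
# Hypothetical helper: replace one value in a character or factor vector.
rename_value <- function(x, from = "a", to = "c") {
  if (is.factor(x)) {
    # For factors, rename the level itself so the result stays a factor.
    levels(x)[levels(x) == from] <- to
    return(x)
  }
  # For other vectors, overwrite matching elements (%in% is NA-safe).
  x[x %in% from] <- to
  x
}

rename_value(c("a", "b", "a"))
rename_value(factor(c("a", "b", "a")))
```

Because the defaults already map "a" to "c", such a function could be passed directly as `applicator = rename_value` in the `po("colapply", ...)` call.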

We also need to specify which columns the function will operate on:

pop$param_set$values$affect_columns = selector_name("v1")
pop$train(list(task))[[1]]$data()
#output
data v1 v2
1: 3.15  c  1
2: 4.11  b  2
3: 3.56  c  2

This is very similar to the example in the function's help page.
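One possible explanation for the "2" in the question's output, offered as an assumption rather than a confirmed diagnosis: if v1 is stored as a factor (for example under R versions before 4.0, where data.frame() defaults to stringsAsFactors = TRUE), ifelse() fills the non-matching positions with the factor's integer codes. A base-R sketch of that behaviour:

```r
x_chr <- c("a", "b", "a")   # character vector, the data.frame() default in R >= 4.0
x_fct <- factor(x_chr)      # factor with levels "a" (code 1) and "b" (code 2)

# Character input: non-matching elements are kept as they are.
res_chr <- ifelse(x_chr == "a", "c", x_chr)
res_chr  # "c" "b" "c"

# Factor input: non-matching elements come back as integer codes.
res_fct <- ifelse(x_fct == "a", "c", x_fct)
res_fct  # "c" "2" "c"

# Converting to character first avoids the problem.
res_fix <- ifelse(x_fct == "a", "c", as.character(x_fct))
res_fix  # "c" "b" "c"
```

Under that assumption, wrapping x in as.character() inside the applicator would give the expected result regardless of how the column is stored, at the cost of returning a character column instead of a factor.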

Data:

df <- data.frame(v1 = c("a", "b", "a"),
v2 = c(1, 2, 2),
data = c(3.15, 4.11, 3.56))
task <- TaskRegr$new("bmsp", df, target = "data")

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
Matrix products: default
Random number generation:
RNG:     Mersenne-Twister 
Normal:  Inversion 
Sample:  Rounding 

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                           LC_TIME=English_United States.1252    
attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] mlr3pipelines_0.3.0-9000 mlr3_0.7.0               Biostrings_2.56.0        XVector_0.28.0           IRanges_2.22.2           S4Vectors_0.26.1         BiocGenerics_0.34.0     
loaded via a namespace (and not attached):
[1] Biobase_2.48.0       httr_1.4.2           bit64_4.0.5          splines_4.0.2        foreach_1.5.0        prodlim_2019.11.13   assertthat_0.2.1     lgr_0.3.4            askpass_1.1         
[10] BiocFileCache_1.12.1 blob_1.2.1           mlr3misc_0.5.0       progress_1.2.2       ipred_0.9-9          backports_1.1.10     pillar_1.4.6         RSQLite_2.2.1        lattice_0.20-41     
[19] glue_1.4.2           uuid_0.1-4           pROC_1.16.2          digest_0.6.25        checkmate_2.0.0      colorspace_1.4-1     recipes_0.1.13       Matrix_1.2-18        plyr_1.8.6          
[28] timeDate_3043.102    XML_3.99-0.5         pkgconfig_2.0.3      biomaRt_2.44.1       caret_6.0-86         zlibbioc_1.34.0      purrr_0.3.4          scales_1.1.1         gower_0.2.2         
[37] lava_1.6.8           tibble_3.0.3         openssl_1.4.3        generics_0.0.2       ggplot2_3.3.2        ellipsis_0.3.1       withr_2.3.0          nnet_7.3-14          paradox_0.4.0-9000  
[46] survival_3.1-12      magrittr_1.5         crayon_1.3.4         memoise_1.1.0        nlme_3.1-148         MASS_7.3-51.6        class_7.3-17         tools_4.0.2          data.table_1.13.0   
[55] prettyunits_1.1.1    hms_0.5.3            lifecycle_0.2.0      stringr_1.4.0        munsell_0.5.0        glmnet_4.0-2         AnnotationDbi_1.50.3 compiler_4.0.2       tinytex_0.26        
[64] rlang_0.4.7          grid_4.0.2           iterators_1.0.12     rstudioapi_0.11      rappdirs_0.3.1       gtable_0.3.0         ModelMetrics_1.2.2.2 codetools_0.2-16     DBI_1.1.0           
[73] curl_4.3             reshape2_1.4.4       R6_2.4.1             lubridate_1.7.9      dplyr_1.0.2          bit_4.0.4            biomartr_0.9.2       shape_1.4.5          stringi_1.5.3       
[82] Rcpp_1.0.5           vctrs_0.3.4          rpart_4.1-15         dbplyr_1.4.4         tidyselect_1.1.0     xfun_0.18           
