How do I extract the classification tree from this parsnip model in R?



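I'm working through Machine Learning with R: Expert Techniques for Predictive Modeling by Brett Lantz, and I'm using the tidymodels suite as I try the example modeling exercises in R.

I'm on Chapter 5, where you build a decision tree with the C5.0 algorithm. I created the model with the code below: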

library(tidymodels)
library(rules)  # provides C5_rules()

c5_v1 <- C5_rules() %>%
  set_mode('classification') %>%
  set_engine('C5.0')

c5_res_1 <- fit(object = c5_v1, formula = default ~ ., data = credit_train)

The fit succeeds and prints:

parsnip model object

Call:
C5.0.default(x = x, y = y, trials = trials, rules = TRUE, control
= C50::C5.0Control(minCases = minCases, seed = sample.int(10^5, 1), earlyStopping
= FALSE))
Rule-Based Model
Number of samples: 900 
Number of predictors: 20 
Number of Rules: 22 
Non-standard options: attempt to group attributes

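Try as I might, after Googling and reading the parsnip documentation, I can't find out how to view the decision tree. Can anyone tell me how to view the actual tree it created?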

Note that C5_rules() is the specification for a rule-based model. So after fitting with C5_rules(), you should expect a set of rules as output, not a decision tree.
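If you are unsure which kind of model a fit holds, one way to check is to pull out the underlying C5.0 object with extract_fit_engine() and inspect it. The sketch below is only an illustration on the penguins data from modeldata; it assumes the C5.0 object keeps a logical `rbm` field (TRUE for a rule-based model), which is how the C50 print method appears to choose between "Rule-Based Model" and "Classification Tree". Treat that field as an internal detail that could change.

```r
library(tidymodels)
library(rules)

data(penguins, package = "modeldata")

# Same algorithm, two specifications: a tree and a rule set
tree_fit <- decision_tree() |>
  set_engine("C5.0") |>
  set_mode("classification") |>
  fit(species ~ ., data = penguins)

rules_fit <- C5_rules() |>
  set_mode("classification") |>
  fit(species ~ ., data = penguins)

# Underlying C50 objects
tree_engine <- extract_fit_engine(tree_fit)
rules_engine <- extract_fit_engine(rules_fit)

# Assumption: `rbm` is an internal flag of the C5.0 object
tree_engine$rbm
rules_engine$rbm

# Printing the engine fit also names the model type in its header
print(rules_engine)
```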

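The C5.0 engine can give you either a decision tree or a set of rules: use decision_tree() with set_engine("C5.0") for the tree, and C5_rules() for the rules. For a fitted model, run extract_fit_engine() to get the engine-specific fit embedded in the parsnip model fit, then call summary() on it to print the output.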

library(tidymodels)
library(rules)
#> 
#> Attaching package: 'rules'
#> The following object is masked from 'package:dials':
#> 
#>     max_rules
data(penguins, package = "modeldata")

# model specification
C5_decision_tree <- decision_tree() |>
  set_engine("C5.0") |>
  set_mode("classification")

C5_rules <- C5_rules() |>
  # no need to set the engine because C5.0 is the only engine for C5_rules()
  # verify with show_engines("C5_rules")
  set_mode("classification")

# fitting the models
C5_decision_tree_fitted <- C5_decision_tree |>
  fit(species ~ ., data = penguins)

C5_rules_fitted <- C5_rules |>
  fit(species ~ ., data = penguins)

# extracting the decision tree
C5_decision_tree_fitted |>
  extract_fit_engine() |>
  summary()
#> 
#> Call:
#> C5.0.default(x = x, y = y, trials = 1, control = C50::C5.0Control(minCases =
#>  2, sample = 0))
#> 
#> 
#> C5.0 [Release 2.07 GPL Edition]      Mon Jul  4 09:32:16 2022
#> -------------------------------
#> 
#> Class specified by attribute `outcome'
#> 
#> Read 333 cases (7 attributes) from undefined.data
#> 
#> Decision tree:
#> 
#> flipper_length_mm > 206:
#> :...island = Biscoe: Gentoo (118)
#> :   island in {Dream,Torgersen}:
#> :   :...bill_length_mm <= 46.5: Adelie (2)
#> :       bill_length_mm > 46.5: Chinstrap (5)
#> flipper_length_mm <= 206:
#> :...bill_length_mm > 43.3:
#>     :...island in {Biscoe,Torgersen}: Adelie (4/1)
#>     :   island = Dream: Chinstrap (59/1)
#>     bill_length_mm <= 43.3:
#>     :...bill_length_mm <= 42.3: Adelie (134/1)
#>         bill_length_mm > 42.3:
#>         :...sex = female: Chinstrap (4)
#>             sex = male: Adelie (7)
#> 
#> 
#> Evaluation on training data (333 cases):
#> 
#>      Decision Tree   
#>    ----------------  
#>    Size      Errors  
#> 
#>       8    3( 0.9%)   <<
#> 
#> 
#>     (a)   (b)   (c)    <-classified as
#>    ----  ----  ----
#>     145     1          (a): class Adelie
#>       1    67          (b): class Chinstrap
#>       1         118    (c): class Gentoo
#> 
#> 
#>  Attribute usage:
#> 
#>  100.00% flipper_length_mm
#>   64.56% bill_length_mm
#>   56.46% island
#>    3.30% sex
#> 
#> 
#> Time: 0.0 secs

# extracting the rules
C5_rules_fitted |>
  extract_fit_engine() |>
  summary()
#> 
#> Call:
#> C5.0.default(x = x, y = y, trials = trials, rules = TRUE, control
#>  = C50::C5.0Control(minCases = minCases, seed = sample.int(10^5,
#>  1), earlyStopping = FALSE))
#> 
#> 
#> C5.0 [Release 2.07 GPL Edition]      Mon Jul  4 09:32:16 2022
#> -------------------------------
#> 
#> Class specified by attribute `outcome'
#> 
#> Read 333 cases (7 attributes) from undefined.data
#> 
#> Rules:
#> 
#> Rule 1: (68, lift 2.2)
#>  bill_length_mm <= 43.3
#>  sex = male
#>  ->  class Adelie  [0.986]
#> 
#> Rule 2: (208/64, lift 1.6)
#>  flipper_length_mm <= 206
#>  ->  class Adelie  [0.690]
#> 
#> Rule 3: (48, lift 4.8)
#>  island = Dream
#>  bill_length_mm > 46.5
#>  ->  class Chinstrap  [0.980]
#> 
#> Rule 4: (34/1, lift 4.6)
#>  bill_length_mm > 42.3
#>  flipper_length_mm <= 206
#>  sex = female
#>  ->  class Chinstrap  [0.944]
#> 
#> Rule 5: (118, lift 2.8)
#>  island = Biscoe
#>  flipper_length_mm > 206
#>  ->  class Gentoo  [0.992]
#> 
#> Default class: Adelie
#> 
#> 
#> Evaluation on training data (333 cases):
#> 
#>          Rules     
#>    ----------------
#>      No      Errors
#> 
#>       5    2( 0.6%)   <<
#> 
#> 
#>     (a)   (b)   (c)    <-classified as
#>    ----  ----  ----
#>     146                (a): class Adelie
#>       1    67          (b): class Chinstrap
#>             1   118    (c): class Gentoo
#> 
#> 
#>  Attribute usage:
#> 
#>   97.90% flipper_length_mm
#>   49.85% island
#>   40.84% bill_length_mm
#>   30.63% sex
#> 
#> 
#> Time: 0.0 secs

Created on 2022-07-04 by the reprex package (v2.0.1)
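If you want the tree as text you can work with rather than just printed to the console, the following is a minimal sketch. It assumes the C5.0 object stores the printed report as a single string in its `output` element, and it uses C50::C5imp() for the attribute-usage figures. The variable names (`tree_engine`, `tree_only`, `usage`) are made up for this example, and the line matching relies on the report headings shown in the summary above.

```r
library(tidymodels)

data(penguins, package = "modeldata")

tree_engine <- decision_tree() |>
  set_engine("C5.0") |>
  set_mode("classification") |>
  fit(species ~ ., data = penguins) |>
  extract_fit_engine()

# Assumption: the text report that summary() prints lives in `output`
tree_text <- tree_engine$output

# Keep only the lines between the "Decision tree:" heading and the
# evaluation section that follows it
report_lines <- strsplit(tree_text, "\n")[[1]]
start <- grep("^Decision tree:", report_lines)[1]
end <- grep("^Evaluation on training data", report_lines)[1] - 1
tree_only <- report_lines[start:end]
cat(tree_only, sep = "\n")

# Attribute usage as a data frame instead of printed text
usage <- C50::C5imp(tree_engine, metric = "usage")
usage
```

From here, writeLines(tree_only, "tree.txt") would save the tree to a file if you need it outside the R session.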
