我正在研究机器学习&;《预测建模的专家技术》,作者:Brett Lantz。当我在r中尝试示例建模练习时,我正在使用tidymodels
套件。
我正在完成第5章,其中您使用C5.0算法构建决策树。我已经使用下面的代码创建了模型
c5_v1 <- C5_rules() %>%
set_mode('classification') %>%
set_engine('C5.0')
c5_res_1 <- fit(object = c5_v1, formula = default ~., data = credit_train)
操作成功:
parsnip model object
Call:
C5.0.default(x = x, y = y, trials = trials, rules = TRUE, control
= C50::C5.0Control(minCases = minCases, seed = sample.int(10^5, 1), earlyStopping
= FALSE))
Rule-Based Model
Number of samples: 900
Number of predictors: 20
Number of Rules: 22
Non-standard options: attempt to group attributes
尽我所能,谷歌尽我所能,阅读parsnips
文档等,我无法找到如何查看决策树。谁能告诉我如何查看它创建的实际树?
注意C5_rules()
是规则拟合模型的规范。因此,在与C5_rules()
拟合之后,您不应该期望输出是一棵决策树,而是一组规则。
使用C5.0引擎,您可以同时获得决策树输出和规则输出。对于拟合的模型,运行extract_fit_engine()
以获得嵌入在防风草模型拟合中的引擎特定拟合,然后运行summary()
以提取输出。
library(tidymodels)
library(rules)
#>
#> Attaching package: 'rules'
#> The following object is masked from 'package:dials':
#>
#> max_rules
data(penguins, package = "modeldata")
#model specification
C5_decision_tree <- decision_tree() |>
set_engine("C5.0") |>
set_mode("classification")
C5_rules <- C5_rules() |>
#no need to set engine because only C5.0 is used for C5_rules()
#verify with show_engines("C5_rules")
set_mode("classification")
#fitting the models
C5_decision_tree_fitted <- C5_decision_tree |>
fit(species ~ ., data = penguins)
C5_rules_fitted <- C5_rules |>
fit(species ~ ., data = penguins)
#extracting decision tree
C5_decision_tree_fitted |>
extract_fit_engine() |>
summary()
#>
#> Call:
#> C5.0.default(x = x, y = y, trials = 1, control = C50::C5.0Control(minCases =
#> 2, sample = 0))
#>
#>
#> C5.0 [Release 2.07 GPL Edition] Mon Jul 4 09:32:16 2022
#> -------------------------------
#>
#> Class specified by attribute `outcome'
#>
#> Read 333 cases (7 attributes) from undefined.data
#>
#> Decision tree:
#>
#> flipper_length_mm > 206:
#> :...island = Biscoe: Gentoo (118)
#> : island in {Dream,Torgersen}:
#> : :...bill_length_mm <= 46.5: Adelie (2)
#> : bill_length_mm > 46.5: Chinstrap (5)
#> flipper_length_mm <= 206:
#> :...bill_length_mm > 43.3:
#> :...island in {Biscoe,Torgersen}: Adelie (4/1)
#> : island = Dream: Chinstrap (59/1)
#> bill_length_mm <= 43.3:
#> :...bill_length_mm <= 42.3: Adelie (134/1)
#> bill_length_mm > 42.3:
#> :...sex = female: Chinstrap (4)
#> sex = male: Adelie (7)
#>
#>
#> Evaluation on training data (333 cases):
#>
#> Decision Tree
#> ----------------
#> Size Errors
#>
#> 8 3( 0.9%) <<
#>
#>
#> (a) (b) (c) <-classified as
#> ---- ---- ----
#> 145 1 (a): class Adelie
#> 1 67 (b): class Chinstrap
#> 1 118 (c): class Gentoo
#>
#>
#> Attribute usage:
#>
#> 100.00% flipper_length_mm
#> 64.56% bill_length_mm
#> 56.46% island
#> 3.30% sex
#>
#>
#> Time: 0.0 secs
#extracting rules
C5_rules_fitted |>
extract_fit_engine() |>
summary()
#>
#> Call:
#> C5.0.default(x = x, y = y, trials = trials, rules = TRUE, control
#> = C50::C5.0Control(minCases = minCases, seed = sample.int(10^5,
#> 1), earlyStopping = FALSE))
#>
#>
#> C5.0 [Release 2.07 GPL Edition] Mon Jul 4 09:32:16 2022
#> -------------------------------
#>
#> Class specified by attribute `outcome'
#>
#> Read 333 cases (7 attributes) from undefined.data
#>
#> Rules:
#>
#> Rule 1: (68, lift 2.2)
#> bill_length_mm <= 43.3
#> sex = male
#> -> class Adelie [0.986]
#>
#> Rule 2: (208/64, lift 1.6)
#> flipper_length_mm <= 206
#> -> class Adelie [0.690]
#>
#> Rule 3: (48, lift 4.8)
#> island = Dream
#> bill_length_mm > 46.5
#> -> class Chinstrap [0.980]
#>
#> Rule 4: (34/1, lift 4.6)
#> bill_length_mm > 42.3
#> flipper_length_mm <= 206
#> sex = female
#> -> class Chinstrap [0.944]
#>
#> Rule 5: (118, lift 2.8)
#> island = Biscoe
#> flipper_length_mm > 206
#> -> class Gentoo [0.992]
#>
#> Default class: Adelie
#>
#>
#> Evaluation on training data (333 cases):
#>
#> Rules
#> ----------------
#> No Errors
#>
#> 5 2( 0.6%) <<
#>
#>
#> (a) (b) (c) <-classified as
#> ---- ---- ----
#> 146 (a): class Adelie
#> 1 67 (b): class Chinstrap
#> 1 118 (c): class Gentoo
#>
#>
#> Attribute usage:
#>
#> 97.90% flipper_length_mm
#> 49.85% island
#> 40.84% bill_length_mm
#> 30.63% sex
#>
#>
#> Time: 0.0 secs
由reprex包(v2.0.1)创建于2022-07-04