Visualizing a decision tree (example from scikit-learn)



I'm new to scikit-learn, so please bear with me.

I'm working through this example: http://scikit-learn.org/stable/modules/tree.html#tree

>>> from sklearn.datasets import load_iris
>>> from sklearn import tree
>>> iris = load_iris()
>>> clf = tree.DecisionTreeClassifier()
>>> clf = clf.fit(iris.data, iris.target)
>>> from StringIO import StringIO
>>> out = StringIO()
>>> out = tree.export_graphviz(clf, out_file=out)

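Apparently, the Graphviz data is now ready to use.

But how do I draw the tree from the Graphviz data? (The example does not explain in detail how the tree is drawn.)

Example code and tips are more than welcome!

Thanks!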


Update

I'm using Ubuntu 12.04 and Python 2.7.3.

Which OS are you running? Do you have Graphviz installed?

In your example, the StringIO() object holds the Graphviz data. Here is one way to inspect it:

...
>>> print out.getvalue()
digraph Tree {
0 [label="X[2] <= 2.4500\nerror = 0.666667\nsamples = 150\nvalue = [ 50.  50.  50.]", shape="box"] ;
1 [label="error = 0.0000\nsamples = 50\nvalue = [ 50.   0.   0.]", shape="box"] ;
0 -> 1 ;
2 [label="X[3] <= 1.7500\nerror = 0.5\nsamples = 100\nvalue = [  0.  50.  50.]", shape="box"] ;
0 -> 2 ;
3 [label="X[2] <= 4.9500\nerror = 0.168038\nsamples = 54\nvalue = [  0.  49.   5.]", shape="box"] ;
2 -> 3 ;
4 [label="X[3] <= 1.6500\nerror = 0.0407986\nsamples = 48\nvalue = [  0.  47.   1.]", shape="box"] ;
3 -> 4 ;
5 [label="error = 0.0000\nsamples = 47\nvalue = [  0.  47.   0.]", shape="box"] ;
4 -> 5 ;
6 [label="error = 0.0000\nsamples = 1\nvalue = [ 0.  0.  1.]", shape="box"] ;
4 -> 6 ;
7 [label="X[3] <= 1.5500\nerror = 0.444444\nsamples = 6\nvalue = [ 0.  2.  4.]", shape="box"] ;
3 -> 7 ;
8 [label="error = 0.0000\nsamples = 3\nvalue = [ 0.  0.  3.]", shape="box"] ;
7 -> 8 ;
9 [label="X[0] <= 6.9500\nerror = 0.444444\nsamples = 3\nvalue = [ 0.  2.  1.]", shape="box"] ;
7 -> 9 ;
10 [label="error = 0.0000\nsamples = 2\nvalue = [ 0.  2.  0.]", shape="box"] ;
9 -> 10 ;
11 [label="error = 0.0000\nsamples = 1\nvalue = [ 0.  0.  1.]", shape="box"] ;
9 -> 11 ;
12 [label="X[2] <= 4.8500\nerror = 0.0425331\nsamples = 46\nvalue = [  0.   1.  45.]", shape="box"] ;
2 -> 12 ;
13 [label="X[0] <= 5.9500\nerror = 0.444444\nsamples = 3\nvalue = [ 0.  1.  2.]", shape="box"] ;
12 -> 13 ;
14 [label="error = 0.0000\nsamples = 1\nvalue = [ 0.  1.  0.]", shape="box"] ;
13 -> 14 ;
15 [label="error = 0.0000\nsamples = 2\nvalue = [ 0.  0.  2.]", shape="box"] ;
13 -> 15 ;
16 [label="error = 0.0000\nsamples = 43\nvalue = [  0.   0.  43.]", shape="box"] ;
12 -> 16 ;
12 -> 16 ;
}
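
For readers on Python 3, here is a minimal sketch of the same inspection step. It assumes a recent scikit-learn in which export_graphviz returns the DOT source as a string when out_file=None and returns nothing when it is given a file-like object. The random_state=0 argument is an addition for repeatability and is not part of the original example.

```python
from io import StringIO  # Python 3 home of StringIO (the old StringIO module is gone)

from sklearn import tree
from sklearn.datasets import load_iris

iris = load_iris()
# random_state is an assumption added here so repeated runs build the same tree
clf = tree.DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Option 1: with out_file=None the DOT source comes back as a string.
dot_data = tree.export_graphviz(clf, out_file=None)

# Option 2: write into a file-like object. Keep the handle and do not
# overwrite it with the return value of export_graphviz.
out = StringIO()
tree.export_graphviz(clf, out_file=out)

# Show the beginning of the DOT source.
print(dot_data[:80])
```

Either dot_data or out.getvalue() can then be saved to a .dot file or handed to a Graphviz wrapper.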

You can write it to a .dot file and generate image output, as shown in the source you linked:

$ dot -Tpng tree.dot -o tree.png    (PNG format output)
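
The same two steps (save the text to a .dot file, then call dot) can be sketched from Python using only the standard library. The helper names write_dot and render_png and the tiny example_source graph are hypothetical and used only for illustration. The sketch assumes the Graphviz dot executable is on the PATH and simply returns None when it is not.

```python
import shutil
import subprocess
from pathlib import Path


def write_dot(dot_source, dot_path="tree.dot"):
    """Save Graphviz source text to a .dot file and return its path."""
    path = Path(dot_path)
    path.write_text(dot_source, encoding="utf-8")
    return path


def render_png(dot_path, png_path="tree.png"):
    """Run `dot -Tpng`; return the PNG path, or None if `dot` is not installed."""
    if shutil.which("dot") is None:
        return None
    subprocess.run(
        ["dot", "-Tpng", str(dot_path), "-o", str(png_path)],
        check=True,  # raise if Graphviz reports an error
    )
    return Path(png_path)


# Tiny hand-written graph standing in for out.getvalue()
example_source = (
    "digraph Tree {\n"
    '0 [label="root", shape="box"] ;\n'
    '1 [label="leaf", shape="box"] ;\n'
    "0 -> 1 ;\n"
    "}\n"
)
```

With the question's code, something like render_png(write_dot(out.getvalue())) would produce tree.png in the current directory.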

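You're very close! Just do the following (graph_from_dot_data is assumed here to come from the pydot package; in newer pydot releases it may return a list of graphs, in which case call write_pdf on the first element):

from pydot import graph_from_dot_data
graph_from_dot_data(out.getvalue()).write_pdf("somefile.pdf")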
