Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/27817994/

Time: 2020-08-19 02:19:25  Source: igfitidea

Visualizing decision tree in scikit-learn

Tags: python, scikit-learn, visualization, decision-tree

Asked by Ravi

I am trying to design a simple decision tree using scikit-learn in Python (I am using Anaconda's IPython Notebook with Python 2.7.3 on Windows) and visualize it as follows:

from pandas import read_csv, DataFrame
from sklearn import tree
from os import system

data = read_csv('D:/training.csv')
Y = data.Y
X = data.ix[:,"X0":"X33"]

dtree = tree.DecisionTreeClassifier(criterion = "entropy")
dtree = dtree.fit(X, Y)

dotfile = open("D:/dtree2.dot", 'w')
dotfile = tree.export_graphviz(dtree, out_file = dotfile, feature_names = X.columns)
dotfile.close()
system("dot -Tpng D:.dot -o D:/dtree2.png")

However, I get the following error:

AttributeError: 'NoneType' object has no attribute 'close'

I use the following blog post as reference: Blogpost link

The following StackOverflow question doesn't seem to work for me either: Question

Could someone help me with how to visualize the decision tree in scikit-learn?

Accepted answer by Ffisegydd

When it is given an out_file, sklearn.tree.export_graphviz writes to that file and doesn't return anything, so by default it returns None.

By doing dotfile = tree.export_graphviz(...) you overwrite your open file object, which had previously been assigned to dotfile, so you get an error when you try to close the file (as it's now None).

To fix it, change your code to:

...
dotfile = open("D:/dtree2.dot", 'w')
tree.export_graphviz(dtree, out_file = dotfile, feature_names = X.columns)
dotfile.close()
...
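
The same fix can also be written with a with-block, which closes the file automatically and leaves no handle to overwrite. This is only a minimal sketch: the iris dataset stands in for the asker's training.csv, and a temporary directory replaces the D:/ path (both are assumptions made so the snippet runs anywhere).

```python
import os
import tempfile

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Assumption: iris replaces the asker's training.csv, which is not available here.
iris = load_iris()
dtree = DecisionTreeClassifier(criterion="entropy", random_state=0)
dtree.fit(iris.data, iris.target)

# Assumption: a temporary directory replaces the D:/ path from the question.
dot_path = os.path.join(tempfile.mkdtemp(), "dtree2.dot")

# The with-block closes the file even if the export raises an exception.
with open(dot_path, "w") as dotfile:
    # Keep the return value in a separate name, never in dotfile itself.
    result = export_graphviz(dtree, out_file=dotfile,
                             feature_names=iris.feature_names)

# The DOT text went into the file, so there is nothing to return.
print(result is None)
```

Keeping the return value in its own name (result) makes it easy to confirm that nothing useful comes back when a file handle is passed.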

Answer by CSquare

Alternatively, you could try using pydot to produce the PNG file from the DOT data:

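...
tree.export_graphviz(dtreg, out_file='tree.dot')  # writes the DOT file to disk

from io import StringIO  # on Python 2: from StringIO import StringIO
import pydot

# Alternatively, keep the DOT source in memory and let pydot render it
dotfile = StringIO()
tree.export_graphviz(dtreg, out_file=dotfile)
graphs = pydot.graph_from_dot_data(dotfile.getvalue())
# pydot 1.2+ returns a list of graphs; older releases returned a single graph
graph = graphs[0] if isinstance(graphs, list) else graphs
graph.write_png("dtree2.png")  # needs the Graphviz dot executable on PATH
...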

Answer by FrancoisHawaii

If, like me, you have a problem installing graphviz, you can visualize the tree by:

  1. Exporting it with export_graphviz as shown in previous answers
  2. Opening the .dot file in a text editor
  3. Copying the DOT code and pasting it at webgraphviz.com
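
If opening the file in an editor is inconvenient, the DOT text can also be printed directly and copied from the console. A minimal sketch, assuming scikit-learn 0.18 or newer (where out_file=None returns the DOT source as a string) and using the iris dataset as a stand-in model:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Assumption: a small tree on iris stands in for your own fitted model.
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# With out_file=None the DOT source is returned instead of written to disk.
dot_text = export_graphviz(clf, out_file=None,
                           feature_names=iris.feature_names)

# Copy everything that is printed and paste it into the online viewer.
print(dot_text)
```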

Answer by saimadhu.polamuri

You can copy the contents of the file produced by export_graphviz and paste them into the webgraphviz.com site.

You can check out the article "How to visualize the decision tree in Python with graphviz" for more information.

Answer by singer

Here is a one-liner for those who are using jupyter and sklearn (0.18.2+). You don't even need matplotlib for that. The only requirement is graphviz:

pip install graphviz

then run (as in the question's code, X is a pandas DataFrame):

from graphviz import Source
from sklearn import tree
Source( tree.export_graphviz(dtreg, out_file=None, feature_names=X.columns))

This will display it in SVG format. The code above produces a Graphviz Source object (source_code, not scary) that is rendered directly in jupyter.

Some things you are likely to do with it:

Display it in jupyter:

from IPython.display import SVG
graph = Source( tree.export_graphviz(dtreg, out_file=None, feature_names=X.columns))
SVG(graph.pipe(format='svg'))

Save as png:

graph = Source( tree.export_graphviz(dtreg, out_file=None, feature_names=X.columns))
graph.format = 'png'
graph.render('dtree_render',view=True)

Get the png image, save it and view it:

graph = Source( tree.export_graphviz(dtreg, out_file=None, feature_names=X.columns))
png_bytes = graph.pipe(format='png')
with open('dtree_pipe.png','wb') as f:
    f.write(png_bytes)

from IPython.display import Image
Image(png_bytes)

If you are going to play with that lib, here are the links to examples and userguide.

Answer by louis_guitton

If you run into issues with grabbing the .dot source directly, you can also use Source.from_file like this:

from graphviz import Source
from sklearn import tree
tree.export_graphviz(dtreg, out_file='tree.dot', feature_names=X.columns)
Source.from_file('tree.dot')

Answer by vuminh91

I copied and changed part of your code as follows:

from pandas import read_csv, DataFrame
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from os import system

data = read_csv('D:/training.csv')
Y = data.Y
X = data.loc[:, "X0":"X33"]  # .ix was removed in pandas 1.0; .loc slices by label

dtree = tree.DecisionTreeClassifier(criterion = "entropy")
dtree = dtree.fit(X, Y)

After making sure you have dtree, which means the code above runs well, add the code below to visualize the decision tree.

Remember to install graphviz first: pip install graphviz

import graphviz 
from graphviz import Source
dot_data = tree.export_graphviz(dtree, out_file=None, feature_names=X.columns)
graph = graphviz.Source(dot_data) 
graph.render("name of file",view = True)

I tried it with my data; the visualization worked well and a PDF file opened immediately.

Answer by Jeril

The following also works fine:

from sklearn.datasets import load_iris
iris = load_iris()

# Model (can also use single decision tree)
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=10)

# Train
model.fit(iris.data, iris.target)
# Extract single tree
estimator = model.estimators_[5]

from sklearn.tree import export_graphviz
# Export as dot file
export_graphviz(estimator, out_file='tree.dot', 
                feature_names = iris.feature_names,
                class_names = iris.target_names,
                rounded = True, proportion = False, 
                precision = 2, filled = True)

# Convert to png using system command (requires Graphviz)
from subprocess import call
call(['dot', '-Tpng', 'tree.dot', '-o', 'tree.png', '-Gdpi=600'])

# Display in jupyter notebook
from IPython.display import Image
Image(filename = 'tree.png')

You can find the source here

Answer by Alexey Shrub

A simple way, found here, with pydotplus (graphviz must be installed):

from IPython.display import Image  
from sklearn import tree
import pydotplus # installing pyparsing maybe needed

...

...

dot_data = tree.export_graphviz(best_model, out_file=None, feature_names = X.columns)
graph = pydotplus.graph_from_dot_data(dot_data)
Image(graph.create_png())

Answer by yzerman

scikit-learn recently introduced the plot_tree function to make this very easy (new in version 0.21, May 2019). Documentation here.

Here's the minimum code you need:

import matplotlib.pyplot as plt  # plt is used below but was never imported
from sklearn import tree
plt.figure(figsize=(40,20))  # customize according to the size of your tree
_ = tree.plot_tree(your_model_name, feature_names = X.columns)
plt.show()

plot_tree supports some arguments to beautify the tree. For example:

import matplotlib.pyplot as plt
from sklearn import tree
plt.figure(figsize=(40,20))  
_ = tree.plot_tree(your_model_name, feature_names = X.columns, 
             filled=True, fontsize=6, rounded = True)
plt.show()

If you want to save the picture to a file, add the following line before plt.show():

plt.savefig('filename.png')
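
Putting the pieces together, a self-contained sketch of plotting and saving might look like the following. The iris dataset, the tree depth, the figure size, and the temporary output path are all assumptions chosen for illustration; the Agg backend is selected only so the snippet also runs on a machine without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Assumption: a shallow tree on iris stands in for your own fitted model.
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

fig, ax = plt.subplots(figsize=(12, 6))
# plot_tree draws onto the given axes and returns the node annotations.
annotations = plot_tree(clf,
                        feature_names=iris.feature_names,
                        class_names=list(iris.target_names),
                        filled=True, rounded=True, fontsize=8, ax=ax)

# Save before showing or closing the figure, otherwise the file may be blank.
png_path = os.path.join(tempfile.mkdtemp(), "tree.png")
fig.savefig(png_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```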

If you want to view the rules in text format, there's an answer here. It's more intuitive to read.
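
One way to get such a text view is sklearn.tree.export_text, which is available in scikit-learn 0.21 and newer. Whether the linked answer uses this function is not shown here, so treat the following as a sketch of one possible approach, again with iris as a stand-in dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: a shallow tree on iris stands in for your own fitted model.
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# export_text returns the decision rules as an indented plain-text report.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each level of indentation in the printed report corresponds to one level of depth in the tree, and leaf lines show the predicted class.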
