Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/40155128/

Date: 2020-08-19 23:13:07  Source: igfitidea

Plot trees for a Random Forest in Python with Scikit-Learn

Tags: python, tree, scikit-learn, random-forest, pydot

Asked by Zoya

I want to plot a decision tree from a random forest, so I wrote the following code:

clf = RandomForestClassifier(n_estimators=100)
import pydotplus
import six
from sklearn import tree
dotfile = six.StringIO()
i_tree = 0
for tree_in_forest in clf.estimators_:
    if (i_tree <1):        
        tree.export_graphviz(tree_in_forest, out_file=dotfile)
        pydotplus.graph_from_dot_data(dotfile.getvalue()).write_png('dtree'+ str(i_tree) +'.png')
        i_tree = i_tree + 1

But it doesn't generate anything. Do you have any idea how to plot a decision tree from a random forest?

Thank you,

Answered by user6903745

Assuming your random forest model is already fitted, you should first import the export_graphviz function:

from sklearn.tree import export_graphviz

In your for loop, you could do the following to generate the dot file:

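export_graphviz(tree_in_forest,
                out_file='tree.dot',  # write to a file; the default (None) returns the DOT source as a string
                feature_names=list(X.columns),
                filled=True,
                rounded=True)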

The next line generates a PNG file:

import os  # the Graphviz `dot` executable must be on the PATH
os.system('dot -Tpng tree.dot -o tree.png')
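
Putting the two steps of this answer together, a minimal sketch might look like the following. The iris dataset, the variable names, and the file names `tree.dot` and `tree.png` are assumptions for illustration only. The conversion step is skipped when the Graphviz `dot` executable is not found on the PATH.

```python
import shutil
import subprocess

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

# Illustrative data (an assumption); any fitted forest works the same way
iris = load_iris()
forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(iris.data, iris.target)

# Write the first tree of the forest to a DOT file
export_graphviz(forest.estimators_[0],
                out_file="tree.dot",
                feature_names=list(iris.feature_names),
                filled=True,
                rounded=True)

# Convert DOT to PNG only if the Graphviz `dot` executable is installed
dot_path = shutil.which("dot")
if dot_path is not None:
    subprocess.run([dot_path, "-Tpng", "tree.dot", "-o", "tree.png"], check=True)
```

Using `subprocess.run` with `check=True` instead of `os.system` is a design choice here: it raises an error if the conversion fails rather than failing silently.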

Answered by Michael James Kali Galarnyk

After you fit a random forest model in scikit-learn, you can visualize individual decision trees from a random forest. The code below first fits a random forest model.

import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn import tree
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the Breast Cancer Dataset
data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

# Arrange Data into Features Matrix and Target Vector
X = df.loc[:, df.columns != 'target']
y = df.loc[:, 'target'].values

# Split the data into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(X, y, random_state=0)

# Random Forests in `scikit-learn` (with N = 100)
rf = RandomForestClassifier(n_estimators=100,
                            random_state=0)
rf.fit(X_train, Y_train)

You can now visualize individual trees. The code below visualizes the first decision tree.

fn = list(data.feature_names)  # plain lists are accepted across scikit-learn versions
cn = list(data.target_names)
fig, axes = plt.subplots(nrows = 1,ncols = 1,figsize = (4,4), dpi=800)
tree.plot_tree(rf.estimators_[0],
               feature_names = fn, 
               class_names=cn,
               filled = True);
fig.savefig('rf_individualtree.png')

The image below is what is saved.

[Image: rf_individualtree.png]
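
A fully grown tree can be hard to read in a single figure. As a hedged sketch, `plot_tree` accepts a `max_depth` argument that limits how many levels are drawn. The depth of 2, the figure size, the output file name, and the `Agg` backend (chosen so the code runs without a display) are assumptions, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Same dataset and forest settings as in the answer above
data = load_breast_cancer()
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(data.data, data.target)

# Draw only the top two levels of the first tree
fig, ax = plt.subplots(figsize=(8, 4), dpi=150)
annotations = tree.plot_tree(rf.estimators_[0],
                             max_depth=2,
                             feature_names=list(data.feature_names),
                             class_names=list(data.target_names),
                             filled=True,
                             ax=ax)
fig.savefig("rf_individualtree_depth2.png")
plt.close(fig)
```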

Because this question asked for trees, you can visualize all the estimators (decision trees) from a random forest if you like. The code below visualizes the first 5 from the random forest model fit above.

# This may not be the best way to view each estimator, as each subplot is small
fn = list(data.feature_names)
cn = list(data.target_names)
fig, axes = plt.subplots(nrows=1, ncols=5, figsize=(10, 2), dpi=900)
for index in range(0, 5):
    tree.plot_tree(rf.estimators_[index],
                   feature_names=fn,
                   class_names=cn,
                   filled=True,
                   ax=axes[index])

    axes[index].set_title('Estimator: ' + str(index), fontsize=11)
fig.savefig('rf_5trees.png')

The image below is what is saved.

[Image: rf_5trees.png]

The code was adapted from this post.
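
Since five trees side by side are small, one possible alternative is to save each tree to its own file. The sketch below assumes the same dataset and forest settings as above; the file name pattern `rf_tree_<index>.png`, the figure size, and the `Agg` backend are hypothetical choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(data.data, data.target)

fn = list(data.feature_names)
cn = list(data.target_names)

saved_files = []
# Save the first 5 trees, one figure per tree
for index, estimator in enumerate(rf.estimators_[:5]):
    fig, ax = plt.subplots(figsize=(6, 6), dpi=150)
    tree.plot_tree(estimator,
                   feature_names=fn,
                   class_names=cn,
                   filled=True,
                   ax=ax)
    ax.set_title('Estimator: ' + str(index))
    filename = 'rf_tree_' + str(index) + '.png'
    fig.savefig(filename)
    plt.close(fig)  # free the figure's memory before drawing the next one
    saved_files.append(filename)
```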

Answered by Samiran Bera

You can view each tree like this:

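import io
import pydotplus
from IPython.display import Image
from sklearn import tree

# FT_cls_gini is assumed to be an already fitted random forest
tree_index = 3  # index of the tree to display
dotfile = io.StringIO()
tree.export_graphviz(FT_cls_gini.estimators_[tree_index], out_file=dotfile)
graph = pydotplus.graph_from_dot_data(dotfile.getvalue())
Image(graph.create_png())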

Answered by Mirodil

You can draw a single tree:

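import graphviz  # Python graphviz package, needed to render the DOT source
from sklearn.tree import export_graphviz
from IPython import display
from sklearn.ensemble import RandomForestRegressor

m = RandomForestRegressor(n_estimators=1, max_depth=3, bootstrap=False, n_jobs=-1)
m.fit(X_train, y_train)

# export_graphviz expects a single decision tree, not the whole forest
str_tree = export_graphviz(m.estimators_[0],
   out_file=None, 
   feature_names=list(X_train.columns), # column names
   filled=True,        
   special_characters=True, 
   rotate=True, 
   precision=2)  # precision must be an integer number of digits

display.display(graphviz.Source(str_tree))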
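
As a self-contained sketch of the same idea, the snippet below fits a one-tree forest and exports that tree as DOT source. The diabetes dataset stands in for `X_train` and `y_train` (an assumption), and `precision=2` is an illustrative value. With `out_file=None`, `export_graphviz` returns the DOT source as a string, which can then be passed to a renderer.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_graphviz

# Illustrative regression data (an assumption)
diabetes = load_diabetes()

# A forest with one shallow tree, trained on the full sample (no bootstrap)
m = RandomForestRegressor(n_estimators=1, max_depth=3, bootstrap=False, random_state=0)
m.fit(diabetes.data, diabetes.target)

# The forest stores its trees in estimators_; export the only one
single_tree = m.estimators_[0]
dot_source = export_graphviz(single_tree,
                             out_file=None,
                             feature_names=list(diabetes.feature_names),
                             filled=True,
                             rotate=True,
                             precision=2)

# Print the first line of the DOT source as a quick sanity check
print(dot_source.splitlines()[0])
```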