Python: Getting model attributes from a scikit-learn pipeline

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/28822756/

Time: 2020-08-19 03:48:07  Source: igfitidea

Getting model attributes from scikit-learn pipeline

python scikit-learn neuraxle

Asked by lmart999

I typically get PCA loadings like this:

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_t = pca.fit(X).transform(X)
loadings = pca.components_

If I run PCA using a scikit-learn pipeline ...

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_t = pipeline.fit_transform(X)

... is it possible to get the loadings?

Simply trying loadings = pipeline.components_ fails:

AttributeError: 'Pipeline' object has no attribute 'components_'

Thanks!

(I'm also interested in extracting attributes like coef_ from learning pipelines.)

Accepted answer by Andreas Mueller

Did you look at the documentation at http://scikit-learn.org/dev/modules/pipeline.html ? I feel it is pretty clear.

Update: as of scikit-learn 0.21, you can just use square brackets:

pipeline['pca']

or indices

pipeline[1]
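
As a minimal sketch of the square-bracket access above, assuming scikit-learn 0.21 or later: the random matrix X below is only a hypothetical stand-in for the asker's data, and the step names are the ones from the question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data (an assumption): 100 samples, 5 features.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_t = pipeline.fit_transform(X)

# Index by step name or by position; both return the same fitted PCA object.
pca_by_name = pipeline['pca']
pca_by_index = pipeline[1]

# components_ has shape (n_components, n_features).
loadings = pca_by_name.components_
print(loadings.shape)
```

Either form returns the fitted step itself, so any fitted attribute (not only components_) can be read from it.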

There are two ways to get to the steps in a pipeline, either using indices or using the string names you gave:

pipeline.named_steps['pca']
pipeline.steps[1][1]

This will give you the PCA object, from which you can get components_. With named_steps you can also use attribute access with a ".", which allows autocompletion:

pipeline.named_steps.pca.components_
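
To tie this back to the question's side note about coef_, here is a hedged sketch showing the same access patterns on a pipeline that ends in an estimator. The data, the labels, the 'clf' step name, and the choice of LogisticRegression are all assumptions made for illustration and are not part of the original question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data (an assumption): 100 samples, 5 features, binary labels.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
    ('clf', LogisticRegression()),  # 'clf' is an illustrative step name
])
pipeline.fit(X, y)

# Three equivalent ways to reach the fitted PCA step.
pca_a = pipeline.named_steps['pca']
pca_b = pipeline.named_steps.pca
pca_c = pipeline.steps[1][1]
loadings = pca_a.components_  # shape: (n_components, n_features)

# The same pattern reaches the final estimator's coefficients.
coef = pipeline.named_steps['clf'].coef_  # shape: (1, 2) for binary labels
```

Note that the coefficients here refer to the two PCA components the classifier was fitted on, not to the original five features.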

Answer by Guillaume Chevalier

Using Neuraxle

Working with pipelines is simpler using Neuraxle. For instance, you can do this:

from neuraxle.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Create and fit the pipeline: 
pipeline = Pipeline([
    StandardScaler(),
    PCA(n_components=2)
])
pipeline, X_t = pipeline.fit_transform(X)

# Get the components: 
pca = pipeline[-1]
components = pca.components_

You can access your PCA in any of these three ways:

  • pipeline['PCA']
  • pipeline[-1]
  • pipeline[1]

Neuraxle is a pipelining library built on top of scikit-learn to take pipelines to the next level. It makes it easy to manage spaces of hyperparameter distributions, nested pipelines, saving and reloading, REST API serving, and more. It is also designed to work with Deep Learning algorithms and to allow parallel computing.

Nested pipelines:

You could have pipelines within pipelines as below.

# Create and fit the pipeline: 
pipeline = Pipeline([
    StandardScaler(),
    Identity(),
    Pipeline([
        Identity(),  # Note: an Identity step is a step that does nothing. 
        Identity(),  # We use it here for demonstration purposes. 
        Identity(),
        Pipeline([
            Identity(),
            PCA(n_components=2)
        ])
    ])
])
pipeline, X_t = pipeline.fit_transform(X)

Then you'd need to do this:

# Get the components: 
pca = pipeline["Pipeline"]["Pipeline"][-1]
components = pca.components_