Python sklearn: how to get coefficients of polynomial features

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/31290976/

Time: 2020-08-19 09:46:19  Source: igfitidea

sklearn: how to get coefficients of polynomial features

python scikit-learn

Asked by Moritz

I know it is possible to obtain the polynomial features as numbers by using polynomial_features.transform(X). According to the manual, for a degree of two the features are: [1, a, b, a^2, ab, b^2]. But how do I obtain a description of the features for higher orders? .get_params() does not show any list of features.

Accepted answer by prez

By the way, there is now a more appropriate function for this: PolynomialFeatures.get_feature_names.

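from sklearn.preprocessing import PolynomialFeatures
import pandas as pd
import numpy as np

# Two random integer input columns, 'x' and 'y'
data = pd.DataFrame.from_dict({
    'x': np.random.randint(low=1, high=10, size=5),
    'y': np.random.randint(low=-1, high=1, size=5),
})

# Fit first so the transformer knows how many input features there are
p = PolynomialFeatures(degree=2).fit(data)
# Newer scikit-learn releases name this method get_feature_names_out
print(p.get_feature_names(data.columns))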

This will output as follows:

['1', 'x', 'y', 'x^2', 'x y', 'y^2']

N.B. You have to fit your PolynomialFeatures object before you can use get_feature_names(), because the feature names depend on the number of input features, which is only learned during fit.
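
The method name used above depends on the scikit-learn version: get_feature_names was deprecated in release 1.0 and removed in 1.2 in favor of get_feature_names_out. The following is a minimal sketch, not part of the original answer, that works with either name; the small fixed DataFrame and the hasattr fallback are assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Small fixed input (illustrative values, not from the original answer)
data = pd.DataFrame({"x": [8, 9, 1], "y": [-1, -1, 0]})

# The transformer must be fitted before feature names can be requested
p = PolynomialFeatures(degree=2).fit(data)

# Prefer the newer method; fall back to the older one if it is missing
if hasattr(p, "get_feature_names_out"):
    names = list(p.get_feature_names_out(data.columns))
else:
    names = list(p.get_feature_names(data.columns))

print(names)

# The same names can label the transformed columns
features = pd.DataFrame(p.transform(data), columns=names, index=data.index)
print(features)
```

With the newer method the names come back as a NumPy array rather than a list, which is why the sketch wraps the result in list().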

If you are a pandas lover (as I am), you can easily build a DataFrame with all the new features like this:

features = pd.DataFrame(p.transform(data), columns=p.get_feature_names(data.columns))
print(features)

The result will look like this:

     1    x    y   x^2  x y  y^2
0  1.0  8.0 -1.0  64.0 -8.0  1.0
1  1.0  9.0 -1.0  81.0 -9.0  1.0
2  1.0  1.0  0.0   1.0  0.0  0.0
3  1.0  6.0  0.0  36.0  0.0  0.0
4  1.0  5.0 -1.0  25.0 -5.0  1.0

Answer by omerbp

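import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features (a=2, b=3); scikit-learn expects a 2-D array
X = np.array([[2, 3]])

poly = PolynomialFeatures(3)
Y = poly.fit_transform(X)
print(Y)
# prints a single row with the values 1, 2, 3, 4, 6, 9, 8, 12, 18, 27
print(poly.powers_)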

The last statement, print(poly.powers_), will print:

[[0 0]
 [1 0]
 [0 1]
 [2 0]
 [1 1]
 [0 2]
 [3 0]
 [2 1]
 [1 2]
 [0 3]]

So if the i'th row of powers_ is (x, y), the i'th output feature is (a**x)*(b**y), where a and b are the two input values. For instance, in the code example the row [2 1] corresponds to (2**2)*(3**1)=12.
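
The exponent rule above can be checked by rebuilding every output feature from powers_ by hand. This is a minimal sketch, not part of the original answer; the helper describe and the variable names a and b are hypothetical and exist only to turn each exponent row into a readable label.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two input features: a=2, b=3
X = np.array([[2, 3]])
poly = PolynomialFeatures(3)
Y = poly.fit_transform(X)

# Each row of powers_ holds the exponents (x, y) of one output feature,
# so the feature value is (a**x) * (b**y); broadcasting does all rows at once
manual = np.prod(X[0] ** poly.powers_, axis=1)
assert np.allclose(manual, Y[0])

def describe(exponents, names=("a", "b")):
    """Hypothetical helper: turn one exponent row into a label like 'a^2 b'."""
    parts = []
    for name, e in zip(names, exponents):
        if e == 1:
            parts.append(name)
        elif e > 1:
            parts.append(f"{name}^{e}")
    return " ".join(parts) or "1"  # all-zero exponents give the bias term

labels = [describe(row) for row in poly.powers_]
for label, value in zip(labels, manual):
    print(label, "=", value)
```

Because powers_ is available for any degree and any number of input features, the same approach gives a description of the higher-order features the question asks about.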
