Factor Loadings using sklearn in Python
Original URL: http://stackoverflow.com/questions/21217710/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use and share them, but you must do so under the same license and attribute them to the original authors (not me): StackOverflow
Factor Loadings using sklearn
Asked by Riyaz
I want the correlations between individual variables and principal components in Python. I am using PCA in sklearn. I don't understand how I can obtain the loading matrix after I have decomposed my data. My code is here:
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
data, y = iris.data, iris.target
pca = PCA(n_components=2)
transformed_data = pca.fit(data).transform(data)
eigenValues = pca.explained_variance_ratio_  # fraction of variance explained per component
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html doesn't mention how this can be achieved.
Accepted answer by Brad Solomon
I think that @RickardSjogren is describing the eigenvectors, while @BigPanda is giving the loadings. There's a big difference; see Loadings vs eigenvectors in PCA: when to use one or another?
I created this PCA class with a loadings method.
Loadings, as given by pca.components_ * np.sqrt(pca.explained_variance_), are more analogous to coefficients in a multiple linear regression. I don't use .T here because in the PCA class linked above, the components are already transposed. numpy.linalg.svd produces u, s, and vt, where vt is the Hermitian transpose, so you first need to get back to v with vt.T.
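For concreteness, here is a minimal sketch of my own (not part of the original answer) showing how numpy.linalg.svd's u, s, and vt line up with sklearn's pca.components_ and explained_variance_, and how the loadings described above are computed; the allclose checks are illustrative assumptions, not claims from the answer.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
Xc = X - X.mean(axis=0)  # PCA operates on centered data

# numpy.linalg.svd returns u, s, vt; vt.T recovers v, whose columns correspond
# (up to an arbitrary sign flip) to the rows of pca.components_.
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
v = vt.T

pca = PCA(n_components=2).fit(X)
print(np.allclose(np.abs(pca.components_), np.abs(v[:, :2].T)))         # True
print(np.allclose(pca.explained_variance_, s[:2] ** 2 / (len(X) - 1)))  # True

# Loadings as described above: components scaled by the square roots of the
# explained variances (the eigenvalues of the covariance matrix).
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)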
There is also one other important detail: the signs (positive/negative) on the components and loadings in sklearn.PCA may differ from packages such as R.
More on that here:
In sklearn.decomposition.PCA, why are components_ negative?.
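As a hedged illustration (mine, not taken from the linked answer), one common way to make the signs deterministic before comparing with another package is to flip any component whose largest-magnitude entry is negative; a component and its negation describe the same direction, so this changes nothing about the variance explained.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

pca = PCA(n_components=2).fit(load_iris().data)
components = pca.components_.copy()

for i, row in enumerate(components):
    # Flip the sign so that the entry with the largest magnitude is positive.
    if row[np.argmax(np.abs(row))] < 0:
        components[i] = -row

loadings = components.T * np.sqrt(pca.explained_variance_)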
Answered by RickardSjogren
Answered by BigPanda
Multiply each component by the square root of its corresponding eigenvalue:
pca.components_.T * np.sqrt(pca.explained_variance_)
This should produce your loading matrix.
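Since the original question asked for the correlations between the individual variables and the principal components, here is a short sketch of my own (assuming the iris data and illustrative variable names) showing that, when the data are standardized first, this loading matrix coincides with exactly those correlations.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
# Standardize with ddof=1 so the scaling matches explained_variance_, which divides by n - 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

pca = PCA(n_components=2)
scores = pca.fit_transform(X_std)
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

# Pearson correlation between each original variable and each component score.
corr = np.array([[np.corrcoef(X_std[:, j], scores[:, k])[0, 1]
                  for k in range(scores.shape[1])]
                 for j in range(X_std.shape[1])])

print(np.allclose(loadings, corr))  # True

If the data are not standardized, dividing each row of the loading matrix by the corresponding variable's standard deviation gives the correlations instead.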

