Factor Loadings using sklearn in Python
Original URL: http://stackoverflow.com/questions/21217710/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use and share them, but you must do so under the same license and attribute them to the original authors (not me): StackOverflow
Factor Loadings using sklearn
Asked by Riyaz
I want the correlations between individual variables and principal components in Python. I am using PCA in sklearn. I don't understand how I can obtain the loading matrix after I have decomposed my data. My code is here:
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
data, y = iris.data, iris.target
pca = PCA(n_components=2)
transformed_data = pca.fit(data).transform(data)
eigenValues = pca.explained_variance_ratio_  # fraction of variance explained per component
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html doesn't mention how this can be achieved.
Accepted answer by Brad Solomon
I think that @RickardSjogren is describing the eigenvectors, while @BigPanda is giving the loadings. There's a big difference; see Loadings vs eigenvectors in PCA: when to use one or another?
I created this PCA class with a loadings method.
Loadings, as given by pca.components_ * np.sqrt(pca.explained_variance_), are more analogous to coefficients in a multiple linear regression. I don't use .T here because in the PCA class linked above, the components are already transposed. numpy.linalg.svd produces u, s, and vt, where vt is the Hermitian transpose, so you first need to get back to v with vt.T.
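For concreteness, here is a minimal sketch of my own (not part of the original answer) showing how numpy.linalg.svd's u, s, and vt line up with sklearn's pca.components_ and explained_variance_, and how the loadings described above are computed; the allclose checks are illustrative assumptions, not claims from the answer.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
Xc = X - X.mean(axis=0)  # PCA operates on centered data

# numpy.linalg.svd returns u, s, vt; vt.T recovers v, whose columns correspond
# (up to an arbitrary sign flip) to the rows of pca.components_.
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
v = vt.T

pca = PCA(n_components=2).fit(X)
print(np.allclose(np.abs(pca.components_), np.abs(v[:, :2].T)))         # True
print(np.allclose(pca.explained_variance_, s[:2] ** 2 / (len(X) - 1)))  # True

# Loadings as described above: components scaled by the square roots of the
# explained variances (the eigenvalues of the covariance matrix).
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)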
There is also one other important detail: the signs (positive/negative) on the components and loadings in sklearn.PCA may differ from packages such as R.
More on that here:
In sklearn.decomposition.PCA, why are components_ negative?.
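As a hedged illustration (mine, not taken from the linked answer), one common way to make the signs deterministic before comparing with another package is to flip any component whose largest-magnitude entry is negative; a component and its negation describe the same direction, so this changes nothing about the variance explained.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

pca = PCA(n_components=2).fit(load_iris().data)
components = pca.components_.copy()

for i, row in enumerate(components):
    # Flip the sign so that the entry with the largest magnitude is positive.
    if row[np.argmax(np.abs(row))] < 0:
        components[i] = -row

loadings = components.T * np.sqrt(pca.explained_variance_)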
Answered by RickardSjogren
Answered by BigPanda
Multiply each component by the square root of its corresponding eigenvalue:
pca.components_.T * np.sqrt(pca.explained_variance_)
This should produce your loading matrix.
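Since the original question asked for the correlations between the individual variables and the principal components, here is a short sketch of my own (assuming the iris data and illustrative variable names) showing that, when the data are standardized first, this loading matrix coincides with exactly those correlations.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
# Standardize with ddof=1 so the scaling matches explained_variance_, which divides by n - 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

pca = PCA(n_components=2)
scores = pca.fit_transform(X_std)
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

# Pearson correlation between each original variable and each component score.
corr = np.array([[np.corrcoef(X_std[:, j], scores[:, k])[0, 1]
                  for k in range(scores.shape[1])]
                 for j in range(X_std.shape[1])])

print(np.allclose(loadings, corr))  # True

If the data are not standardized, dividing each row of the loading matrix by the corresponding variable's standard deviation gives the correlations instead.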

