Python: how do I standardize a matrix?

Note: this page is a side-by-side Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4544292/

Time: 2020-08-18 16:15:24  Source: igfitidea

How do I standardize a matrix?

Tags: python, algorithm, math, numpy

Asked by pnodbnda

Basically, take a matrix and change it so that its mean is equal to 0 and its variance is 1. I'm using NumPy arrays, so it would be best if NumPy can already do this, but I can implement it myself as long as I can find an algorithm.

Edit: never mind, nimrodm has a better implementation.

Accepted answer by ja72

Take each element, subtract the mean, and then divide by the standard deviation.

Shoot me, I don't know Python. In general, the above amounts to:

mu = Average(A)              // mean over all elements of A
sig = StandardDeviation(A)   // standard deviation over all elements of A
for(i=0;i<rows;i++)
{
   for(j=0;j<cols;j++)
   {
       A[i,j] = (A[i,j]-mu)/sig;
   }
}
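
For reference, the pseudocode above could be written in Python roughly as follows. This is only a sketch, not ja72's code: the helper name `standardize_loop` and the sample matrix are made up for illustration, and it assumes NumPy's default population standard deviation (`ddof=0`).

```python
import numpy as np

def standardize_loop(A):
    """Element-by-element version of the pseudocode (hypothetical helper)."""
    A = np.asarray(A, dtype=float)   # make sure we work with floats
    out = np.empty_like(A)
    mu = A.mean()                    # mean over every element
    sig = A.std()                    # population standard deviation (ddof=0)
    rows, cols = A.shape
    for i in range(rows):
        for j in range(cols):
            out[i, j] = (A[i, j] - mu) / sig
    return out

sample = [[1, 2, 6], [3, 10, 20]]    # arbitrary illustrative values
Z = standardize_loop(sample)
print(Z.mean(), Z.std())             # approximately 0 and 1, up to rounding
```

The explicit loops are mainly useful for checking the arithmetic by hand; on real data the vectorized NumPy form is much faster.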

Answer by nimrodm

The following subtracts the mean of A from each element (the new mean is 0), then normalizes the result by the standard deviation.

import numpy as np  # a named import avoids shadowing built-ins such as sum, min and max
A = (A - np.mean(A)) / np.std(A)

The above standardizes the entire matrix as a whole. If A has several dimensions and you want to standardize each column individually, specify the axis:

import numpy as np
A = (A - np.mean(A, axis=0)) / np.std(A, axis=0)  # axis=0: statistics are computed per column
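
One caveat with the per-column form: a constant column has a standard deviation of 0, so the division produces nan for that column. A possible guard is sketched below. The helper name `standardize_columns` and the choice to leave constant columns at 0 are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def standardize_columns(A):
    """Column-wise standardization that maps constant columns to 0 (illustrative helper)."""
    A = np.asarray(A, dtype=float)
    mu = A.mean(axis=0)
    sigma = A.std(axis=0)
    # Replace zero standard deviations by 1 so constant columns become 0 instead of nan
    safe_sigma = np.where(sigma == 0, 1.0, sigma)
    return (A - mu) / safe_sigma

X = np.array([[1.0, 5.0],
              [2.0, 5.0],
              [3.0, 5.0]])           # second column is constant
Z = standardize_columns(X)
print(Z)
```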

Always verify by hand what these one-liners are doing before integrating them into your code. A simple change in orientation or dimension can drastically change (silently) what operations numpy performs on them.
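
A quick way to do that check is to print the mean and standard deviation along each axis after the transformation. The sample matrix below is arbitrary and only serves as an illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0, 6.0],
              [3.0, 10.0, 20.0],
              [5.0, 0.0, 4.0],
              [7.0, 8.0, 2.0]])      # 4 rows, 3 columns (arbitrary sample values)

whole = (A - A.mean()) / A.std()                        # one mean/std for the whole matrix
by_col = (A - A.mean(axis=0)) / A.std(axis=0)           # one mean/std per column
# Row-wise needs keepdims=True so the (4, 1) statistics broadcast against (4, 3)
by_row = (A - A.mean(axis=1, keepdims=True)) / A.std(axis=1, keepdims=True)

print(whole.mean(), whole.std())                        # about 0 and 1 overall
print(by_col.mean(axis=0), by_col.std(axis=0))          # about 0 and 1 in every column
print(by_row.mean(axis=1), by_row.std(axis=1))          # about 0 and 1 in every row
```

Without `keepdims=True`, the row-wise version raises a broadcasting error for this 4x3 matrix, but for a square matrix it would run and silently subtract row statistics from the wrong elements, which is exactly the kind of change the warning above is about.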

Answer by AmanRaj

import numpy as np
import scipy.stats as ss

A = np.array(ss.zscore(A))  # zscore standardizes along axis 0 (each column) by default

Answer by DoesData

from sklearn.preprocessing import StandardScaler

standardized_data = StandardScaler().fit_transform(your_data)

Example:

>>> import numpy as np
>>> from sklearn.preprocessing import StandardScaler

>>> data = np.random.randint(25, size=(4, 4))
>>> data
array([[17, 12,  4, 17],
       [ 1, 16, 19,  1],
       [ 7,  8, 10,  4],
       [22,  4,  2,  8]])

>>> standardized_data = StandardScaler().fit_transform(data)
>>> standardized_data
array([[ 0.63812398,  0.4472136 , -0.718646  ,  1.57786412],
       [-1.30663482,  1.34164079,  1.55076242, -1.07959124],
       [-0.57735027, -0.4472136 ,  0.18911737, -0.58131836],
       [ 1.24586111, -1.34164079, -1.02123379,  0.08304548]])

Works well on large datasets.

Answer by Yuya Takashina

Use sklearn.preprocessing.scale.

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html

Here is an example.

>>> from sklearn import preprocessing
>>> import numpy as np
>>> X_train = np.array([[ 1., -1.,  2.],
...                     [ 2.,  0.,  0.],
...                     [ 0.,  1., -1.]])
>>> X_scaled = preprocessing.scale(X_train)
>>> X_scaled
array([[ 0.  ..., -1.22...,  1.33...],
       [ 1.22...,  0.  ..., -0.26...],
       [-1.22...,  1.22..., -1.06...]])

http://scikit-learn.org/stable/modules/preprocessing.html#standardization-or-mean-removal-and-variance-scaling

Answer by Alexander Drobyshevsky

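import numpy as np

A = np.array([[1,2,6], [3000,1000,2000]]).T  # 3 rows, 2 columns

A_means = np.mean(A, axis=0)               # per-column means
A_centr = A - A_means                      # centered columns
A_norms = np.linalg.norm(A_centr, axis=0)  # per-column L2 norms, i.e. std * sqrt(number of rows)

A_std = A_centr / A_norms                  # zero-mean columns with unit L2 norm (not unit variance)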
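
Note that dividing by the column norm is not quite the same as dividing by the standard deviation: the norm of a centered column equals its population standard deviation times the square root of the number of rows. The sketch below compares the two on the same sample data; the variable names `unit_norm` and `unit_var` are introduced here for illustration only.

```python
import numpy as np

A = np.array([[1, 2, 6], [3000, 1000, 2000]]).T         # same sample data as above
n = A.shape[0]                                          # number of rows

A_centr = A - A.mean(axis=0)
unit_norm = A_centr / np.linalg.norm(A_centr, axis=0)   # columns with L2 norm 1
unit_var = A_centr / A.std(axis=0)                      # columns with standard deviation 1

print(np.linalg.norm(unit_norm, axis=0))                # each entry is about 1
print(unit_norm.std(axis=0))                            # each entry is about 1/sqrt(n)
print(np.allclose(unit_norm * np.sqrt(n), unit_var))    # True
```

So if a variance of exactly 1 is required, as in the question, the norm-scaled result needs an extra factor of sqrt(n), or the division can use `std` directly.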