Python: How to normalize an array in NumPy?
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/21030391/
How to normalize an array in NumPy?
Asked by Donbeo
I would like to have the norm of one NumPy array. More specifically, I am looking for an equivalent version of this function:
def normalize(v):
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm
Is there something like that in sklearn or numpy?
This function works in a situation where v is the 0 vector.
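To make the zero-vector case concrete, here is a small check of the function above. The function body is the one from the question; the sample inputs are made up for illustration.

```python
import numpy as np

def normalize(v):
    # Same function as in the question: divide by the L2 norm unless it is zero.
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm

unit = normalize(np.array([3.0, 4.0]))   # the norm is 5, so this is close to [0.6, 0.8]
zero = normalize(np.zeros(3))            # returned unchanged, no division by zero

print(unit)
print(zero)
```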
Accepted answer by ali_m
If you're using scikit-learn you can use sklearn.preprocessing.normalize:
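import numpy as np
from sklearn.preprocessing import normalize

x = np.random.rand(1000) * 10
norm1 = x / np.linalg.norm(x)
# normalize expects a 2D array, so add a column axis and flatten the result
norm2 = normalize(x[:, np.newaxis], axis=0).ravel()
# compare with a tolerance rather than exact floating-point equality
print(np.allclose(norm1, norm2))
# True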
Answer by Eelco Hoogendoorn
I agree that it would be nice if such a function were part of the included batteries. But it isn't, as far as I know. Here is a version that works for arbitrary axes and gives optimal performance.
import numpy as np

def normalized(a, axis=-1, order=2):
    l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
    l2[l2 == 0] = 1
    return a / np.expand_dims(l2, axis)

A = np.random.randn(3, 3, 3)
print(normalized(A, 0))
print(normalized(A, 1))
print(normalized(A, 2))
print(normalized(np.arange(3)[:, None]))
print(normalized(np.arange(3)))
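A possible variation on the function above, as a sketch: np.linalg.norm accepts keepdims=True, which keeps the reduced axis so the norms broadcast against the input without expand_dims. The name normalized_keepdims is made up here, and this has not been benchmarked against the version above.

```python
import numpy as np

def normalized_keepdims(a, axis=-1, order=2):
    # keepdims=True leaves the reduced axis in place with length 1,
    # so the norms broadcast directly against `a`.
    norms = np.linalg.norm(a, ord=order, axis=axis, keepdims=True)
    # Replace zero norms by 1 so all-zero slices stay zero instead of becoming NaN.
    norms = np.where(norms == 0, 1.0, norms)
    return a / norms

A = np.random.randn(3, 3, 3)
B = normalized_keepdims(A, axis=1)
print(np.linalg.norm(B, axis=1))   # every entry is close to 1
```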
Answer by Eduard Feicho
You can specify ord to get the L1 norm. To avoid division by zero I use eps, but that may not be ideal.
def normalize(v):
    norm = np.linalg.norm(v, ord=1)
    if norm == 0:
        norm = np.finfo(v.dtype).eps
    return v / norm
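One caveat with the eps approach, sketched below: np.finfo is only defined for floating-point dtypes, so an integer array would need converting first. The variant here converts its input to float up front. The name normalize_l1 and the sample values are illustrative.

```python
import numpy as np

def normalize_l1(v):
    # Convert to float so np.finfo is valid for the array's dtype.
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v, ord=1)   # sum of absolute values
    if norm == 0:
        norm = np.finfo(v.dtype).eps
    return v / norm

w = normalize_l1([1, -2, 1])
print(w)
print(np.abs(w).sum())   # the absolute values sum to 1
```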
Answer by Joe
There is also the function unit_vector() to normalize vectors in the popular transformations module by Christoph Gohlke:
import transformations as trafo
import numpy as np

data = np.array([[1.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0],
                 [1.0, 2.0, 3.0]])
print(trafo.unit_vector(data, axis=1))
Answer by Jaden Travnik
If you have multidimensional data and want each dimension (column) normalized to its max or its sum:
def normalize(_d, to_sum=True, copy=True):
    # d is a (n x dimension) np array
    d = _d if not copy else np.copy(_d)
    d -= np.min(d, axis=0)
    d /= (np.sum(d, axis=0) if to_sum else np.ptp(d, axis=0))
    return d
This uses NumPy's peak-to-peak function, np.ptp.
a = np.random.random((5, 3))
b = normalize(a, copy=False)
b.sum(axis=0) # array([1., 1., 1.]), each column sums to 1
c = normalize(a, to_sum=False, copy=False)
c.max(axis=0) # array([1., 1., 1.]), the max of each column is 1
Answer by mrk
This might also work for you:
import numpy as np
normalized_v = v / np.sqrt(np.sum(v**2))
but it fails when v has length (norm) 0.
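To see that failure and one way around it, here is a sketch. It relies on np.divide with its out and where arguments; the helper name safe_normalize is made up for this example.

```python
import numpy as np

def safe_normalize(v):
    v = np.asarray(v, dtype=float)
    norm = np.sqrt(np.sum(v**2))
    # Divide only when the norm is nonzero; otherwise keep the zeros from `out`.
    return np.divide(v, norm, out=np.zeros_like(v), where=norm != 0)

# The one-liner on a zero vector computes 0/0, which yields NaN (and a warning).
with np.errstate(invalid="ignore"):
    naive = np.zeros(3) / np.sqrt(np.sum(np.zeros(3)**2))

print(naive)
print(safe_normalize(np.zeros(3)))
print(safe_normalize([3.0, 4.0]))
```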
Answer by max0r
If you want to normalize n dimensional feature vectors stored in a 3D tensor, you could also use PyTorch:
import numpy as np
from torch import FloatTensor
from torch.nn.functional import normalize
vecs = np.random.rand(3, 16, 16, 16)
norm_vecs = normalize(FloatTensor(vecs), dim=0, eps=1e-16).numpy()
Answer by paulmelnikow
If you're working with 3D vectors, you can do this concisely using the toolbelt vg. It's a light layer on top of numpy and it supports single values and stacked vectors.
import numpy as np
import vg

x = np.random.rand(1000) * 10
norm1 = x / np.linalg.norm(x)
norm2 = vg.normalize(x)
# compare with a tolerance rather than exact floating-point equality
print(np.allclose(norm1, norm2))
# True
I created the library at my last startup, where it was motivated by uses like this: simple ideas which are way too verbose in NumPy.
Answer by WY Hsu
You mentioned scikit-learn, so I want to share another solution.
scikit-learn MinMaxScaler
In scikit-learn, there is an API called MinMaxScaler that lets you customize the value range as you like.
It also deals with NaN issues for us.
NaNs are treated as missing values: disregarded in fit, and maintained in transform. ... see reference [1]
Code sample
The code is simple; just type:
# Let's say X_train is your input dataframe
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# create the MinMaxScaler object
min_max_scaler = MinMaxScaler()
# feed in a numpy array
X_train_norm = min_max_scaler.fit_transform(X_train.values)
# wrap it up if you need a dataframe
df = pd.DataFrame(X_train_norm)
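For intuition, the per-column scaling that MinMaxScaler applies can be sketched in plain NumPy. This is only an illustration of the min-max formula, not scikit-learn's implementation: the function name min_max_scale is made up, and np.nanmin/np.nanmax are used here as an assumption to mimic ignoring NaNs when the range is computed.

```python
import numpy as np

def min_max_scale(X, feature_range=(0.0, 1.0)):
    # X is expected to be a 2D array (samples x features).
    X = np.asarray(X, dtype=float)
    lo, hi = feature_range
    # Per-column minimum and maximum, ignoring NaN entries.
    col_min = np.nanmin(X, axis=0)
    col_max = np.nanmax(X, axis=0)
    span = col_max - col_min
    span[span == 0] = 1.0              # constant columns: avoid dividing by zero
    # NaN entries stay NaN because arithmetic on NaN gives NaN.
    return (X - col_min) / span * (hi - lo) + lo

X = np.array([[1.0, 10.0],
              [2.0, np.nan],
              [3.0, 30.0]])
S = min_max_scale(X)
print(S)
```

Passing a different feature_range, for example (-1.0, 1.0), stretches the same result onto that interval.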
Reference
Answer by sergio verduzco
If you don't need utmost precision, your function can be reduced to:
v_norm = v / (np.linalg.norm(v) + 1e-16)
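What the precision trade-off looks like in practice, as a small sketch: the added constant is negligible for ordinary vectors, but it dominates when the norm itself is near 1e-16. The function name normalize_eps and the sample vectors are chosen for illustration.

```python
import numpy as np

def normalize_eps(v, eps=1e-16):
    # eps keeps the denominator nonzero, at the cost of a slightly short result.
    return v / (np.linalg.norm(v) + eps)

ordinary = normalize_eps(np.array([3.0, 4.0]))
tiny = normalize_eps(np.array([1e-16, 0.0]))
zero = normalize_eps(np.zeros(2))

print(np.linalg.norm(ordinary))   # very close to 1
print(np.linalg.norm(tiny))       # about 0.5, not 1
print(zero)                       # zeros, with no division by zero
```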

