Python: How to normalize an array in NumPy?

Question

Asked by Donbeo

I would like to normalize a NumPy array to unit norm. More specifically, I am looking for an equivalent of this function:

import numpy as np

def normalize(v):
    # Scale v to unit L2 norm; return v unchanged if its norm is zero
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm

Is there something like that in sklearn or numpy?

This function also works when v is the zero vector.

Answer 1

Accepted answer by ali_m

If you're using scikit-learn you can use sklearn.preprocessing.normalize:

import numpy as np
from sklearn.preprocessing import normalize

x = np.random.rand(1000)*10
norm1 = x / np.linalg.norm(x)
norm2 = normalize(x[:,np.newaxis], axis=0).ravel()
print(np.allclose(norm1, norm2))
# True

Answer 2

Answer by Eelco Hoogendoorn

I agree it would be nice if such a function were part of the included batteries. But it isn't, as far as I know. Here is a version that works along arbitrary axes and is written with performance in mind.

import numpy as np

def normalized(a, axis=-1, order=2):
    l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
    l2[l2==0] = 1
    return a / np.expand_dims(l2, axis)

A = np.random.randn(3,3,3)
print(normalized(A,0))
print(normalized(A,1))
print(normalized(A,2))

print(normalized(np.arange(3)[:,None]))
print(normalized(np.arange(3)))
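
As a variation on the function above, np.linalg.norm accepts keepdims=True, which keeps the reduced axis so the norms broadcast against the input without np.expand_dims. The following is only a minimal sketch under that assumption; the name normalized_keepdims is hypothetical and not part of NumPy.

```python
import numpy as np

def normalized_keepdims(a, axis=-1, order=2):
    # Hypothetical helper: compute norms along `axis`, keeping that axis with size 1
    norms = np.linalg.norm(a, ord=order, axis=axis, keepdims=True)
    # Replace zero norms with 1 so zero vectors are returned unchanged
    norms = np.where(norms == 0, 1, norms)
    return a / norms

rows = np.array([[3.0, 4.0],
                 [0.0, 0.0]])
unit_rows = normalized_keepdims(rows)
```

Here the first row is scaled to unit length and the all-zero row is left as zeros.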

Answer 3

Answer by Eduard Feicho

You can specify ord to get the L1 norm. To avoid division by zero I use eps, but that may not be ideal.

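import numpy as np

def normalize(v):
    # Scale v by its L1 norm (sum of absolute values)
    norm = np.linalg.norm(v, ord=1)
    if norm == 0:
        # Fall back to machine epsilon; assumes v has a floating-point dtype
        norm = np.finfo(v.dtype).eps
    return v / norm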
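
To illustrate what L1 normalization means: the L1 norm is the sum of absolute values, so after dividing by it the absolute values of the entries sum to 1. The sketch below uses a hypothetical name, l1_normalize, and simply returns a zero vector unchanged; dividing zeros by eps would also give zeros, so for that input the two approaches should agree.

```python
import numpy as np

def l1_normalize(v):
    # Hypothetical helper: convert to float so integer input also works
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v, ord=1)  # sum of absolute values
    if norm == 0:
        return v  # zero vector: nothing to scale
    return v / norm

weights = l1_normalize([1.0, -2.0, 1.0])
```

For this input the L1 norm is 4, so the entries are divided by 4.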

Answer 4

Answer by Joe

There is also the function unit_vector() to normalize vectors in the popular transformations module by Christoph Gohlke:

import transformations as trafo
import numpy as np

data = np.array([[1.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0],
                 [1.0, 2.0, 3.0]])

print(trafo.unit_vector(data, axis=1))

Answer 5

Answer by Jaden Travnik

If you have multidimensional data and want each axis normalized to its max or its sum:

def normalize(_d, to_sum=True, copy=True):
    # _d is an (n x dimension) float NumPy array; copy=False modifies it in place
    d = _d if not copy else np.copy(_d)
    d -= np.min(d, axis=0)
    d /= (np.sum(d, axis=0) if to_sum else np.ptp(d, axis=0))
    return d

This uses NumPy's peak-to-peak function, np.ptp.

a = np.random.random((5, 3))

b = normalize(a, copy=False)
b.sum(axis=0) # array([1., 1., 1.]), each column sums to 1

c = normalize(a, to_sum=False, copy=False)
c.max(axis=0) # array([1., 1., 1.]), the max of each column is 1

Answer 6

Answer by mrk

This might also work for you:

import numpy as np
normalized_v = v / np.sqrt(np.sum(v**2))

but it fails when v has length (norm) 0, because it then divides by zero.

Answer 7

Answer by max0r

If you want to normalize n-dimensional feature vectors stored in a 3D tensor, you could also use PyTorch:

import numpy as np
from torch import FloatTensor
from torch.nn.functional import normalize

vecs = np.random.rand(3, 16, 16, 16)
norm_vecs = normalize(FloatTensor(vecs), dim=0, eps=1e-16).numpy()
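
For readers without PyTorch, the same idea can be approximated in plain NumPy. torch.nn.functional.normalize is documented as dividing each vector by max(norm, eps); the sketch below follows that formula. The name normalize_clamped and the default eps are assumptions made for illustration, not part of either library.

```python
import numpy as np

def normalize_clamped(x, axis=0, eps=1e-12):
    # Hypothetical helper: L2 norm along `axis`, kept as a size-1 axis for broadcasting
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    # Clamp norms from below so zero vectors divide by eps instead of 0
    return x / np.maximum(norms, eps)

vecs = np.random.rand(3, 16, 16, 16)
norm_vecs = normalize_clamped(vecs, axis=0)
```

Each length-3 vector along axis 0 ends up with unit norm, and an all-zero vector stays all zeros.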

Answer 8

Answer by paulmelnikow

If you're working with 3D vectors, you can do this concisely using the toolbelt vg. It's a light layer on top of NumPy and it supports single values and stacked vectors.

import numpy as np
import vg

x = np.random.rand(1000)*10
norm1 = x / np.linalg.norm(x)
norm2 = vg.normalize(x)
print(np.allclose(norm1, norm2))
# True

I created the library at my last startup, where it was motivated by uses like this: simple ideas which are way too verbose in NumPy.

Answer 9

Answer by WY Hsu

You mentioned scikit-learn, so I want to share another solution.

scikit-learn `MinMaxScaler`

In scikit-learn, there is an API called MinMaxScaler that lets you customize the value range as you like.

It also deals with NaN issues for us.

NaNs are treated as missing values: disregarded in fit, and maintained in transform. ... see reference [1]

Code sample

The code is simple:

# Let's say X_train is your input dataframe
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# create a MinMaxScaler object
min_max_scaler = MinMaxScaler()
# feed in a numpy array
X_train_norm = min_max_scaler.fit_transform(X_train.values)
# wrap it up if you need a dataframe
df = pd.DataFrame(X_train_norm)
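
To make the scaling itself concrete, here is a NumPy-only sketch of per-column min-max scaling that ignores NaNs when computing the range and leaves them in the output, which is roughly the behavior described above. The function name min_max_scale and its feature_range argument are illustrative assumptions; this is not the scikit-learn implementation.

```python
import numpy as np

def min_max_scale(X, feature_range=(0.0, 1.0)):
    # Hypothetical helper: scale each column of X into feature_range
    X = np.asarray(X, dtype=float)
    lo, hi = feature_range
    col_min = np.nanmin(X, axis=0)  # NaNs are ignored when finding the range
    col_max = np.nanmax(X, axis=0)
    span = col_max - col_min
    span = np.where(span == 0, 1.0, span)  # avoid dividing by zero for constant columns
    return (X - col_min) / span * (hi - lo) + lo

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, np.nan]])
X_scaled = min_max_scale(X)
```

The NaN in the second column stays NaN, while the remaining entries of each column are mapped onto the interval from 0 to 1.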

Reference

[1] sklearn.preprocessing.MinMaxScaler

Answer 10

Answer by sergio verduzco

If you don't need utmost precision, your function can be reduced to:

v_norm = v / (np.linalg.norm(v) + 1e-16)
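
A quick sketch of what the added constant does in practice, assuming float64 input; the helper name safe_unit is hypothetical. For ordinary vectors the 1e-16 term is below floating-point resolution and has no visible effect, a zero vector maps to zeros, and only vectors whose norm is comparable to 1e-16 lose accuracy.

```python
import numpy as np

def safe_unit(v, eps=1e-16):
    # Hypothetical wrapper around the one-liner above
    return v / (np.linalg.norm(v) + eps)

ordinary = safe_unit(np.array([3.0, 4.0]))   # norm 5, eps is negligible
zero = safe_unit(np.zeros(3))                # 0 / eps gives zeros, not nan
tiny = safe_unit(np.array([1e-16, 0.0]))     # norm equals eps, result has norm about 0.5
```

The last case is where the loss of precision shows up: the result is noticeably shorter than a unit vector.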
