Python: normalization to bring values into the range [0,1]

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18380419/

Time: 2020-08-19 10:36:24  Source: igfitidea

Normalization to bring in the range of [0,1]

python

Asked by pypro

I have a huge data set from which I derive two sets of data points, which I then have to plot and compare. The two plots differ in their range, so I want both of them to be in the range [0,1]. With the following code and one specific data set, the plot is a constant line at 1, although this normalization works well for other sets:

plt.plot(range(len(rvalue)),np.array(rvalue)/(max(rvalue)))

and for this code:

oldrange = max(rvalue)-min(rvalue) #NORMALIZING
newmin=0
newrange = 1 + 0.9999999999 - newmin
normal = map(lambda x, r=float(rvalue[-1] - rvalue[0]): ((x - rvalue[0]) / r)*1 - 0, rvalue)
plt.plot(range(len(rvalue)),normal)

I get the error:

ZeroDivisionError: float division by zero

for all the data sets. I am unable to figure out how to get both the plots in one range for comparison.

Answered by Brionius

I tried to simplify things a little. Try this:

oldmin = min(rvalue)
oldmax = max(rvalue)
oldrange = oldmax - oldmin
newmin = 0.
newmax = 1.
newrange = newmax - newmin
if oldrange == 0:            # Deal with the case where rvalue is constant:
    if oldmin < newmin:      # If rvalue < newmin, set all rvalue values to newmin
        newval = newmin
    elif oldmin > newmax:    # If rvalue > newmax, set all rvalue values to newmax
        newval = newmax
    else:                    # If newmin <= rvalue <= newmax, keep rvalue the same
        newval = oldmin
    normal = [newval for v in rvalue]
else:
    scale = newrange / oldrange
    normal = [(v - oldmin) * scale + newmin for v in rvalue]

plt.plot(range(len(rvalue)),normal)

The only reason I can see for the ZeroDivisionError is if the data in rvalue were constant (all values are the same). Is that the case?
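As a side note, the lambda in the question divides by rvalue[-1] - rvalue[0] rather than by max(rvalue) - min(rvalue), so its divisor is also zero whenever the first and last values happen to be equal. The following is a minimal sketch of that difference; the function names and the sample list are hypothetical, and mapping constant data to 0.0 is an assumption rather than something the answer prescribes.

```python
def normalize_first_last(values):
    """Scaling as written in the question: divides by (last - first)."""
    r = float(values[-1] - values[0])
    return [(x - values[0]) / r for x in values]


def normalize_min_max(values):
    """Min-max scaling: the divisor (max - min) is zero only for constant data."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: map constant data to 0.0 instead of dividing by zero.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical data: not constant, but the first and last values are equal.
sample = [3, 7, 5, 3]

try:
    normalize_first_last(sample)
    first_last_failed = False
except ZeroDivisionError:
    first_last_failed = True

scaled = normalize_min_max(sample)
print(first_last_failed, scaled)
```

If the data really is constant, both divisors are zero, which is the case the answer's `if oldrange == 0` branch handles.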

Answered by CT Zhu

The range of an array is provided by NumPy's built-in function numpy.ptp(). Your question can be addressed by:

import numpy as np

# First, filter input_array so that it does not contain NaN or Inf.
input_array = np.array(some_data)
if np.unique(input_array).shape[0] == 1:
    pass  # do something if input_array is constant
else:
    result_array = (input_array - np.min(input_array)) / np.ptp(input_array)
# To extend it to higher dimensions, add the axis= kwarg to np.min and np.ptp
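The last comment above, about extending to higher dimensions, can be sketched as follows for a 2-D array whose columns are scaled independently. The function name `minmax_columns` and the sample data are hypothetical, and sending constant columns to 0 is an assumption made here for illustration.

```python
import numpy as np


def minmax_columns(a):
    """Scale each column of a 2-D array to [0, 1] using np.min and np.ptp."""
    a = np.asarray(a, dtype=float)
    lo = np.min(a, axis=0)      # per-column minimum
    span = np.ptp(a, axis=0)    # per-column range (max - min)
    # Assumption: constant columns (span == 0) are mapped to 0 by
    # dividing by 1 instead of by 0.
    safe_span = np.where(span == 0, 1.0, span)
    return (a - lo) / safe_span


# Hypothetical data: the third column is constant.
data = [[1, 10, 5],
        [2, 20, 5],
        [3, 30, 5]]
result = minmax_columns(data)
print(result)
```

Broadcasting subtracts the per-column minimum from every row, so no explicit loop over columns is needed.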

Answered by Marissa Novak

Use scikit-learn: http://scikit-learn.org/stable/modules/preprocessing.html#scaling-features-to-a-range

It has built-in functions to scale features to a specified range. You'll find other functions to normalize and standardize here.

See this example:

>>> import numpy as np
>>> from sklearn import preprocessing
>>> X_train = np.array([[ 1., -1.,  2.],
...                     [ 2.,  0.,  0.],
...                     [ 0.,  1., -1.]])
...
>>> min_max_scaler = preprocessing.MinMaxScaler()
>>> X_train_minmax = min_max_scaler.fit_transform(X_train)
>>> X_train_minmax
array([[ 0.5       ,  0.        ,  1.        ],
       [ 1.        ,  0.5       ,  0.33333333],
       [ 0.        ,  1.        ,  0.        ]])

Answered by user3284005

Use the following method to normalize your data to the range 0 to 1, using the min and max values of the data sequence:

import numpy as np

def NormalizeData(data):
    return (data - np.min(data)) / (np.max(data) - np.min(data))
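As a usage sketch, the function above can be applied to two series with very different ranges so that both end up on the same [0, 1] scale for plotting. The two sample arrays are hypothetical. Note that for constant data the denominator is zero, so NumPy would return NaN values rather than a usable result.

```python
import numpy as np


def NormalizeData(data):
    return (data - np.min(data)) / (np.max(data) - np.min(data))


# Hypothetical series with very different ranges.
series_a = np.array([10.0, 20.0, 40.0])
series_b = np.array([0.001, 0.003, 0.002])

norm_a = NormalizeData(series_a)
norm_b = NormalizeData(series_b)

# Both results now span exactly [0, 1] and can share one plot axis.
print(norm_a.min(), norm_a.max())
print(norm_b.min(), norm_b.max())
```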

Answered by Jay Dangar

A simple way to normalize values to between 0 and 1 is to divide all of them by the maximum value. This brings the values into the range 0 to 1, provided they are all non-negative and the maximum is positive.
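A small sketch of where dividing by the maximum falls short: with negative values in the data, the result no longer stays within [0, 1], whereas min-max scaling does. The sample list is hypothetical.

```python
# Hypothetical data containing a negative value.
values = [-2.0, 0.0, 4.0]

# Dividing by the maximum: the largest value maps to 1, but negatives stay negative.
by_max = [v / max(values) for v in values]

# Min-max scaling: shift by the minimum first, then divide by the range.
lo, hi = min(values), max(values)
by_min_max = [(v - lo) / (hi - lo) for v in values]

print(by_max)       # the first entry is below 0
print(by_min_max)   # all entries lie in [0, 1]
```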

Answered by R Zhang

scikit-learn has a function for this:
sklearn.preprocessing.minmax_scale(X, feature_range=(0, 1), axis=0, copy=True)

This is more convenient than using the MinMaxScaler class.

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.minmax_scale.html#sklearn.preprocessing.minmax_scale
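A minimal usage sketch of minmax_scale, assuming scikit-learn and NumPy are installed. It reuses the X_train values from the MinMaxScaler example earlier on this page, and the variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

# Same values as the MinMaxScaler example in the earlier answer.
X = np.array([[1., -1.,  2.],
              [2.,  0.,  0.],
              [0.,  1., -1.]])

# axis=0 scales each column (feature) independently to feature_range.
X_scaled = minmax_scale(X, feature_range=(0, 1), axis=0)
print(X_scaled)
```

Unlike MinMaxScaler, this function does not keep a fitted object around, so it suits one-off scaling such as preparing data for a plot.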
