Source: http://stackoverflow.com/questions/48178884/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license.
Min-max normalisation of a NumPy array
Asked by mbilyanov
I have the following NumPy array:
import numpy as np
foo = np.array([[0.0, 10.0], [0.13216, 12.11837], [0.25379, 42.05027], [0.30874, 13.11784]])
which yields:
[[ 0.       10.     ]
 [ 0.13216  12.11837]
 [ 0.25379  42.05027]
 [ 0.30874  13.11784]]
How can I normalize the Y component (the second column) of this array, so that it gives me something like:
[[ 0.       0.    ]
 [ 0.13216  0.06  ]
 [ 0.25379  1.    ]
 [ 0.30874  0.097 ]]
Answered by cs95
Referring to the Cross Validated question "How to normalize data to 0-1 range?", it looks like you can perform min-max normalisation on the last column of foo:
v = foo[:, 1] # foo[:, -1] for the last column
foo[:, 1] = (v - v.min()) / (v.max() - v.min())
foo
array([[ 0.        ,  0.        ],
       [ 0.13216   ,  0.06609523],
       [ 0.25379   ,  1.        ],
       [ 0.30874   ,  0.09727968]])
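The in-place assignment above overwrites foo and divides by zero if every value in the column is identical. Here is a minimal sketch of the same formula wrapped in a helper that returns a copy and guards against a constant column. The name min_max_column and the choice to map a constant column to 0 are my own assumptions, not part of the original answer.

```python
import numpy as np

def min_max_column(arr, col):
    """Return a float copy of arr with column `col` rescaled to [0, 1]."""
    out = arr.astype(float)          # astype returns a new array by default
    v = out[:, col]
    span = v.max() - v.min()
    if span == 0:
        # Assumption: a constant column is mapped to 0 instead of dividing by zero.
        out[:, col] = 0.0
    else:
        out[:, col] = (v - v.min()) / span
    return out

foo = np.array([[0.0, 10.0],
                [0.13216, 12.11837],
                [0.25379, 42.05027],
                [0.30874, 13.11784]])

foo_scaled = min_max_column(foo, 1)  # foo itself is left untouched
print(foo_scaled)
```

Keeping the original array intact makes it easier to compare several normalisation methods on the same data.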
Another option for performing normalisation (as suggested by the OP) is sklearn.preprocessing.normalize. With norm='max' it divides each value by the column maximum instead of subtracting the minimum first, so it yields different results:
from sklearn.preprocessing import normalize
foo[:, [-1]] = normalize(foo[:, -1, None], norm='max', axis=0)
foo
array([[ 0.        ,  0.2378106 ],
       [ 0.13216   ,  0.28818769],
       [ 0.25379   ,  1.        ],
       [ 0.30874   ,  0.31195614]])
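To see why the two results differ, here is a small NumPy-only sketch that computes both formulas on the Y column. It assumes that dividing by the largest absolute value reproduces what norm='max' does for this data, which matches the numbers shown above. The variable names are mine.

```python
import numpy as np

# The Y column from the question.
y = np.array([10.0, 12.11837, 42.05027, 13.11784])

# Max scaling: divide by the largest absolute value. The minimum is NOT
# shifted to 0, so the smallest value stays at 10 / 42.05027.
max_abs_scaled = y / np.abs(y).max()

# Min-max scaling: shift the minimum to 0, then divide by the range.
min_max_scaled = (y - y.min()) / (y.max() - y.min())

print(max_abs_scaled)
print(min_max_scaled)
```

Both methods map the largest value to 1, but only min-max scaling maps the smallest value to 0.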
Answered by rnso
sklearn.preprocessing.MinMaxScaler can also be used (feature_range=(0, 1) is the default):
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
v = foo[:, [1]]  # keep the column 2-D: the scaler expects shape (n_samples, n_features)
v_scaled = min_max_scaler.fit_transform(v)
foo[:, [1]] = v_scaled
print(foo)
Output:
[[ 0.          0.        ]
 [ 0.13216     0.06609523]
 [ 0.25379     1.        ]
 [ 0.30874     0.09727968]]
An advantage is that the data can be scaled to any target range through the feature_range parameter.
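As a rough illustration of scaling to an arbitrary range, the sketch below reproduces the arithmetic in plain NumPy: first scale to [0, 1], then stretch and shift to [lo, hi]. With scikit-learn you would pass feature_range=(lo, hi) to MinMaxScaler instead. The helper name rescale and its parameters are hypothetical.

```python
import numpy as np

def rescale(v, lo=0.0, hi=1.0):
    """Min-max scale a 1-D array to the interval [lo, hi]."""
    v = np.asarray(v, dtype=float)
    unit = (v - v.min()) / (v.max() - v.min())  # values in [0, 1]
    return unit * (hi - lo) + lo                # stretch and shift

y = np.array([10.0, 12.11837, 42.05027, 13.11784])
y_sym = rescale(y, -1.0, 1.0)  # smallest value maps to -1, largest to 1
print(y_sym)
```

Note that this sketch does not guard against a constant input, where the range is zero.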
Answered by James
I think you want this:
foo[:,1] = (foo[:,1] - foo[:,1].min()) / (foo[:,1].max() - foo[:,1].min())
Answered by yellow01
You are trying to min-max scale only the second column to the range 0 to 1.
Using sklearn.preprocessing.minmax_scale should easily solve your problem.
For example:
import numpy as np
from sklearn.preprocessing import minmax_scale
column_1 = foo[:, 0]  # first column, left unscaled
column_2 = minmax_scale(foo[:, 1], feature_range=(0, 1))  # second column, scaled to [0, 1]
foo_norm = np.stack((column_1, column_2), axis=1)  # stack both columns back into a 2-D array
This should yield:
array([[0.        , 0.        ],
       [0.13216   , 0.06609523],
       [0.25379   , 1.        ],
       [0.30874   , 0.09727968]])
If you instead want to min-max scale both columns to the range 0 to 1, use:
foo_norm = minmax_scale(foo, feature_range=(0,1), axis=0)
which yields:
array([[0.        , 0.        ],
       [0.42806245, 0.06609523],
       [0.82201853, 1.        ],
       [1.        , 0.09727968]])
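For comparison, the same column-wise scaling can be sketched in plain NumPy with broadcasting: taking min and max along axis=0 gives one value per column, which is then applied to every row. This is an equivalent formulation under that assumption, not the scikit-learn implementation. The variable names are mine.

```python
import numpy as np

foo = np.array([[0.0, 10.0],
                [0.13216, 12.11837],
                [0.25379, 42.05027],
                [0.30874, 13.11784]])

col_min = foo.min(axis=0)  # shape (2,): one minimum per column
col_max = foo.max(axis=0)  # shape (2,): one maximum per column

# Broadcasting applies each column's min and range to all rows.
foo_norm = (foo - col_min) / (col_max - col_min)
print(foo_norm)
```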
Note: min-max scaling is not to be confused with the operation that scales the norm (length) of a vector to a certain value (usually 1), which is also commonly referred to as normalization.
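A short sketch of that distinction, using np.linalg.norm for the Euclidean length. The choice of the L2 norm here is an assumption for illustration, since other norms are also used in practice.

```python
import numpy as np

y = np.array([10.0, 12.11837, 42.05027, 13.11784])

# Unit-norm scaling: the vector's Euclidean length becomes 1.
unit_norm = y / np.linalg.norm(y)

# Min-max scaling: the values are mapped onto the interval [0, 1].
min_max = (y - y.min()) / (y.max() - y.min())

print(np.linalg.norm(unit_norm))     # length of the rescaled vector
print(min_max.min(), min_max.max())  # bounds after min-max scaling
```

The unit-norm result has length 1 but its entries do not span 0 to 1, while the min-max result spans 0 to 1 but generally does not have length 1.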