Understanding numpy.linalg.norm() in IPython
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/22027767/
Asked by user2635779
I'm creating a linear regression model for supervised learning.
I have a bunch of data points plotted on a graph, (x1, y1), (x2, y2), (x3, y3), etc., where the x values are the real data and the y values are the training data.
As part of the next step in writing a basic nearest neighbor algorithm, I want to create a distance metric to measure the distance (and similarity) between two instances.
If I wanted to write a generic function to compute the L-Norm distance in ipython, I know that a lot of people use numpy.linalg.norm(arr, ord = , axis=). What I'm confused about is how to format my array of data points so that it properly calculates the L-norm values.
If I had just two data points, say (3, 4) and (5, 9), would my array need to look like this with each data point's values in one row?
arry = np.array([[3, 4],
                 [5, 9]])
or would it need to look like this where all the x-axis values are in one row and y in another?
arry = np.array([[3, 5],
                 [4, 9]])
Answered by wflynny
numpy.linalg.norm(x) == numpy.linalg.norm(x.T), where .T denotes the transpose. So with the default arguments the layout doesn't matter.
For example:
>>> import numpy as np
>>> x = np.random.rand(5000, 2)
>>> x.shape
(5000, 2)
>>> x.T.shape
(2, 5000)
>>> np.linalg.norm(x)
57.82467111195578
>>> np.linalg.norm(x.T)
57.82467111195578
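With the default `ord=None`, a 2-D array gets the Frobenius norm, which is why transposing changes nothing. As a minimal sketch (the array `pts` is illustrative, built from the two points in the question), passing `axis` is where the layout starts to matter:

```python
import numpy as np

# Illustrative array: one point per row, (3, 4) and (5, 9)
pts = np.array([[3.0, 4.0],
                [5.0, 9.0]])

# Default call: Frobenius norm over all entries, unchanged by transposing
frob = np.linalg.norm(pts)
frob_t = np.linalg.norm(pts.T)

# With axis=1 each row is treated as its own vector
row_norms = np.linalg.norm(pts, axis=1)

# With axis=0 each column is treated as its own vector
col_norms = np.linalg.norm(pts, axis=0)

print(frob, frob_t)
print(row_norms)
print(col_norms)
```

If each data point sits in one row, `axis=1` gives one norm per point; with the transposed layout the same result needs `axis=0`.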
Edit:
Given that your array is basically
x = [[real_1, training_1],
[real_2, training_2],
...
[real_n, training_n]]
then the Frobenius norm is basically computing
np.sqrt(np.sum(x**2))
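A quick check of that identity, as a sketch on random data (the seed and array size are arbitrary choices, not from the original answer):

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, only for reproducibility
x = rng.random((1000, 2))

manual = np.sqrt(np.sum(x**2))            # square every entry, sum, take the root
builtin = np.linalg.norm(x)               # ord=None on a 2-D array
explicit = np.linalg.norm(x, ord='fro')   # Frobenius norm requested by name

print(np.isclose(manual, builtin), np.isclose(builtin, explicit))
```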
Are you sure this is the right metric? There are a whole bunch of other norms. Here are three:
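np.sqrt(np.sum((x[:,0] - x[:,1])**2))            # Euclidean distance between the two columns
np.sqrt(np.sum(x[:,0]**2) + np.sum(x[:,1]**2))   # L^2 norm of all entries (same value as the Frobenius norm)
np.sqrt(x[:,0].dot(x[:,1]))                      # square root of the dot product of the two columns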
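For the nearest-neighbor use case in the question, one possible approach is to take the norm of the difference between two points rather than the norm of the whole data array. The sketch below assumes each point is a row; the helper names `lp_distance` and `lp_distances_to_all` are hypothetical, not part of NumPy or the original answer.

```python
import numpy as np

def lp_distance(a, b, p=2):
    """L-p distance between two points given as 1-D sequences (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.linalg.norm(a - b, ord=p)

def lp_distances_to_all(points, query, p=2):
    """L-p distance from `query` to every row of `points` (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    query = np.asarray(query, dtype=float)
    # Broadcasting subtracts `query` from each row; axis=1 takes one norm per row
    return np.linalg.norm(points - query, ord=p, axis=1)

d2 = lp_distance([3, 4], [5, 9])                 # Euclidean: sqrt(2**2 + 5**2)
d1 = lp_distance([3, 4], [5, 9], p=1)            # Manhattan: 2 + 5
dinf = lp_distance([3, 4], [5, 9], p=np.inf)     # Chebyshev: max(2, 5)
to_all = lp_distances_to_all([[3, 4], [5, 9]], [3, 4])

print(d2, d1, dinf, to_all)
```

Keeping one point per row means a new query point can be compared against the whole training set in a single call.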

