Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow, original URL: http://stackoverflow.com/questions/22027767/

Date: 2020-08-19 00:06:49  Source: igfitidea

Understanding numpy.linalg.norm() in IPython

Tags: python, arrays, numpy, ipython

Asked by user2635779

I'm creating a linear regression model for supervised learning.

I have a bunch of data points plotted on a graph, (x1, y1), (x2, y2), (x3, y3), etc., where the x values are the real data and the y values are the training data.

As part of the next step in writing a basic nearest neighbor algorithm, I want to create a distance metric to measure the distance (and similarity) between two instances.

If I wanted to write a generic function to compute the L-norm distance in IPython, I know that a lot of people use numpy.linalg.norm(arr, ord=, axis=). What I'm confused about is how to format my array of data points so that it properly calculates the L-norm values.

If I had just two data points, say (3, 4) and (5, 9), would my array need to look like this with each data point's values in one row?

import numpy as np
arry = np.array([[3, 4],   # one data point per row
                 [5, 9]])

Or would it need to look like this, with all the x values in one row and all the y values in another?

arry = np.array([[3, 5],   # x values in the first row
                 [4, 9]])  # y values in the second row

Answered by wflynny

numpy.linalg.norm(x) == numpy.linalg.norm(x.T), where .T denotes the transpose. So with the default arguments, it doesn't matter.

For example:

>>> import numpy as np
>>> x = np.random.rand(5000, 2)
>>> x.shape
(5000, 2)
>>> x.T.shape
(2, 5000)
>>> np.linalg.norm(x)
57.82467111195578
>>> np.linalg.norm(x.T)
57.82467111195578
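
The equality above holds for the default call, which treats the whole 2-D array as one object and returns a single Frobenius norm. Once `axis` is passed, the layout does matter. A minimal sketch using the two points from the question (the variable names are illustrative, not from the original post):

```python
import numpy as np

# The two points from the question, one point per row.
points = np.array([[3.0, 4.0],
                   [5.0, 9.0]])

# Default call: one Frobenius norm for the whole array, unchanged by transposing.
whole = np.linalg.norm(points)
whole_t = np.linalg.norm(points.T)

# With axis=1 the norm is taken along each row, giving one value per point.
per_point = np.linalg.norm(points, axis=1)

# On the transposed array, axis=1 groups all x values together and all y values together.
per_coordinate = np.linalg.norm(points.T, axis=1)

print(whole, whole_t)
print(per_point)
print(per_coordinate)
```

So if a per-point norm is wanted, the layout with one data point per row pairs naturally with `axis=1`.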

Edit:

Given that your array is basically

x = [[real_1, training_1],
     [real_2, training_2],
      ...
     [real_n, training_n]]

then the Frobenius norm (the default for a 2-D array) is basically computing

np.sqrt(np.sum(x**2))
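
As a quick check of that identity, here is a small sketch (the array values are arbitrary stand-ins for the real/training pairs, not data from the question):

```python
import numpy as np

# Arbitrary stand-in for the [real, training] rows described above.
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

frobenius = np.linalg.norm(x)      # default: ord=None, axis=None
by_hand = np.sqrt(np.sum(x**2))    # square every entry, sum, take the root

print(np.isclose(frobenius, by_hand))
```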

Are you sure this is the right metric? There are a whole bunch of other norms. Here are three:

np.sqrt(np.sum((x[:,0] - x[:,1])**2)) # Euclidean distance between the two columns
np.sqrt(np.sum(x[:,0]**2) + np.sum(x[:,1]**2)) # L^2 norm of all entries (equals the Frobenius norm above)
np.sqrt(x[:,0].dot(x[:,1])) # sqrt of the dot product of the two columns
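
For the nearest-neighbor use case in the question, the distance between two instances is usually the norm of their difference rather than the norm of the whole data array. A minimal sketch, assuming each instance is a 1-D vector; `lp_distance` is a hypothetical helper name, not part of NumPy:

```python
import numpy as np

def lp_distance(a, b, p=2):
    """Return the L-p distance between two points given as 1-D sequences.

    Hypothetical helper: it forwards the difference vector to
    np.linalg.norm, whose `ord` argument selects the norm for 1-D input.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.linalg.norm(a - b, ord=p)

p1, p2 = (3, 4), (5, 9)
print(lp_distance(p1, p2, p=1))       # Manhattan distance: |3-5| + |4-9|
print(lp_distance(p1, p2, p=2))       # Euclidean distance: sqrt(2**2 + 5**2)
print(lp_distance(p1, p2, p=np.inf))  # Chebyshev distance: max(|3-5|, |4-9|)
```

To compare one query point against many stored points at once, the same idea should work as `np.linalg.norm(points - query, axis=1)`, where each row of `points` is one instance.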