Find Euclidean distance from a point to rows in a pandas DataFrame
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/46908388/
Asked by Shubham R
I have a dataframe:
id lat long
1 12.654 15.50
2 14.364 25.51
3 17.636 32.53
5 12.334 25.84
9 32.224 15.74
I want to find the Euclidean distance of these coordinates from a particular location saved in a list L1:
L1 = [11.344,7.234]
I want to create a new column in df that holds the distances:
id lat long distance
1 12.654 15.50
2 14.364 25.51
3 17.636 32.53
5 12.334 25.84
9 32.224 15.74
I know how to find the Euclidean distance between two points using math.hypot():
dist = math.hypot(x2 - x1, y2 - y1)
How do I write a function, using apply or by iterating over rows, that gives me the distances?
Answer by Zero
Use a vectorized approach:
In [5463]: (df[['lat', 'long']] - np.array(L1)).pow(2).sum(1).pow(0.5)
Out[5463]:
0 8.369161
1 18.523838
2 26.066777
3 18.632320
4 22.546096
dtype: float64
Which can also be written with .sub() and assigned to a new column:
In [5468]: df['distance'] = df[['lat', 'long']].sub(np.array(L1)).pow(2).sum(1).pow(0.5)
In [5469]: df
Out[5469]:
id lat long distance
0 1 12.654 15.50 8.369161
1 2 14.364 25.51 18.523838
2 3 17.636 32.53 26.066777
3 5 12.334 25.84 18.632320
4 9 32.224 15.74 22.546096
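The snippets above are IPython session output and assume that df, L1 and np already exist. A minimal self-contained sketch follows, assuming the dataframe is built by hand from the question's sample data. The variable name point is only illustrative and does not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 5, 9],
    "lat": [12.654, 14.364, 17.636, 12.334, 32.224],
    "long": [15.50, 25.51, 32.53, 25.84, 15.74],
})
L1 = [11.344, 7.234]

# Illustrative name: the reference point as an array, in (lat, long) order.
point = np.array(L1)

# Subtract the point column-wise, square, sum across each row, take the root.
df["distance"] = df[["lat", "long"]].sub(point).pow(2).sum(axis=1).pow(0.5)
print(df)
```

The order of the values in L1 has to match the order of the selected columns, because the subtraction pairs them by position.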
Option 2: Use NumPy's built-in np.linalg.norm vector norm.
In [5473]: np.linalg.norm(df[['lat', 'long']].sub(np.array(L1)), axis=1)
Out[5473]: array([ 8.36916101, 18.52383805, 26.06677732, 18.63231966, 22.5460958 ])
In [5485]: df['distance'] = np.linalg.norm(df[['lat', 'long']].sub(np.array(L1)), axis=1)
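The question asked specifically about apply and math.hypot(). For comparison, a row-wise version might look like the sketch below. The helper name distance_to_point is hypothetical and not taken from any answer, and the dataframe is again rebuilt from the question's sample data. A row-wise apply calls a Python function once per row, so it is generally slower than the vectorized options above on large frames.

```python
import math

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 5, 9],
    "lat": [12.654, 14.364, 17.636, 12.334, 32.224],
    "long": [15.50, 25.51, 32.53, 25.84, 15.74],
})
L1 = [11.344, 7.234]


def distance_to_point(row, point):
    """Hypothetical helper: Euclidean distance from one row to `point`."""
    return math.hypot(row["lat"] - point[0], row["long"] - point[1])


# axis=1 passes each row as a Series; args forwards L1 as the `point` argument.
df["distance"] = df.apply(distance_to_point, axis=1, args=(L1,))
print(df)
```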
Answer by cs95
Translating [(x2 - x1)^2 + (y2 - y1)^2]^(1/2) into pandas vectorised operations, you have:
df['distance'] = (df.lat.sub(11.344).pow(2).add(df.long.sub(7.234).pow(2))).pow(.5)
df
lat long distance
id
1 12.654 15.50 8.369161
2 14.364 25.51 18.523838
3 17.636 32.53 26.066777
5 12.334 25.84 18.632320
9 32.224 15.74 22.546096
Alternatively, using arithmetic operators:
(((df.lat - 11.344) ** 2) + (df.long - 7.234) ** 2) ** .5
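Another possible spelling of the same formula, not part of the original answers, is NumPy's element-wise np.hypot, the array counterpart of the math.hypot() call mentioned in the question. The sketch below assumes id is set as the index, as in the output shown above. The names x0 and y0 are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, with 'id' as the index.
df = pd.DataFrame({
    "id": [1, 2, 3, 5, 9],
    "lat": [12.654, 14.364, 17.636, 12.334, 32.224],
    "long": [15.50, 25.51, 32.53, 25.84, 15.74],
}).set_index("id")

# Illustrative names for the reference point from L1.
x0, y0 = 11.344, 7.234

# np.hypot computes sqrt(a**2 + b**2) element-wise for the two columns.
df["distance"] = np.hypot(df["lat"] - x0, df["long"] - y0)
print(df)
```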