Original question: http://stackoverflow.com/questions/12923586/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Nearest Neighbor Search: Python
Asked by Dlinet
I have a 2 dimensional array:
MyArray = array([[6588252.24, 1933573.3, 212.79, 0, 0],
                 [6588253.79, 1933602.89, 212.66, 0, 0],
                 etc...])
The first two elements MyArray[0] and MyArray[1] are the X and Y coordinates of the points.
For every element in the array, I would like to find the quickest way to return its single nearest neighbor within a radius of X units. We are assuming this is in 2D space.
Let's say for this example X = 6.
I have solved the problem by comparing every element to every other element, but this takes 15 minutes or so when the list is 22k points long. We hope to eventually run this on lists of about 30 million points.
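For reference, the all-pairs comparison described above might look roughly like the sketch below. The asker's actual code is not shown, so the function name `brute_force_nearest`, the sample coordinates, and the choice to mark "no neighbor in range" with an infinite distance and an index of -1 are all assumptions made for illustration. The work grows with the square of the number of points, which is why it becomes impractical for large lists.

```python
import numpy as np

def brute_force_nearest(points_xy, radius):
    """Hypothetical baseline: compare every point with every other point."""
    points_xy = np.asarray(points_xy, dtype=float)
    n = len(points_xy)
    nearest_dist = np.full(n, np.inf)      # inf means "nothing within radius"
    nearest_idx = np.full(n, -1)           # -1 means "nothing within radius"
    for i in range(n):
        # Euclidean distance from point i to every point in the array.
        d = np.hypot(points_xy[:, 0] - points_xy[i, 0],
                     points_xy[:, 1] - points_xy[i, 1])
        d[i] = np.inf                      # a point is not its own neighbor
        j = int(np.argmin(d))
        if d[j] <= radius:
            nearest_dist[i] = d[j]
            nearest_idx[i] = j
    return nearest_dist, nearest_idx

# Made-up sample coordinates (X, Y only).
sample_xy = np.array([[0.0, 0.0], [3.0, 4.0], [100.0, 100.0]])
bf_dist, bf_idx = brute_force_nearest(sample_xy, radius=6)
```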
I have read about K-d trees and understand the basic concept, but have had trouble understanding how to script them.
Accepted answer by Dlinet
Thanks to John Vinyard for suggesting scipy. After some good research and testing, here is the solution to this question:
Prerequisites: Install NumPy and SciPy
1. Import the SciPy and NumPy modules.
2. Make a copy of the 5-column array that includes just the X and Y values.
3. Create an instance of a cKDTree as such:
    YourTreeName = scipy.spatial.cKDTree(YourArray, leafsize=100)
    # Play with the leafsize to get the fastest result for your dataset
4. Query the cKDTree for the nearest neighbor within 6 units as such:
    for item in YourArray:
        TheResult = YourTreeName.query(item, k=1, distance_upper_bound=6)
   For each item in YourArray, TheResult will be a tuple of the distance between the two points and the index of the location of the point in YourArray.
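The steps above can be put together as a minimal runnable sketch. It departs from the answer in two ways, both deliberate: it queries all points in one call instead of looping, and it asks for k=2 rather than k=1. When the query points are the same points the tree was built from, the closest match to each point is that point itself at distance 0, so the second result is the actual nearest neighbor. The function name `nearest_neighbors_within`, the sample values, and the use of -1 for "no neighbor in range" are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_neighbors_within(points_xy, radius, leafsize=100):
    """Return (distance, index) of each point's nearest other point within radius."""
    points_xy = np.asarray(points_xy, dtype=float)
    tree = cKDTree(points_xy, leafsize=leafsize)
    # k=2: column 0 is the query point itself (distance 0),
    # column 1 is the nearest other point, if one lies within the bound.
    dist, idx = tree.query(points_xy, k=2, distance_upper_bound=radius)
    nearest_dist = dist[:, 1]
    nearest_idx = idx[:, 1].copy()
    # SciPy reports a miss as distance inf and index len(points);
    # map that index to -1 here so it cannot be mistaken for a real row.
    nearest_idx[np.isinf(nearest_dist)] = -1
    return nearest_dist, nearest_idx

# Made-up rows in the same 5-column layout as the question.
my_array = np.array([
    [0.0, 0.0, 212.79, 0, 0],
    [3.0, 4.0, 212.66, 0, 0],
    [100.0, 100.0, 210.00, 0, 0],
])
xy = my_array[:, :2].copy()   # keep only the X and Y columns
dist, idx = nearest_neighbors_within(xy, radius=6)
```

If the data contains exact duplicate coordinates, the two zero-distance matches may come back in either order, so the second column can occasionally be the query point itself; that case would need separate handling.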

