Original question: http://stackoverflow.com/questions/19668463/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Generating multiple random (x, y) coordinates, excluding duplicates?
Asked by user2901745
I want to generate a bunch of (x, y) coordinates from 0 to 2500, excluding points that are within 200 of each other, without using recursion.
Right now I check each new point against a list of all previous points to see whether it is far enough from all of them. This is really inefficient, and if I need to generate a large number of points it takes forever.
So how would I go about doing this?
Answered by Hooked
I would overgenerate the points (so that target_N < input_N) and filter them using a KDTree. For example:
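import numpy as np
from scipy.spatial import KDTree

N = 20
pts = 2500*np.random.random((N,2))  # N random points in the [0, 2500) square
tree = KDTree(pts)
# sparse matrix holding only the pairwise distances that are at most 200
print(tree.sparse_distance_matrix(tree, 200))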
This gives me the pairs of points that are "close" to each other (within 200), together with their distances. From here it should be simple to apply any filter:
(11, 0) 60.843426339
(0, 11) 60.843426339
(1, 3) 177.853472309
(3, 1) 177.853472309
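The filtering step is left open above. One possible filter, as a minimal sketch, is sketched here: it uses `KDTree.query_pairs` to list the close pairs and then greedily drops the second point of every pair whose two members are both still present. The helper name `filter_min_distance` and the greedy rule are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
from scipy.spatial import KDTree

def filter_min_distance(pts, min_dist):
    """Drop points until no two remaining points are within min_dist (hypothetical helper)."""
    tree = KDTree(pts)
    keep = np.ones(len(pts), dtype=bool)
    # query_pairs returns index pairs (i, j), i < j, whose distance is at most min_dist
    for i, j in sorted(tree.query_pairs(min_dist)):
        if keep[i] and keep[j]:
            keep[j] = False  # greedy choice: keep the lower index, discard the other
    return pts[keep]

input_N = 200
pts = 2500 * np.random.random((input_N, 2))
filtered = filter_min_distance(pts, 200)
print(len(filtered), "points survive the filter")
```

The number of surviving points is not known in advance, so in practice input_N would need to be raised (or the process repeated) until at least target_N points remain.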
Answered by jwodder
This is a variant on Hank Ditton's suggestion that should be more efficient time- and memory-wise, especially if you're selecting relatively few points out of all possible points. The idea is that, whenever a new point is generated, everything within 200 units of it is added to a set of points to exclude, against which all freshly-generated points are checked.
import random

radius = 200
rangeX = (0, 2500)
rangeY = (0, 2500)
qty = 100  # or however many points you want

# Generate a set of all points within 200 of the origin, to be used as offsets later
# There's probably a more efficient way to do this.
deltas = set()
for x in range(-radius, radius+1):
    for y in range(-radius, radius+1):
        if x*x + y*y <= radius*radius:
            deltas.add((x,y))

randPoints = []
excluded = set()
i = 0
# Note: this loop never terminates if qty is more than the area can hold.
while i<qty:
    x = random.randrange(*rangeX)
    y = random.randrange(*rangeY)
    if (x,y) in excluded: continue
    randPoints.append((x,y))
    i += 1
    # exclude every integer point within `radius` of the new point
    excluded.update((x+dx, y+dy) for (dx,dy) in deltas)
print(randPoints)
Answered by damienfrancois
Some options:
- Use your algorithm but implement it with a kd-tree, which would speed up nearest-neighbour look-ups
- Build a regular grid over the [0, 2500]^2 square and 'shake' all points randomly with a bi-dimensional normal distribution centered on each intersection in the grid
- Draw a larger number of random points, then apply a k-means algorithm and only keep the centroids. They will be far away from one another and the algorithm, though iterative, could converge more quickly than your algorithm.
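The second option (a regular grid that is then "shaken") can be sketched as follows. This is only an illustration under stated assumptions: the function name `jittered_grid`, the grid spacing of 400, and the standard deviation of 30 are all made up here, and the normal jitter is clipped to `(spacing - min_dist) / 2` per axis, which is an addition to the original suggestion so that the minimum distance is actually guaranteed.

```python
import numpy as np

def jittered_grid(extent=2500, spacing=400, min_dist=200, sigma=30, seed=None):
    """Regular grid with clipped normal jitter (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # grid intersections, kept half a cell away from the lower border
    ticks = np.arange(spacing / 2, extent, spacing)
    gx, gy = np.meshgrid(ticks, ticks)
    centres = np.column_stack([gx.ravel(), gy.ravel()])
    # two distinct intersections differ by at least `spacing` along some axis,
    # so limiting each shift to max_shift keeps them at least min_dist apart
    max_shift = (spacing - min_dist) / 2
    jitter = np.clip(rng.normal(scale=sigma, size=centres.shape), -max_shift, max_shift)
    return centres + jitter

points = jittered_grid(seed=0)
print(len(points), "points generated")
```

The trade-off is that the result looks more regular than a purely random draw, and the number of points is fixed by the grid spacing rather than chosen freely.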
Answered by aganders3
This has been answered, but it's very tangentially related to my work so I took a stab at it. I implemented the algorithm described in this note, which I found linked from this blog post. Unfortunately it's not faster than the other proposed methods, but I'm sure there are optimizations to be made.
import numpy as np
import matplotlib.pyplot as plt

def lonely(p,X,r):
    # True if no sample lies in the square of half-width r around p
    m = X.shape[1]
    x0,y0 = p
    x = y = np.arange(-r,r)
    x = x + x0
    y = y + y0
    u,v = np.meshgrid(x,y)
    # clamp the neighbourhood to the bounds of the grid
    u[u < 0] = 0
    u[u >= m] = m-1
    v[v < 0] = 0
    v[v >= m] = m-1
    return not np.any(X[u[:],v[:]] > 0)

def generate_samples(m=2500,r=200,k=30):
    # m = extent of sample domain
    # r = minimum distance between points
    # k = samples before rejection
    active_list = []
    # step 0 - initialize n-d background grid
    X = np.ones((m,m))*-1
    # step 1 - select initial sample
    x0,y0 = np.random.randint(0,m), np.random.randint(0,m)
    active_list.append((x0,y0))
    X[active_list[0]] = 1
    # step 2 - iterate over active list
    while active_list:
        i = np.random.randint(0,len(active_list))
        rad = np.random.rand(k)*r+r
        theta = np.random.rand(k)*2*np.pi
        # get a list of random candidates within [r,2r] from the active point
        candidates = np.round((rad*np.cos(theta)+active_list[i][0], rad*np.sin(theta)+active_list[i][1])).astype(np.int32).T
        # trim the list based on boundaries of the array
        candidates = [(x,y) for x,y in candidates if x >= 0 and y >= 0 and x < m and y < m]
        for p in candidates:
            if X[p] < 0 and lonely(p,X,r):
                X[p] = 1
                active_list.append(p)
                break
        else:
            # no candidate was accepted: retire this active point
            del active_list[i]
    return X

X = generate_samples(2500, 200, 10)
s = np.where(X>0)
plt.plot(s[0],s[1],'.')
plt.show()
And the results: [figure: scatter plot of the generated points, not reproduced here]
Answered by Ender
Per the link, the method from aganders3 is known as Poisson Disc Sampling. You might be able to find more efficient implementations that use a local grid search to find 'overlaps'. For example, Poisson Disc Sampling. Because you are constraining the system, it cannot be completely random. The maximum packing for circles with uniform radii in a plane is ~90% (π/(2√3) ≈ 0.9069) and is achieved when the circles are arranged in a perfect hexagonal array. As the number of points you request approaches the theoretical limit, the generated arrangement will become more hexagonal. In my experience, it is difficult to get above ~60% packing with uniform circles using this approach.
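One way to read the "local grid search" idea is sketched below: random candidates are bucketed into square cells whose side equals the minimum distance, so a new candidate only has to be compared with points in the 3x3 block of cells around it. This is plain rejection sampling with a bucket grid, not the full Poisson disc algorithm, and the names `sample_points`, `max_tries`, and the other parameters are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict

def sample_points(qty, extent=2500, min_dist=200, max_tries=100_000, seed=None):
    """Rejection sampling with a bucket grid for local overlap checks (hypothetical helper)."""
    rng = random.Random(seed)
    cell = min_dist              # any conflicting point must lie in an adjacent cell
    grid = defaultdict(list)     # (cell_x, cell_y) -> points stored in that cell
    points = []
    for _ in range(max_tries):   # bounded, so an impossible qty cannot loop forever
        if len(points) == qty:
            break
        x, y = rng.uniform(0, extent), rng.uniform(0, extent)
        cx, cy = int(x // cell), int(y // cell)
        neighbours = (q for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                      for q in grid.get((cx + dx, cy + dy), ()))
        if all(math.hypot(x - qx, y - qy) > min_dist for qx, qy in neighbours):
            points.append((x, y))
            grid[(cx, cy)].append((x, y))
    return points

pts = sample_points(50, seed=0)
print(len(pts), "points generated")
```

Each candidate costs a constant number of distance checks regardless of how many points have been accepted, which is what avoids the "check against every previous point" slowdown described in the question. If fewer than `qty` points are returned, the request was probably too dense for the area within `max_tries` attempts.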