python 测试点是否在某个矩形中

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1897779/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-11-03 23:18:00  来源:igfitidea点击:

Test if point is in some rectangle

pythonalgorithmpoint

提问by pafcu

I have a large collection of rectangles, all of the same size. I am generating random points that should not fall in these rectangles, so what I wish to do is test if the generated point lies in one of the rectangles, and if it does, generate a new point.

我有一大堆矩形,所有的大小都一样。我正在生成不应落在这些矩形中的随机点,所以我希望做的是测试生成的点是否位于其中一个矩形中,如果是,则生成一个新点。

Using R-trees seem to work, but they are really meant for rectangles and not points. I could use a modified version of a R-tree algorithm which works with points too, but I'd rather not reinvent the wheel, if there is already some better solution. I'm not very familiar with data-structures, so maybe there already exists some structure that works for my problem?

使用 R 树似乎有效,但它们实际上适用于矩形而不是点。我可以使用 R 树算法的修改版本,它也适用于点,但如果已经有更好的解决方案,我宁愿不重新发明轮子。我对数据结构不是很熟悉,所以也许已经存在一些适合我的问题的结构?

In summary, basically what I'm asking is if anyone knows of a good algorithm, that works in Python, that can be used to check if a point lies in any rectangle in a given set of rectangles.

总而言之,基本上我要问的是,是否有人知道一个在 Python 中工作的好算法,可以用来检查一个点是否位于给定矩形集中的任何矩形中。

edit: This is in 2D and the rectangles are not rotated.

编辑:这是二维的,矩形没有旋转。

采纳答案by Roee Adler

This Reddit thread addresses your problem:

这个 Reddit 线程解决了您的问题:

I have a set of rectangles, and need to determine whether a point is contained within any of them. What are some good data structures to do this, with fast lookup being important?

我有一组矩形,需要确定一个点是否包含在其中任何一个中。有什么好的数据结构可以做到这一点,快速查找很重要?

If your universe is integer, or if the level of precision is well known and is not too high, you can use abelsson's suggestion from the thread, using O(1) lookup using coloring:

如果您的 Universe 是整数,或者如果精度水平众所周知并且不太高,您可以使用abelsson来自线程的建议,使用 O(1) 查找使用着色:

As usual you can trade space for time.. here is a O(1) lookup with very low constant. init: Create a bitmap large enough to envelop all rectangles with sufficient precision, initialize it to black. Color all pixels containing any rectangle white. O(1) lookup: is the point (x,y) white? If so, a rectangle was hit.

像往常一样,您可以用空间换取时间.. 这是一个 O(1) 查找,常数非常低。init:创建一个足够大的位图以足够的精度包围所有矩形,将其初始化为黑色。将包含任何矩形的所有像素着色为白色。O(1) 查找:点 (x,y) 是白色的吗?如果是这样,则击中了一个矩形。

I recommend you go to that post and fully read ModernRonin's answer which is the most accepted one. I pasted it here:

我建议你去那篇文章并完整阅读ModernRonin的答案,这是最被接受的答案。我把它贴在这里:

First, the micro problem. You have an arbitrarily rotated rectangle, and a point. Is the point inside the rectangle?

There are many ways to do this. But the best, I think, is using the 2d vector cross product. First, make sure the points of the rectangle are stored in clockwise order. Then do the vector cross product with 1) the vector formed by the two points of the side and 2) a vector from the first point of the side to the test point. Check the sign of the result - positive is inside (to the right of) the side, negative is outside. If it's inside all four sides, it's inside the rectangle. Or equivalently, if it's outside any of the sides, it's outside the rectangle. More explanation here.

This method will take 3 subtracts per vector * times 2 vectors per side, plus one cross product per side which is three multiplies and two adds. 11 flops per side, 44 flops per rectangle.

If you don't like the cross product, then you could do something like: figure out the inscribed and circumscribed circles for each rectangle, check if the point inside the inscribed one. If so, it's in the rectangle as well. If not, check if it's outside the circumscribed rectangle. If so, it's outside the rectangle as well. If it falls between the two circles, you're f****d and you have to check it the hard way.

Finding if a point is inside a circle in 2d takes two subtractions and two squarings (= multiplies), and then you compare distance squared to avoid having to do a square root. That's 4 flops, times two circles is 8 flops - but sometimes you still won't know. Also this assumes that you don't pay any CPU time to compute the circumscribed or inscribed circles, which may or may not be true depending on how much pre-computation you're willing to do on your rectangle set.

In any event, it's probably not a great idea to test the point against every rectangle, especially if you have a hundred million of them.

Which brings us to the macro problem. How to avoid testing the point against every single rectangle in the set? In 2D, this is probably a quad-treeproblem. In 3d, what generic_handle said - an octree. Off the top of my head, I would probably implement it as a B+ tree. It's tempting to use d = 5, so that each node can have up to 4 children, since that maps so nicely onto the quad-tree abstraction. But if the set of rectangles is too big to fit into main memory (not very likely these days), then having nodes the same size as disk blocks is probably the way to go.

Watch out for annoying degenerate cases, like some data set that has ten thousand nearly identical rectangles with centers at the same exact point. :P

Why is this problem important? It's useful in computer graphics, to check if a ray intersects a polygon. I.e., did that sniper rifle shot you just made hit the person you were shooting at? It's also used in real-time map software, like say GPS units. GPS tells you the coordinates you're at, but the map software has to find where that point is in a huge amount of map data, and do it several times per second.

第一,微观问题。你有一个任意旋转的矩形和一个点。点在矩形内吗?

有很多方法可以做到这一点。但我认为最好的方法是使用二维向量叉积。首先,确保矩形的点按顺时针顺序存储。然后与1)边的两个点形成的向量和2)从边的第一个点到测试点的向量做向量叉积。检查结果的符号 - 正面在侧面(右侧),负面在侧面。如果它在所有四个边内,则它在矩形内。或者等效地,如果它在任何边之外,则它在矩形之外。更多解释在这里

此方法将每个向量进行 3 次减法 * 每边乘以 2 个向量,再加上每边一个叉积,即三个乘法和两个加法。每边 11 个触发器,每个矩形 44 个触发器。

如果您不喜欢叉积,那么您可以执行以下操作:计算每个矩形的内切圆和外切圆,检查内切圆内的点。如果是这样,它也在矩形中。如果不是,请检查它是否在外接矩形之外。如果是这样,它也在矩形之外。如果它落在两个圆圈之间,那你就完蛋了,你必须用艰难的方式检查它。

查找点是否在 2d 中的圆内需要两次减法和两次平方(= 乘法),然后比较距离的平方以避免必须做平方根。那是 4 次失败,乘以两个圆圈是 8 次失败 - 但有时你仍然不知道。此外,这假设您不花任何 CPU 时间来计算外接圆或内切圆,这可能是也可能不是真的,这取决于您愿意对矩形集进行多少预计算。

无论如何,针对每个矩形测试该点可能不是一个好主意,尤其是当您有数亿个矩形时。

这给我们带来了宏观问题。如何避免针对集合中的每个矩形测试点?在 2D 中,这可能是一个四叉树问题。在 3d 中,generic_handle 所说的 - 八叉树。在我的头顶上,我可能会将它实现为B+ 树。使用 d = 5 很诱人,这样每个节点最多可以有 4 个子节点,因为这很好地映射到四叉树抽象。但是,如果矩形集太大而无法放入主内存(现在不太可能),那么让节点与磁盘块大小相同可能是可行的方法。

注意令人讨厌的退化情况,例如某些数据集具有一万个几乎相同的矩形,其中心位于同一精确点。:P

为什么这个问题很重要?它在计算机图形学中很有用,可以检查光线是否与多边形相交。也就是说,你刚才射出的狙击步枪有没有击中你正在射击的人?它还用于实时地图软件,例如 GPS 单位。GPS 会告诉您您所在的坐标,但地图软件必须在大量地图数据中找到该点的位置,并且每秒执行几次。

Again, credit to ModernRonin...

再次,归功于ModernRonin......

回答by Jonathan Leffler

For rectangles that are aligned with the axes, you only need two points (four numbers) to identify the rectangle - conventionally, bottom-left and top-right corners. To establish whether a given point (Xtest, Ytest) overlaps with a rectangle (XBL, YBL, XTR, YTR) by testing both:

对于与轴对齐的矩形,您只需要两个点(四个数字)来标识矩形 - 通常是左下角和右上角。要确定给定点 (X test, Y test) 是否与矩形 (X BL, Y BL, X TR, Y TR)重叠,请同时测试:

  • Xtest>= XBL&& Xtest<= XTR
  • Ytest>= YBL&& Ytest<= YTR
  • X测试>= X BL&& X测试<= X TR
  • Y测试>= Y BL&& Y测试<= Y TR

Clearly, for a large enough set of points to test, this could be fairly time consuming. The question, then, is how to optimize the testing.

显然,对于足够大的一组要测试的点,这可能相当耗时。那么,问题是如何优化测试。

Clearly, one optimization is to establish the minimum and maximum X and Y values for the box surrounding all the rectangles (the bounding box): a swift test on this shows whether there is any need to look further.

显然,一种优化是为围绕所有矩形(边界框)的框建立最小和最大 X 和 Y 值:对此的快速测试表明是否需要进一步查看。

  • Xtest>= Xmin&& Xtest<= Xmax
  • Ytest>= Ymin&& Ytest<= Ymax
  • X测试>= X最小值&& X测试<= X最大值
  • Y test>= Y min&& Y test<= Y max

Depending on how much of the total surface area is covered with rectangles, you might be able to find non-overlapping sub-areas that contain rectangles, and you could then avoid searching those sub-areas that cannot contain a rectangle overlapping the point, again saving comparisons during the search at the cost of pre-computation of suitable data structures. If the set of rectangles is sparse enough, there may be no overlapping, in which case this degenerates into the brute-force search. Equally, if the set of rectangles is so dense that there are no sub-ranges in the bounding box that can be split up without breaking rectangles.

根据有多少总表面积被矩形覆盖,您可能能够找到包含矩形的非重叠子区域,然后您可以避免再次搜索那些不能包含与该点重叠的矩形的子区域以预先计算合适的数据结构为代价,在搜索过程中节省比较。如果矩形集足够稀疏,则可能没有重叠,在这种情况下,这将退化为蛮力搜索。同样,如果矩形集非常密集,以至于边界框中没有可以在不破坏矩形的情况下拆分的子范围。

However, you could also arbitrarily break up the bounding area into, say, quarters (half in each direction). You would then use a list of boxes which would include more boxes than in the original set (two or four boxes for each box that overlapped one of the arbitrary boundaries). The advantage of this is that you could then eliminate three of the four quarters from the search, reducing the amount of searching to be done in total - at the expense of auxilliary storage.

但是,您也可以任意将边界区域分成四等分(每个方向各一半)。然后,您将使用一个包含比原始集合更多的框的框列表(每个与任意边界重叠的框有两个或四个框)。这样做的好处是,您可以从搜索中消除四个四分之一中的三个,从而减少要完成的搜索总量 - 以牺牲辅助存储为代价。

So, there are space-time trade-offs, as ever. And pre-computation versus search trade-offs. If you are unlucky, the pre-computation achieves nothing (for example, there are two boxes only, and they don't overlap on either axis). On the other hand, it could achieve considerable search-time benefit.

因此,与以往一样,存在时空权衡。以及预计算与搜索权衡。如果不走运,预计算什么也没有(例如,只有两个框,并且它们在任何一个轴上都不重叠)。另一方面,它可以实现可观的搜索时间优势。

回答by Frank

I suggest you take a look at BSP trees(and possible quadtrees or octrees, links available on that page as well). They are used to partition the whole space recursively and allow you to quickly check for a point which rectangles you need to check at all.

我建议您查看BSP 树(以及可能的四叉树或八叉树,以及该页面上可用的链接)。它们用于递归地划分整个空间,并允许您快速检查一个点,您需要检查哪些矩形。

At minimum you just have one huge partition and need to check all rectangles, at maximum your partitions get so small, that they get down to the size of single rectangles. Of course the more fine-grained the partition, the longer you need to walk down the tree in order to find the rectangles you want to check.

至少你只有一个巨大的分区并且需要检查所有的矩形,最多你的分区变得如此之小,以至于它们缩小到单个矩形的大小。当然,分区的粒度越细,您需要沿着树向下走的时间越长,以便找到要检查的矩形。

However, you can freely decide how many rectangles are suitable to be checked for a point and then create the corresponding structure.

但是,您可以自由决定适合检查一个点的矩形数量,然后创建相应的结构。

Pay attention to overlapping rectangles though. As the BSP tree needs to be precomputed anyways, you may as well remove overlaps during that time, so you can get clear partitions.

不过要注意重叠的矩形。由于 BSP 树无论如何都需要预先计算,因此您最好删除这段时间内的重叠部分,以便获得清晰的分区。

回答by Eric O Lebigot

Your R-tree approach is the best approach I know of (that's the approach I would choose over quadtrees, B+ trees, or BSP trees, as R-trees seem convenient to build in your case). Caveat: I'm no expert, even though I remember a few things from my senior year university class of algorithmic!

你的 R 树方法是我所知道的最好的方法(这是我会选择四叉树、B+ 树或 BSP 树的方法,因为 R 树在你的情况下似乎很容易构建)。警告:我不是专家,尽管我记得我大学四年级算法课上的一些事情!

回答by John R Doner

Why not try this. It seems rather light on both computation and memory.

为什么不试试这个。在计算和内存方面似乎都比较轻松。

Consider the projections of all the rectangles onto the base line of your space. Denote that set of line intervals as

考虑所有矩形在空间基线上的投影。将该组线间隔表示为

 {[Rl1, Rr1], [Rl2, Rr2],..., [Rln, Rrn]}, ordered by increasing left coordinates. 

Now suppose your point is (x, y), start a search at the left of this set until you reach a line interval that contains the point x.

现在假设您的点是 (x, y),从该集合的左侧开始搜索,直到到达包含点 x 的行间隔。

If none does, your point (x,y) is outside all rectangles.

如果没有,则您的点 (x,y) 位于所有矩形之外。

If some do, say [Rlk, Rrk], ..., [Rlh, Rrh], (k <= h) then just check whether y is within the vertical extent of any of these rectangles.

如果有的话,说 [Rlk, Rrk], ..., [Rlh, Rrh], (k <= h) 然后检查 y 是否在这些矩形中的任何一个的垂直范围内。

Done.

完毕。

Good luck.

祝你好运。

John Doner

约翰·多纳