Python: find matching rows in a 2-dimensional NumPy array

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/25823608/

Date: 2020-08-18 23:41:51  Source: igfitidea

Find matching rows in 2 dimensional numpy array

python numpy scipy

Asked by b10hazard

I would like to get the indices of the rows in a 2-dimensional NumPy array that match a given row. For example, my array is this:

import numpy as np

vals = np.array([[0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3],
                 [0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3]])

I would like to get the indices of the rows that match [0, 1], which are indices 3 and 15. When I do something like numpy.where(vals == [0, 1]) I get...

(array([ 0,  3,  3,  4,  5,  6,  9, 12, 15, 15, 16, 17, 18, 21]), array([0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]))

I want index array([3, 15]).

Accepted answer by Aaron Hall

You need the np.where function to get the indexes:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

Or, as the documentation states:

If only condition is given, return condition.nonzero()

You could directly call .nonzero() on the array returned by .all:

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)

To break that down:

>>> vals == (0, 1)
array([[ True, False],
       [False, False],
       ...
       [ True, False],
       [False, False],
       [False, False]], dtype=bool)

and calling the .all method on that array (with axis=1) gives you True for each row where both elements are True:

>>> (vals == (0, 1)).all(axis=1)
array([False, False, False,  True, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False], dtype=bool)

and to get which indexes are True:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

or

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)
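
The steps above can be wrapped into a small helper. This is only a minimal sketch: the name find_row_indices is hypothetical (it is not part of NumPy), and it assumes arr is 2-dimensional and row has one entry per column.

```python
import numpy as np

def find_row_indices(arr, row):
    """Return indices of the rows of 2-D `arr` equal to `row` (hypothetical helper)."""
    arr = np.asarray(arr)
    row = np.asarray(row)
    # Broadcasting compares `row` against every row, element by element.
    matches = (arr == row).all(axis=1)
    # flatnonzero gives the positions of the True entries as a 1-D array.
    return np.flatnonzero(matches)

# The same 24 rows as in the question (12 distinct rows, repeated twice).
vals = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1],
                 [0, 2], [1, 2], [2, 2], [0, 3], [1, 3], [2, 3]] * 2)
print(find_row_indices(vals, [0, 1]))  # [ 3 15]
```

Using np.flatnonzero here avoids the one-element tuple that np.where and .nonzero() return for a 1-D condition.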


I find my solution a bit more readable, but as unutbu points out, the following may be faster, and returns the same value as (vals == (0, 1)).all(axis=1):

>>> (vals[:, 0] == 0) & (vals[:, 1] == 1)
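
For rows with more than two columns, the same column-by-column idea could be written as a loop. This is a sketch under assumptions: match_rows_by_column is a hypothetical name, the input is assumed to be 2-dimensional, and whether it beats the .all(axis=1) form will depend on the array shape and NumPy version.

```python
import numpy as np

def match_rows_by_column(arr, row):
    """Boolean mask of rows equal to `row`, built one column at a time (hypothetical helper)."""
    arr = np.asarray(arr)
    mask = np.ones(arr.shape[0], dtype=bool)
    for col, target in enumerate(row):
        # AND in the comparison for this column; each step works on a 1-D slice.
        mask &= (arr[:, col] == target)
    return mask

vals = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0, 1]])
mask = match_rows_by_column(vals, (0, 1))
print(np.flatnonzero(mask))  # [2 4]
```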

Answer by unutbu

In [5]: np.where((vals[:,0] == 0) & (vals[:,1]==1))[0]
Out[5]: array([ 3, 15])


I'm not sure why, but this is significantly faster than
np.where((vals == (0, 1)).all(axis=1)):

In [34]: vals2 = np.tile(vals, (1000,1))

In [35]: %timeit np.where((vals2 == (0, 1)).all(axis=1))[0]
1000 loops, best of 3: 808 μs per loop

In [36]: %timeit np.where((vals2[:,0] == 0) & (vals2[:,1]==1))[0]
10000 loops, best of 3: 152 μs per loop
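
To reproduce this comparison outside IPython, the standard-library timeit module can be used. The following is only a sketch: the function names with_all and by_column are made up for illustration, and the absolute numbers (and possibly the ratio) will vary with hardware and NumPy version, so no timings are shown.

```python
import timeit
import numpy as np

base = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1],
                 [0, 2], [1, 2], [2, 2], [0, 3], [1, 3], [2, 3]] * 2)
vals2 = np.tile(base, (1000, 1))  # 24000 rows, as in the session above

def with_all():
    # Compare whole rows, then reduce along the column axis.
    return np.where((vals2 == (0, 1)).all(axis=1))[0]

def by_column():
    # Compare each column separately and combine the 1-D masks.
    return np.where((vals2[:, 0] == 0) & (vals2[:, 1] == 1))[0]

# Both approaches must agree before their speed is compared.
assert np.array_equal(with_all(), by_column())

for fn in (with_all, by_column):
    seconds = timeit.timeit(fn, number=200)
    print(f"{fn.__name__}: {seconds / 200 * 1e6:.1f} µs per call")
```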

Answer by Eelco Hoogendoorn

Using the numpy_indexed package, you can simply write:

import numpy as np
import numpy_indexed as npi
print(np.flatnonzero(npi.contains([[0, 1]], vals)))
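
If installing an extra package is not an option, a similar membership test can be sketched with plain NumPy broadcasting. Note that this is a swapped-in technique and not how numpy_indexed works internally; rows_in is a hypothetical name, and the intermediate array has shape (n_rows, n_targets, n_cols), so it is only suited to a small set of target rows.

```python
import numpy as np

def rows_in(targets, arr):
    """Boolean mask: True where a row of `arr` equals any row of `targets` (hypothetical helper)."""
    targets = np.atleast_2d(targets)
    arr = np.asarray(arr)
    # (n_rows, 1, n_cols) == (1, n_targets, n_cols) -> (n_rows, n_targets, n_cols)
    equal = arr[:, None, :] == targets[None, :, :]
    # A row matches a target if all columns agree; keep it if any target matches.
    return equal.all(axis=2).any(axis=1)

vals = np.array([[0, 0], [1, 0], [0, 1], [2, 3], [0, 1]])
print(np.flatnonzero(rows_in([[0, 1]], vals)))          # [2 4]
print(np.flatnonzero(rows_in([[0, 1], [2, 3]], vals)))  # [2 3 4]
```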