Python NumPy: selecting a specific column index per row using a list of indexes

Question

Asked by Zee

I'm struggling to select specific columns per row of a NumPy matrix.

Suppose I have the following matrix, which I'll call X:

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]

I also have a list of column indexes, one per row, which I'll call Y:

[1, 0, 2]

I need to get the values:

[2]
[4]
[9]

Instead of a list of indexes Y, I can also produce a matrix with the same shape as X in which every entry is a bool (or an int that is 0 or 1) indicating whether that column is the required one for its row.

[0, 1, 0]
[1, 0, 0]
[0, 0, 1]

I know this can be done by iterating over the array and selecting the column values I need. However, this will be executed frequently on big arrays of data, so it has to run as fast as it can.

Is there a better solution?

Thank you.

Answer 1

Accepted answer by Slater Victoroff

If you've got a boolean array, you can select directly with it, like so:

>>> import numpy as np
>>> a = np.array([True, True, True, False, False])
>>> b = np.array([1,2,3,4,5])
>>> b[a]
array([1, 2, 3])

To match your initial example, you could do the following:

>>> a = np.array([[1,2,3], [4,5,6], [7,8,9]])
>>> b = np.array([[False,True,False],[True,False,False],[False,False,True]])
>>> a[b]
array([2, 4, 9])
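
If you start from the index list Y rather than from a mask, the mask can be built by comparing column numbers against Y. The following is a minimal sketch, not part of the original answer; the names `X`, `Y`, `mask` and `selected` are illustrative. It relies on each row of the mask containing exactly one True, because boolean selection returns the matches in row-major order.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Y = np.array([1, 0, 2])

# Compare the column numbers (shape (3,)) with Y as a column (shape (3, 1)).
# Broadcasting gives a (3, 3) mask with a single True per row.
mask = np.arange(X.shape[1]) == Y[:, None]

# Boolean selection walks the mask row by row, so the result is one value per row.
selected = X[mask]
print(selected)
```

If a row could contain zero or several True values, the result would no longer line up one-to-one with the rows, so the integer-index form is the safer choice in that case.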

You can also generate the row indexes with arange and index with the rows and columns together, though depending on how you're generating your boolean array and what your code looks like, YMMV.

>>> a = np.array([[1,2,3], [4,5,6], [7,8,9]])
>>> a[np.arange(len(a)), [1,0,2]]
array([2, 4, 9])

Hope that helps, let me know if you've got any more questions.

Answer 2

Answered by Ashwini Chaudhary

You can do something like this:

In [7]: a = np.array([[1, 2, 3],
   ...: [4, 5, 6],
   ...: [7, 8, 9]])

In [8]: lst = [1, 0, 2]

In [9]: a[np.arange(len(a)), lst]
Out[9]: array([2, 4, 9])
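
The indexing above works because NumPy pairs the two index arrays element by element, so the result contains a[0, 1], a[1, 0] and a[2, 2]. Here is a small sketch of that pairing, using the question's names `X` and `Y`; the variables `rows`, `flat` and `column` are illustrative. It also shows one way to get the column-shaped result the question asked for.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Y = [1, 0, 2]

rows = np.arange(X.shape[0])      # row indexes 0, 1, 2

# rows and Y are paired element by element: (0, 1), (1, 0), (2, 2)
flat = X[rows, Y]                 # shape (3,)

# Add a trailing axis if a column vector is wanted, as in the question
column = flat[:, None]            # shape (3, 1)

print(flat)
print(column)
```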

More on indexing multi-dimensional arrays: http://docs.scipy.org/doc/numpy/user/basics.indexing.html#indexing-multi-dimensional-arrays

Answer 3

Answered by Kei Minagawa

You can do it with an iterator, like this:

np.fromiter((row[index] for row, index in zip(X, Y)), dtype=int)

Timing:

N = 1000
X = np.zeros(shape=(N, N))
Y = np.arange(N)

# @Ashwini Chaudhary
%timeit X[np.arange(len(X)), Y]
10000 loops, best of 3: 30.7 us per loop

#mine
%timeit np.fromiter((row[index] for row, index in zip(X, Y)), dtype=int)
1000 loops, best of 3: 1.15 ms per loop

#mine
%timeit np.diag(X.T[Y])
10 loops, best of 3: 20.8 ms per loop
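
The %timeit lines above only run inside IPython. As a rough sketch of how the same comparison could be reproduced in a plain script, the standard timeit module can be used instead. The function names `fancy`, `iterator` and `diagonal` are made up for this example, and the numbers will differ from those quoted above depending on machine and NumPy version.

```python
import timeit

import numpy as np

N = 1000
X = np.zeros(shape=(N, N))
Y = np.arange(N)


def fancy():
    # Integer-array indexing: pair each row index with its column index
    return X[np.arange(len(X)), Y]


def iterator():
    # Python-level loop over rows, collected with fromiter
    return np.fromiter((row[i] for row, i in zip(X, Y)), dtype=X.dtype, count=len(Y))


def diagonal():
    # Builds an N x N intermediate array, then keeps only its diagonal
    return np.diag(X.T[Y])


timings = {}
for name, func in {"fancy": fancy, "iterator": iterator, "diagonal": diagonal}.items():
    # Best of 3 repeats of 10 calls each, converted to seconds per call
    timings[name] = min(timeit.repeat(func, number=10, repeat=3)) / 10
    print(f"{name}: {timings[name] * 1e6:.1f} us per call")
```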

Answer 4

Answered by Dhaval Mayatra

A simple way might look like:

In [1]: a = np.array([[1, 2, 3],
   ...: [4, 5, 6],
   ...: [7, 8, 9]])

In [2]: y = [1, 0, 2]  #list of indices we want to select from matrix 'a'

range(a.shape[0]) produces the row indexes 0, 1, 2, which NumPy pairs element by element with y:

In [3]: a[range(a.shape[0]), y] #we're selecting y indices from every row
Out[3]: array([2, 4, 9])

Answer 5

Answered by Thomas Devoogdt

Another clever way is to first transpose the array and then index it with Y. Finally, take the diagonal; it always holds the right answer.

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
Y = np.array([1, 0, 2, 2])

np.diag(X.T[Y])

Step by step:

Original arrays:

>>> X
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

>>> Y
array([1, 0, 2, 2])

Transpose so that the columns of X become rows that Y can index.

>>> X.T
array([[ 1,  4,  7, 10],
       [ 2,  5,  8, 11],
       [ 3,  6,  9, 12]])

Get the rows in the order given by Y.

>>> X.T[Y]
array([[ 2,  5,  8, 11],
       [ 1,  4,  7, 10],
       [ 3,  6,  9, 12],
       [ 3,  6,  9, 12]])

The diagonal now holds the answer: element i of row i is X[i, Y[i]].

>>> np.diag(X.T[Y])
array([ 2,  4,  9, 12])
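
One thing worth keeping in mind with this approach is that X.T[Y] materialises one full row per entry of Y before the diagonal is taken, so for N rows the intermediate array is N x N. The sketch below, with the illustrative names `intermediate`, `via_diag` and `direct`, checks that shape and compares the result with the arange-based indexing from the other answers.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
Y = np.array([1, 0, 2, 2])

# One row of length len(X) for every entry in Y: a 4 x 4 intermediate here
intermediate = X.T[Y]
via_diag = np.diag(intermediate)

# Same values, without building the intermediate array
direct = X[np.arange(len(X)), Y]

print(intermediate.shape)
print(via_diag, direct)
```

This is consistent with the timings in Answer 3, where the diagonal version is the slowest of the three.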

Answer 6

Answered by hpaulj

Recent NumPy versions have added take_along_axis (and put_along_axis), which do this indexing cleanly.

In [101]: a = np.arange(1,10).reshape(3,3)
In [102]: b = np.array([1,0,2])
In [103]: np.take_along_axis(a, b[:,None], axis=1)
Out[103]:
array([[2],
       [4],
       [9]])

It operates in the same way as:

In [104]: a[np.arange(3), b]
Out[104]: array([2, 4, 9])

but with different axis handling. It's especially aimed at applying the results of argsort and argmax.
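
To illustrate the argsort and argmax use case, here is a minimal sketch that assumes a NumPy version providing take_along_axis and put_along_axis (1.15 or later). The array values and the names `idx`, `row_max`, `order`, `sorted_rows` and `zeroed` are made up for this example.

```python
import numpy as np

a = np.array([[3, 1, 2],
              [9, 7, 8],
              [4, 6, 5]])

# argmax gives one column index per row; add an axis so it matches a's dimensions
idx = np.argmax(a, axis=1)
row_max = np.take_along_axis(a, idx[:, None], axis=1)    # shape (3, 1)

# argsort gives a full index array per row; applying it sorts each row
order = np.argsort(a, axis=1)
sorted_rows = np.take_along_axis(a, order, axis=1)

# put_along_axis writes values at the same positions, in place
zeroed = a.copy()
np.put_along_axis(zeroed, idx[:, None], 0, axis=1)

print(row_max.ravel())
print(sorted_rows)
print(zeroed)
```

The index array must have the same number of dimensions as the source array, which is why the one-dimensional argmax result gets a trailing axis before it is passed in.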
