Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/33747908/

Date: 2020-08-19 13:54:17  Source: igfitidea

output of numpy.where(condition) is not an array, but a tuple of arrays: why?

Tags: python, arrays, numpy

Asked by Fabio

I am experimenting with the numpy.where(condition[, x, y]) function.
From the numpy documentation, I learn that if you give just one array as input, it should return the indices where the array is non-zero (i.e. "True"):

If only condition is given, return the tuple condition.nonzero(), the indices where condition is True.

But if I try it, it returns a tuple of two elements, where the first is the wanted list of indices, and the second is a null element:

>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> np.where(array>4)
(array([4, 5, 6, 7, 8]),) # notice the comma before the last parenthesis

So the question is: why? What is the purpose of this behaviour? In what situation is this useful? Indeed, to get the wanted list of indices I have to add the indexing, as in np.where(array>4)[0], which seems... "ugly".

ADDENDUM

I understand (from some answers) that it is actually a tuple of just one element. Still I don't understand why to give the output in this way. To illustrate how this is not ideal, consider the following error (which motivated my question in the first place):

>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> pippo = np.where(array>4)
>>> pippo + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "int") to tuple

so that you need to do some indexing to access the actual array of indices:

>>> pippo[0] + 1
array([5, 6, 7, 8, 9])

Accepted answer by hpaulj

In Python (1) means just 1. () can be freely added to group numbers and expressions for human readability (e.g. (1+3)*3 vs. (1+3,)*3). Thus to denote a 1 element tuple it uses (1,) (and requires you to use it as well).
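
A quick sketch of that distinction, using the same two expressions from the sentence above (the variable names here are made up for illustration):

```python
# Parentheses alone only group an expression; the trailing comma is what
# makes a tuple.
grouped = (1 + 3) * 3      # plain integer arithmetic
repeated = (1 + 3,) * 3    # one-element tuple (4,) repeated three times

print(grouped)             # 12
print(repeated)            # (4, 4, 4)
print(type((1)).__name__, type((1,)).__name__)   # int tuple
```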

Thus

(array([4, 5, 6, 7, 8]),)

is a one element tuple, that element being an array.

If you applied where to a 2d array, the result would be a 2 element tuple.

The result of where is such that it can be plugged directly into an indexing slot, e.g.

a[np.where(a>0)]
a[a>0]

should return the same things

as would

I,J = np.where(a>0)   # a is 2d
a[I,J]
a[(I,J)]
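
To make the 2d case concrete, here is a small runnable sketch. The array values are invented for illustration and are not from the original answer; the point is only that all three indexing forms pick out the same elements.

```python
import numpy as np

# Example 2d array (values chosen arbitrarily for this sketch)
a = np.array([[1, -2, 3],
              [-4, 5, -6]])

idx = np.where(a > 0)     # tuple of two arrays: (row indices, column indices)
I, J = idx                # unpack the 2-element tuple

via_where = a[idx]        # plug the tuple straight into the indexing slot
via_mask = a[a > 0]       # boolean-mask indexing
via_unpacked = a[I, J]    # same as a[(I, J)]

print(len(idx))           # 2
print(I, J)               # [0 0 1] [0 2 1]
print(via_where)          # [1 3 5]
```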

Or with your example:

In [278]: a=np.array([1,2,3,4,5,6,7,8,9])
In [279]: np.where(a>4)
Out[279]: (array([4, 5, 6, 7, 8], dtype=int32),)  # tuple

In [280]: a[np.where(a>4)]
Out[280]: array([5, 6, 7, 8, 9])

In [281]: I=np.where(a>4)
In [282]: I
Out[282]: (array([4, 5, 6, 7, 8], dtype=int32),)
In [283]: a[I]
Out[283]: array([5, 6, 7, 8, 9])

In [286]: i, = np.where(a>4)   # note the , on LHS
In [287]: i
Out[287]: array([4, 5, 6, 7, 8], dtype=int32)  # not tuple
In [288]: a[i]
Out[288]: array([5, 6, 7, 8, 9])
In [289]: a[(i,)]
Out[289]: array([5, 6, 7, 8, 9])

======================

np.flatnonzero shows the correct way of returning just one array, regardless of the dimensions of the input array.

In [299]: np.flatnonzero(a>4)
Out[299]: array([4, 5, 6, 7, 8], dtype=int32)
In [300]: np.flatnonzero(a>4)+10
Out[300]: array([14, 15, 16, 17, 18], dtype=int32)

Its doc says:

This is equivalent to a.ravel().nonzero()[0]

In fact that is literally what the function does.

Flattening a removes the question of what to do with multiple dimensions. It then takes the result out of the tuple, giving you a plain array. With flattening it doesn't have to make a special case for 1d arrays.
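
As a sketch of what that means for a 2d input (the array `b` is a made-up example, and the `np.unravel_index` step is an extra not mentioned in the answer, added only to show how to get back to 2d coordinates if you need them):

```python
import numpy as np

# Hypothetical 2d array for illustration
b = np.array([[0, 7, 0],
              [8, 0, 9]])

flat = np.flatnonzero(b)            # a single 1d array of flat indices
manual = b.ravel().nonzero()[0]     # the equivalent spelled out by hand

# Optional: map the flat indices back to (row, column) coordinates
rows, cols = np.unravel_index(flat, b.shape)

print(flat)          # [1 3 5]
print(rows, cols)    # [0 1 1] [1 0 2]
```

Since the result is a plain array, arithmetic such as `flat + 10` works without any indexing into a tuple.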

===========================

@Divakar suggests np.argwhere:

In [303]: np.argwhere(a>4)
Out[303]: 
array([[4],
       [5],
       [6],
       [7],
       [8]], dtype=int32)

which does np.transpose(np.where(a>4))

Or if you don't like the column vector, you could transpose it again

In [307]: np.argwhere(a>4).T
Out[307]: array([[4, 5, 6, 7, 8]], dtype=int32)

except now it is a 1xn array.
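
The column layout of argwhere makes more sense on a 2d input, where each row of the result is one (row, column) coordinate pair. A minimal sketch, with a made-up array `b` that is not from the original answer:

```python
import numpy as np

# Hypothetical 2d array for illustration
b = np.array([[0, 7, 0],
              [8, 0, 9]])

coords = np.argwhere(b > 0)              # shape (3, 2): one row per match
same = np.transpose(np.where(b > 0))     # the transpose-of-where form

print(coords.tolist())    # [[0, 1], [1, 0], [1, 2]]
print(coords.shape)       # (3, 2)
```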

We could just as well have wrapped where in array:

In [311]: np.array(np.where(a>4))
Out[311]: array([[4, 5, 6, 7, 8]], dtype=int32)

Lots of ways of taking an array out of the where tuple ([0], i,=, transpose, array, etc).
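
For comparison, here are those options side by side on the array from the question. The variable names are invented for this sketch; the main thing to notice is the shape each option gives you.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
res = np.where(a > 4)               # 1-element tuple

by_index = res[0]                   # shape (5,)
by_unpack, = res                    # shape (5,), note the comma on the LHS
by_flat = np.flatnonzero(a > 4)     # shape (5,)
as_row = np.array(res)              # shape (1, 5)
as_col = np.argwhere(a > 4)         # shape (5, 1)

for name, arr in [("res[0]", by_index), ("i, = res", by_unpack),
                  ("flatnonzero", by_flat), ("np.array(res)", as_row),
                  ("argwhere", as_col)]:
    print(name, arr.shape)
```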

Answer by jakevdp

Short answer: np.where is designed to have consistent output regardless of the dimension of the array.

A two-dimensional array has two indices, so the result of np.where is a length-2 tuple containing the relevant indices. This generalizes to a length-3 tuple for 3 dimensions, a length-4 tuple for 4 dimensions, or a length-N tuple for N dimensions. By this rule, it is clear that in 1 dimension, the result should be a length-1 tuple.
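
This rule is easy to check directly. The sketch below uses throwaway arrays of ones (an assumption made purely for the demonstration) and records the tuple length for 1, 2 and 3 dimensions:

```python
import numpy as np

# Map number of dimensions -> length of the tuple returned by np.where
lengths = {}
for ndim in (1, 2, 3):
    arr = np.ones((2,) * ndim)          # shape (2,), (2, 2), (2, 2, 2)
    lengths[ndim] = len(np.where(arr > 0))

print(lengths)    # {1: 1, 2: 2, 3: 3}
```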

Answer by Panagiotis Simakis

Just use the np.asarray function. In your case:

>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> pippo = np.asarray(np.where(array>4))
>>> pippo + 1
array([[5, 6, 7, 8, 9]])