Select certain rows (condition is met), but only some columns in Python/NumPy

Question

Asked by tim

I have a NumPy array with 4 columns and want to select columns 1, 3 and 4 for the rows where the value of the second column meets a certain condition (i.e. equals a fixed value). I first tried selecting only the rows, keeping all 4 columns, via:

I = A[A[:,1] == i]

which works. Then I further tried (similarly to MATLAB, which I know very well):

I = A[A[:,1] == i, [0,2,3]]

which doesn't work. How can I do this?

EXAMPLE DATA:

 >>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
 >>> print(A)
 [[1 2 3 4]
  [6 1 3 4]
  [3 2 5 6]]
 >>> i = 2

 # I want to get the columns 1, 3 and 4 for every row which has the value i in the second column. In this case, this would be row 1 and 3 with columns 1, 3 and 4:
 [[1 3 4]
  [3 5 6]]

I'm currently using this:

I = A[A[:,1] == i]
I = I[:, [0,2,3]]

But I thought there had to be a nicer way of doing it... (I'm used to MATLAB)

Answer 1

Accepted answer by John Zwinck

>>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
>>> a
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

>>> a[a[:,0] > 3] # select rows where first column is greater than 3
array([[ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

>>> a[a[:,0] > 3][:,np.array([True, True, False, True])] # select columns
array([[ 5,  6,  8],
       [ 9, 10, 12]])

# fancier equivalent of the previous
>>> a[np.ix_(a[:,0] > 3, np.array([True, True, False, True]))]
array([[ 5,  6,  8],
       [ 9, 10, 12]])

For an explanation of the obscure np.ix_(), see https://stackoverflow.com/a/13599843/4323

Finally, we can simplify by giving the list of column numbers instead of the tedious boolean mask:

>>> a[np.ix_(a[:,0] > 3, (0,1,3))]
array([[ 5,  6,  8],
       [ 9, 10, 12]])
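
Applied to the data from the question, the same idea might look like the sketch below. It also shows why the one-step attempt `A[A[:,1] == i, [0,2,3]]` fails: NumPy turns the boolean mask into the row indices `[0, 2]` and then tries to pair them element-wise with the three column indices, instead of forming a grid of rows and columns. The helper name `select` is hypothetical and only there for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2
cols = [0, 2, 3]


def select(arr, mask, columns):
    """Hypothetical helper: rows where mask is True, restricted to columns."""
    # np.ix_ builds an open mesh, so every selected row is combined
    # with every selected column.
    return arr[np.ix_(mask, columns)]


result = select(A, A[:, 1] == i, cols)
print(result)

# The one-step attempt from the question: index arrays of shapes (2,) and
# (3,) cannot be broadcast together, so NumPy raises an IndexError.
try:
    A[A[:, 1] == i, cols]
except IndexError as exc:
    print("IndexError:", exc)
```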

Answer 2

Answer by genclik27

This also works.

I = np.array([row[[x for x in range(A.shape[1]) if x != i-1]] for row in A if row[i-1] == i])
print(I)

Edit: Since indexing starts from 0, i-1 should be used as the column index.
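
Note that the comprehension uses `i` both as the value to match and (as `i-1`) as the column index, which only lines up because `i = 2` in the example. A minimal sketch that keeps the two apart, assuming a hypothetical variable `col` for the zero-based column to test:

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2    # value to look for
col = 1  # zero-based index of the column to test (name is an assumption)

# Every column except the one being tested.
keep = [c for c in range(A.shape[1]) if c != col]

# Keep only the rows whose tested column equals i, dropping that column.
I = np.array([row[keep] for row in A if row[col] == i])
print(I)
```

This stays a plain Python loop over rows, so for large arrays the vectorized indexing forms in the other answers are likely to be faster.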

Answer 3

Answer by Taha

If you want to select the columns by index rather than with a boolean mask, you can write it this way:

A[:, [0, 2, 3]][A[:, 1] == i]

Going back to your example:

>>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
>>> print(A)
[[1 2 3 4]
 [6 1 3 4]
 [3 2 5 6]]
>>> i = 2
>>> print(A[:, [0, 2, 3]][A[:, 1] == i])
[[1 3 4]
 [3 5 6]]
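
As a quick check, the order of the two indexing steps should not matter here, because the row mask is computed from the original `A` in both cases. The sketch below compares the two orders on the example data; the variable names are illustrative only.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2

# Columns first, then rows (the form used in this answer).
cols_then_rows = A[:, [0, 2, 3]][A[:, 1] == i]

# Rows first, then columns (the form used in the question).
rows_then_cols = A[A[:, 1] == i][:, [0, 2, 3]]

same = np.array_equal(cols_then_rows, rows_then_cols)
print(same)
```

Both forms build an intermediate array before the second indexing step, since indexing with a list or a boolean mask returns a copy rather than a view.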

Answer 4

Answer by 5up3rf1u0u5

I hope this answers your question. A piece of script I have implemented using pandas is:

df_targetrows = df.loc[<some condition on df[col2filter]>, [col1, col2, ..., coln]]

For example,

targets = stockdf.loc[stockdf['rtns'] > .04, ['symbol','date','rtns']]

This returns a DataFrame containing only the columns ['symbol','date','rtns'] from stockdf, restricted to the rows where the value of rtns satisfies stockdf['rtns'] > .04
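
For the array from the question, the same pattern could be sketched as follows. The column labels `c1` to `c4` are made up for illustration, since the original array has no column names.

```python
import numpy as np
import pandas as pd

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2

# Hypothetical column labels; the original array is unlabeled.
df = pd.DataFrame(A, columns=["c1", "c2", "c3", "c4"])

# Row condition first, list of wanted columns second.
targets = df.loc[df["c2"] == i, ["c1", "c3", "c4"]]
print(targets)

# Convert back to a plain NumPy array if needed.
result = targets.to_numpy()
```

Unlike the NumPy attempt in the question, `.loc` accepts a boolean row mask and a list of column labels together in one step.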

Hope this helps.

Answer 5

Answer by Fayaz Ahmed

>>> a=np.array([[1,2,3], [1,3,4], [2,2,5]])
>>> a[a[:,0]==1][:,[0,1]]
array([[1, 2],
       [1, 3]])
>>>
