Original question: http://stackoverflow.com/questions/23911875/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Select certain rows (condition met), but only some columns in Python/Numpy
Asked by tim
I have a NumPy array with 4 columns and want to select columns 1, 3 and 4 for the rows where the value of the second column meets a certain condition (i.e. equals a fixed value). I first tried selecting only the rows, keeping all 4 columns, via:
I = A[A[:,1] == i]
which works. Then I tried the following (similar to MATLAB, which I know very well):
I = A[A[:,1] == i, [0,2,3]]
which doesn't work. How can I do this?
EXAMPLE DATA:
>>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
>>> print(A)
[[1 2 3 4]
 [6 1 3 4]
 [3 2 5 6]]
>>> i = 2
# I want columns 1, 3 and 4 of every row that has the value i in the second column.
# In this case, that is rows 1 and 3 with columns 1, 3 and 4:
[[1 3 4]
 [3 5 6]]
I'm currently using this:
I = A[A[:,1] == i]
I = I[:, [0,2,3]]
But I thought there had to be a nicer way of doing it... (I'm used to MATLAB)
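A short sketch of why the one-step attempt `A[A[:,1] == i, [0,2,3]]` fails, using the example data from the question. NumPy treats the boolean row mask like the integer indices of its True entries and then tries to pair those indices element by element with the column list, instead of forming a row-by-column grid as MATLAB does. The names `mask`, `rows` and `error_message` are only illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2

mask = A[:, 1] == i          # boolean row mask: [True, False, True]
rows = np.flatnonzero(mask)  # equivalent integer row indices: [0, 2]

try:
    A[mask, [0, 2, 3]]       # tries to pair 2 row indices with 3 column indices
    error_message = None
except IndexError as exc:
    # Index arrays of shapes (2,) and (3,) cannot be broadcast together.
    error_message = str(exc)

print(error_message)
```

With a mask that happened to select exactly three rows, the same expression would not raise, but it would return three individual elements rather than a sub-array.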
Accepted answer by John Zwinck
>>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
>>> a
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
>>> a[a[:,0] > 3] # select rows where first column is greater than 3
array([[ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
>>> a[a[:,0] > 3][:,np.array([True, True, False, True])] # select columns
array([[ 5,  6,  8],
       [ 9, 10, 12]])
# fancier equivalent of the previous
>>> a[np.ix_(a[:,0] > 3, np.array([True, True, False, True]))]
array([[ 5,  6,  8],
       [ 9, 10, 12]])
For an explanation of the obscure np.ix_(), see https://stackoverflow.com/a/13599843/4323
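To make np.ix_() a little less obscure, here is a minimal sketch of what it returns for the array `a` above. It builds one index array per axis, shaped so that they broadcast against each other into a grid. The names `row_idx`, `col_idx` and `result` are chosen here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# A boolean mask is converted to the positions of its True entries.
row_idx, col_idx = np.ix_(a[:, 0] > 3, [0, 1, 3])

print(row_idx.shape)  # (2, 1): a column of row indices
print(col_idx.shape)  # (1, 3): a row of column indices

# Broadcasting (2, 1) against (1, 3) gives a (2, 3) result:
# every selected row paired with every selected column.
result = a[row_idx, col_idx]
print(result)
```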
Finally, we can simplify by giving the list of column numbers instead of the tedious boolean mask:
>>> a[np.ix_(a[:,0] > 3, (0,1,3))]
array([[ 5,  6,  8],
       [ 9, 10, 12]])
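Applied to the data from the question, the same np.ix_() pattern might look like the sketch below. It reuses the asker's names `A`, `i` and `I`.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2

# Rows whose second column equals i, restricted to columns 0, 2 and 3.
I = A[np.ix_(A[:, 1] == i, [0, 2, 3])]
print(I)
```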
Answer by genclik27
This also works.
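# Note: i-1 doubles as the column index here, which matches the second column only because i == 2.
I = np.array([row[[x for x in range(A.shape[1]) if x != i-1]] for row in A if row[i-1] == i])
print(I)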
Edit: Since indexing starts from 0, i-1 should be used.
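A vectorized variation on the same idea, sketched under the assumption that the goal is "filter on one column, then drop that column". It keeps the column index separate from the value being matched. The name `col` is hypothetical and not part of the original answer, and np.delete is swapped in for the list comprehension.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2    # value to match
col = 1  # hypothetical name: index of the column to test and then drop

# Keep rows where column `col` equals i, then remove that column.
I = np.delete(A[A[:, col] == i], col, axis=1)
print(I)
```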
Answer by Taha
If you want to select the columns by index rather than with a boolean mask, you can write it this way:
A[:, [0, 2, 3]][A[:, 1] == i]
Going back to your example:
>>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
>>> print(A)
[[1 2 3 4]
 [6 1 3 4]
 [3 2 5 6]]
>>> i = 2
>>> print(A[:, [0, 2, 3]][A[:, 1] == i])
[[1 3 4]
 [3 5 6]]
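One caveat with chained indexing such as `A[:, [0, 2, 3]][A[:, 1] == i]`: it is fine for reading values, but the first step produces a copy, so assigning through the chain does not change the original array. A minimal sketch follows; the names `chained`, `single`, `mask` and `cols` are illustrative only.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
mask = A[:, 1] == 2
cols = [0, 2, 3]

chained = A.copy()
chained[mask][:, cols] = 0       # writes into a temporary copy; chained stays unchanged

single = A.copy()
single[np.ix_(mask, cols)] = 0   # a single indexing operation; single is modified

print(np.array_equal(chained, A))
print(single)
```

If the selection is only ever read, either form works; for assignment, the single-step np.ix_() form is the one that takes effect.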
Answer by 5up3rf1u0u5
I hope this answers your question. Here is a pattern I have used with pandas:
df_targetrows = df.loc[df[col2filter] <somecondition>, [col1,col2,...,coln]]
For example,
targets = stockdf.loc[stockdf['rtns'] > .04, ['symbol','date','rtns']]
This returns a DataFrame containing only the columns ['symbol','date','rtns'] from stockdf, restricted to the rows where the value of rtns satisfies stockdf['rtns'] > .04.
Hope this helps.
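A runnable sketch of the pandas pattern above. The answer does not show the contents of `stockdf`, so the data below, including the extra `volume` column, is made up purely for illustration.

```python
import pandas as pd

# Hypothetical data, invented for this example only.
stockdf = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC"],
    "date": ["2020-01-02", "2020-01-02", "2020-01-02"],
    "rtns": [0.05, 0.01, 0.07],
    "volume": [100, 200, 300],
})

# Row condition first, list of column labels second, in a single .loc call.
targets = stockdf.loc[stockdf["rtns"] > 0.04, ["symbol", "date", "rtns"]]
print(targets)
```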
Answer by Fayaz Ahmed
>>> a=np.array([[1,2,3], [1,3,4], [2,2,5]])
>>> a[a[:,0]==1][:,[0,1]]
array([[1, 2],
       [1, 3]])
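When the wanted columns are contiguous, as in this answer, a plain slice can be combined with the boolean row mask in a single indexing step, because a slice does not take part in index broadcasting. A small sketch using the same array; the names `one_step` and `two_step` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 3, 4], [2, 2, 5]])

one_step = a[a[:, 0] == 1, :2]          # boolean row mask combined with a slice
two_step = a[a[:, 0] == 1][:, [0, 1]]   # the chained form from the answer

print(one_step)
print(np.array_equal(one_step, two_step))
```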