Python: deleting rows in a numpy array

Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original source: StackOverflow, http://stackoverflow.com/questions/3877491/

Date: 2020-08-18 13:08:44  Source: igfitidea

deleting rows in numpy array

Tags: python, numpy, delete-row

Asked by MedicalMath

I have an array that might look like this:

ANOVAInputMatrixValuesArray = [[ 0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222],
                               [ 0.78008333, 0.5938125, 0.481, 0.39883333, 0.]]

Notice that one of the rows has a zero value at the end. I want to delete any row that contains a zero, while keeping any row that contains non-zero values in all cells.

But the array will have different numbers of rows every time it is populated, and the zeros will be located in different rows each time.

I get the number of non-zero elements in each row with the following line of code:

NumNonzeroElementsInRows    = (ANOVAInputMatrixValuesArray != 0).sum(1)

For the array above, NumNonzeroElementsInRows contains: [5 4]

The five indicates that all possible values in row 0 are nonzero, while the four indicates that one of the possible values in row 1 is a zero.

Therefore, I am trying to use the following lines of code to find and delete rows that contain zero values.

for q in range(len(NumNonzeroElementsInRows)):
    if NumNonzeroElementsInRows[q] < NumNonzeroElementsInRows.max():
        p.delete(ANOVAInputMatrixValuesArray, q, axis=0)

But for some reason, this code does not seem to do anything, even though doing a lot of print commands indicates that all of the variables seem to be populating correctly leading up to the code.

There must be some easy way to simply "delete any row that contains a zero value."

Can anyone show me what code to write to accomplish this?

Answered by mtrw

This is similar to your original approach, and will use less space than unutbu's answer, but I suspect it will be slower.

>>> import numpy as np
>>> p = np.array([[1.5, 0], [1.4,1.5], [1.6, 0], [1.7, 1.8]])
>>> p
array([[ 1.5,  0. ],
       [ 1.4,  1.5],
       [ 1.6,  0. ],
       [ 1.7,  1.8]])
>>> nz = (p == 0).sum(1)
>>> q = p[nz == 0, :]
>>> q
array([[ 1.4,  1.5],
       [ 1.7,  1.8]])

By the way, your line p.delete() doesn't work for me: ndarrays don't have a .delete attribute.
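
A likely reason the loop in the question appears to do nothing, even when numpy is imported under a name that makes the call valid: numpy.delete is a module-level function that returns a new array and leaves its input untouched, so the result has to be assigned. A minimal sketch, assuming a small sample array (the names arr, shape_after_discard and trimmed are made up for illustration):

```python
import numpy as np

arr = np.array([[1.5, 0.0],
                [1.4, 1.5],
                [1.6, 0.0],
                [1.7, 1.8]])

# The return value is discarded here, so arr is unchanged.
np.delete(arr, 0, axis=0)
shape_after_discard = arr.shape

# Assigning the result keeps the array with row 0 removed.
trimmed = np.delete(arr, 0, axis=0)

# ndarray itself has no delete method.
has_method = hasattr(arr, "delete")
```

Deleting inside a loop also shifts the indices of the remaining rows, which is one more reason to build the full list of rows first and remove them in a single call.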

Answered by Justin Peel

Here's a one-liner (yes, it is similar to user333700's, but a little more straightforward):

>>> import numpy as np
>>> arr = np.array([[ 0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222],
...                 [ 0.78008333, 0.5938125, 0.481, 0.39883333, 0.]])
>>> arr[arr.all(1)]
array([[ 0.96488889,  0.73641667,  0.67521429,  0.592875  ,  0.53172222]])
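
To make the one-liner less magical, here is the same thing split into two steps. This is only an illustrative sketch; the names row_has_no_zero and filtered are not from the answer:

```python
import numpy as np

arr = np.array([[0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222],
                [0.78008333, 0.5938125, 0.481, 0.39883333, 0.0]])

# all(axis=1) treats nonzero values as True, so a row comes out True
# only when none of its elements is zero.
row_has_no_zero = arr.all(axis=1)

# Boolean indexing along the first axis keeps the rows marked True.
filtered = arr[row_has_no_zero]
```

The mask has one entry per row, so it scales to any number of rows without an explicit loop.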

By the way, this method is much, much faster than the masked array method for large matrices. For a 2048 x 5 matrix, this method is about 1000x faster.

By the way, user333700's method (from his comment) was slightly faster in my tests, though it boggles my mind why.

Answered by jeps

numpy provides a simple function to do the exact same thing: supposing you have a masked array 'a', calling numpy.ma.compress_rows(a) will delete the rows containing a masked value. I guess this is much faster this way...
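
For illustration, a small sketch of how that could look for the zero case, assuming the zeros are masked first with numpy.ma.masked_equal (the sample data and the names masked and result are made up here):

```python
import numpy as np

arr = np.array([[1.5, 0.0],
                [1.4, 1.5],
                [1.6, 0.0],
                [1.7, 1.8]])

# Mask every element equal to zero.
masked = np.ma.masked_equal(arr, 0)

# Drop each row that contains at least one masked element.
result = np.ma.compress_rows(masked)
```

Whether this is actually faster than plain boolean indexing is worth measuring on your own data rather than assuming.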

Answered by Jaidev Deshpande

The simplest way to delete rows and columns from arrays is the numpy.delete function.

Suppose I have the following array x:

import numpy

x = numpy.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

To delete the first row, do this:

x = numpy.delete(x, (0), axis=0)

To delete the third column, do this:

x = numpy.delete(x,(2), axis=1)

So you could find the indices of the rows which have a 0 in them, put them in a list or a tuple and pass this as the second argument of the function.
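
A minimal sketch of that last step, assuming the data from the question (the names rows_with_zero and cleaned are only illustrative):

```python
import numpy as np

arr = np.array([[0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222],
                [0.78008333, 0.5938125, 0.481, 0.39883333, 0.0]])

# Indices of the rows that contain at least one zero.
rows_with_zero = np.where((arr == 0).any(axis=1))[0]

# numpy.delete returns a new array, so the result must be assigned.
cleaned = np.delete(arr, rows_with_zero, axis=0)
```

Collecting all indices first and deleting once avoids the index shifting you would get from deleting rows one at a time in a loop.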

Answered by troymyname00

I might be too late to answer this question, but wanted to share my input for the benefit of the community. For this example, let me call your matrix 'ANOVA', and I am assuming you only want to remove the rows of this matrix that have a 0 in the 5th column.

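import numpy as np

# Collect the indices of rows whose 5th column (index 4) is exactly zero.
# Compare the value directly: int() would truncate values such as 0.53 to 0.
indx = []
for i in range(len(ANOVA)):
    if ANOVA[i, 4] == 0:
        indx.append(i)

# Keep only the rows whose index was not collected.
ANOVA = np.array([row for i, row in enumerate(ANOVA) if i not in indx])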
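
If only one column matters, the same filter can probably be written without a loop. A sketch under the assumption that ANOVA is a 2-D numpy array with at least five columns (the sample values and the names keep and ANOVA_filtered are hypothetical):

```python
import numpy as np

# Hypothetical sample data standing in for the ANOVA matrix.
ANOVA = np.array([[0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222],
                  [0.78008333, 0.5938125, 0.481, 0.39883333, 0.0]])

# True for every row whose 5th column (index 4) is nonzero.
keep = ANOVA[:, 4] != 0

# Boolean indexing keeps the rows marked True and preserves the 2-D shape.
ANOVA_filtered = ANOVA[keep]
```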

Answered by Prokhozhii

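import numpy as np
arr = np.array([[0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222],
                [0.78008333, 0.5938125, 0.481, 0.39883333, 0.]])
# np.where(arr != 0.) alone would index individual elements and flatten the result;
# reducing the mask along axis 1 keeps whole rows that contain no zero.
print(arr[np.where((arr != 0.).all(axis=1))])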