Python: setting null values in a numpy array

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/27527947/

Time: 2020-08-19 01:55:40  Source: igfitidea

setting null values in a numpy array

python, arrays, numpy

Asked by idem

How do I null certain values in a numpy array based on a condition? I don't understand why I end up with 0 instead of null or empty values where the condition is not met. b is a numpy array populated with 0 and 1 values, and c is another fully populated numpy array. All arrays are 71x71x166.

a = np.empty(((71,71,166)))
d = np.empty(((71,71,166)))
for indexes, value in np.ndenumerate(b):
    i,j,k = indexes
    a[i,j,k] = np.where(b[i,j,k] == 1, c[i,j,k], d[i,j,k])

I want to end up with an array that only has values where the condition is met and is empty everywhere else, without changing its shape.

FULL ISSUE FOR CLARIFICATION, as asked for:
I start with a float-populated array with shape (71,71,166).
I make an int array based on a cutoff applied to the float array, basically creating a number of bins, roughly marking out 10 areas within the array with 0 values in between.
What I want to end up with is an array with shape (71,71,166) which holds the average values, along a particular array direction (say the vertical direction, if you think of a 3D array as a 3D cube), of a certain "bin".
So I was trying to loop through the "bins" (b == 1, b == 2, etc.), sampling the float array where that condition is met and leaving it null elsewhere so I can take the average, and then recombine everything into one array at the end of the loop.
Not sure if I'm making myself understood. I'm using np.where with explicit indexing because I keep getting errors when I try to do it without, although it feels very inefficient.

Answered by idem

np.empty sometimes fills the array with 0s; the contents of an array returned by empty() are undefined, so 0 is perfectly valid. For example, try this instead:

d = np.nan * np.empty((71, 71, 166))
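
As a possible alternative (not part of the original answer), np.full builds the NaN-filled array directly instead of multiplying the uninitialized contents of np.empty. A minimal sketch using the shape from the question; the name all_nan is only there for illustration:

```python
import numpy as np

# Build a float array of the question's shape with every entry set to NaN.
d = np.full((71, 71, 166), np.nan)

# NaN never compares equal to itself, so check with np.isnan rather than ==.
all_nan = bool(np.isnan(d).all())
```

Either form gives the same result; np.full just states the intent more plainly.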

But consider using numpy's strength, and don't iterate over the array:

a = np.where(b, c, d)

(Since b is 0 or 1, I've excluded the explicit comparison b == 1.)
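
To make the vectorized form concrete, here is a small sketch with made-up 2x2 stand-ins for b and c (the values are hypothetical). It also relies on np.where broadcasting a scalar, so np.nan can be passed directly in place of a separate NaN-filled array:

```python
import numpy as np

# Small stand-ins for the question's arrays (hypothetical values).
b = np.array([[1, 0], [0, 1]])              # condition: 1 = keep, 0 = blank out
c = np.array([[10.0, 20.0], [30.0, 40.0]])  # fully populated data

# Take c where b is 1 and NaN everywhere else; the shape is unchanged.
a = np.where(b == 1, c, np.nan)
```

The result has the same shape as c, with NaN in every position where b is 0.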

You may even want to consider using a masked array instead:

a = np.ma.masked_where(b != 1, c)  # masked_where hides entries where the condition is True, so mask where b is not 1

which seems to make more sense with respect to your question: "how do I null certain values in a numpy array based on a condition" (replace null with mask and you're done).
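
A short sketch of how a masked array behaves, using hypothetical 2x2 inputs. Note that np.ma.masked_where hides the entries where the condition is True, so keeping the values where b is 1 means masking where b != 1:

```python
import numpy as np

# Hypothetical small inputs.
b = np.array([[1, 0], [1, 1]])
c = np.array([[10.0, 20.0], [30.0, 40.0]])

# Mask (hide) every entry where b is not 1; the shape stays (2, 2).
a = np.ma.masked_where(b != 1, c)

# Reductions skip masked entries: column 0 averages 10 and 30,
# while column 1 only sees 40.
column_means = a.mean(axis=0)
```

Because the mask travels with the data, averages along an axis ignore the hidden entries without any NaN bookkeeping.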

Answered by jmilloy

Consider this example:

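import numpy as np
data = np.random.random((4, 3))
# random_integers is gone from current NumPy; randint's upper bound is exclusive
mask = np.random.randint(0, 2, (4, 3))
data[mask == 0] = np.nan  # np.NaN was removed in NumPy 2.0; use np.nan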

The data will be set to nan wherever the mask is 0. You can use any kind of condition you want, of course, or do something different for different values in b.

To erase everything except a specific bin, try the following:

c[b != 1] = np.nan

So, to make a copy of everything in a specific bin:

a = np.copy(c)
a[b != 1] = np.nan  # assignment (=), not comparison (==)

To get the average of everything in a bin:

np.mean(c[b==1])
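
As a quick check that the two approaches agree, the sketch below (with hypothetical 2x2 inputs) compares the boolean-indexing mean with np.nanmean applied to a NaN-filled copy:

```python
import numpy as np

# Hypothetical small inputs with two bins (1 and 2) and a gap (0).
b = np.array([[1, 0], [1, 2]])
c = np.array([[10.0, 20.0], [30.0, 40.0]])

# Route 1: boolean indexing returns a flat array of the selected values.
mean_indexed = np.mean(c[b == 1])

# Route 2: blank out the other entries on a copy, then ignore NaN.
a = np.copy(c)
a[b != 1] = np.nan
mean_nan = np.nanmean(a)
```

Boolean indexing flattens the selection, so it suits a whole-bin average; the NaN-filled copy keeps the original shape, which matters if the average is later taken along one axis.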

So perhaps this might do what you want (where bins is a list of bin values):

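a = np.empty(c.shape)
a[b == 0] = np.nan  # cells outside every bin become NaN
for bin_value in bins:  # renamed from bin to avoid shadowing the built-in
    # fill each bin's cells with that bin's overall mean
    a[b == bin_value] = np.mean(c[b == bin_value])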
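
The loop above uses one overall mean per bin. If the goal is instead the average along one direction within each bin, as the question's clarification seems to ask, a sketch along these lines might work. It is only one reading of the request: the inputs are hypothetical, and axis 0 is assumed to be the "vertical" direction.

```python
import warnings

import numpy as np

# Hypothetical small inputs; axis 0 plays the role of the "vertical" direction.
b = np.array([[1, 1], [1, 2], [0, 2]])
c = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
bins = [1, 2]

a = np.full(c.shape, np.nan)  # cells in no bin (b == 0) stay NaN
for bin_value in bins:
    in_bin = b == bin_value
    sampled = np.where(in_bin, c, np.nan)
    with warnings.catch_warnings():
        # A column with no member of this bin is all NaN; silence that warning.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        col_mean = np.nanmean(sampled, axis=0, keepdims=True)
    # Broadcast each column's mean back onto the cells belonging to the bin.
    a = np.where(in_bin, col_mean, a)
```

With these inputs, the bin-1 cells in column 0 receive the mean of 1 and 3, the bin-2 cells in column 1 receive the mean of 4 and 6, and the cell where b is 0 stays NaN.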