python如何用零填充numpy数组

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/35751306/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-19 16:55:21  来源:igfitidea点击:

python how to pad numpy array with zeros

pythonarraysnumpypad

提问by user2015487

I want to know how I can pad a 2D numpy array with zeros using python 2.6.6 with numpy version 1.5.0. Sorry! But these are my limitations. Therefore I cannot use np.pad. For example, I want to pad awith zeros such that its shape matches b. The reason why I want to do this is so I can do:

我想知道如何使用 python 2.6.6 和 numpy 版本 1.5.0 用零填充二维 numpy 数组。对不起!但这些都是我的局限。因此我不能使用np.pad. 例如,我想a用零填充,使其形状匹配b. 我想这样做的原因是我可以这样做:

b-a

such that

以至于

>>> a
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])
>>> b
array([[ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.]])
>>> c
array([[1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0]])

The only way I can think of doing this is appending, however this seems pretty ugly. is there a cleaner solution possibly using b.shape?

我能想到的唯一方法是追加,但这看起来很丑陋。是否有可能使用更清洁的解决方案b.shape

Edit, Thank you to MSeiferts answer. I had to clean it up a bit, and this is what I got:

编辑,感谢 MSeiferts 的回答。我不得不清理一下,这就是我得到的:

def pad(array, reference_shape, offsets):
    """
    array: Array to be padded
    reference_shape: tuple of size of ndarray to create
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    will throw a ValueError if offsets is too big and the reference_shape cannot handle the offsets
    """

    # Create an array of zeros with the reference shape
    result = np.zeros(reference_shape)
    # Create a list of slices from offset to offset + shape in each dimension
    insertHere = [slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim)]
    # Insert the array in the result at the specified offsets
    result[insertHere] = array
    return result

回答by MSeifert

NumPy 1.7.0 (when numpy.padwas added) is pretty old now (it was released in 2013) so even though the question asked for a way without usingthat function I thought it could be useful to know how that could be achieved using numpy.pad.

NumPy 1.7.0(numpy.pad添加时)现在已经很老了(它于 2013 年发布)所以即使问题要求一种不使用该函数的方法,我认为知道如何使用numpy.pad.

It's actually pretty simple:

其实很简单:

>>> import numpy as np
>>> a = np.array([[ 1.,  1.,  1.,  1.,  1.],
...               [ 1.,  1.,  1.,  1.,  1.],
...               [ 1.,  1.,  1.,  1.,  1.]])
>>> np.pad(a, [(0, 1), (0, 1)], mode='constant')
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In this case I used that 0is the default value for mode='constant'. But it could also be specified by passing it in explicitly:

在这种情况下,我使用的0是默认值mode='constant'。但也可以通过显式传入来指定它:

>>> np.pad(a, [(0, 1), (0, 1)], mode='constant', constant_values=0)
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

Just in case the second argument ([(0, 1), (0, 1)]) seems confusing: Each list item (in this case tuple) corresponds to a dimension and item therein represents the padding before(first element) and after(second element). So:

以防万一第二个参数 ( [(0, 1), (0, 1)]) 看起来令人困惑:每个列表项(在本例中为元组)对应一个维度,其中的项表示之前(第一个元素)和之后(第二个元素)的填充。所以:

[(0, 1), (0, 1)]
         ^^^^^^------ padding for second dimension
 ^^^^^^-------------- padding for first dimension

  ^------------------ no padding at the beginning of the first axis
     ^--------------- pad with one "value" at the end of the first axis.

In this case the padding for the first and second axis are identical, so one could also just pass in the 2-tuple:

在这种情况下,第一个轴和第二个轴的填充是相同的,因此也可以只传入 2 元组:

>>> np.pad(a, (0, 1), mode='constant')
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In case the padding before and after is identical one could even omit the tuple (not applicable in this case though):

如果前后填充相同,甚至可以省略元组(但在这种情况下不适用):

>>> np.pad(a, 1, mode='constant')
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.]])

Or if the padding before and after is identical but different for the axis, you could also omit the second argument in the inner tuples:

或者,如果前后填充相同但轴不同,您也可以省略内部元组中的第二个参数:

>>> np.pad(a, [(1, ), (2, )], mode='constant')
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

However I tend to prefer to always use the explicit one, because it's just to easy to make mistakes (when NumPys expectations differ from your intentions):

但是我倾向于总是使用显式的,因为它很容易出错(当 NumPys 期望与您的意图不同时):

>>> np.pad(a, [1, 2], mode='constant')
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

Here NumPy thinks you wanted to pad all axis with 1 element before and 2 elements after each axis! Even if you intended it to pad with 1 element in axis 1 and 2 elements for axis 2.

在这里 NumPy 认为您想在每个轴之前用 1 个元素和在每个轴之后用 2 个元素填充所有轴!即使您打算在轴 1 中填充 1 个元素,在轴 2 中填充 2 个元素。

I used lists of tuples for the padding, note that this is just "my convention", you could also use lists of lists or tuples of tuples, or even tuples of arrays. NumPy just checks the length of the argument (or if it doesn't have a length) and the length of each item (or if it has a length)!

我使用元组列表进行填充,请注意,这只是“我的约定”,您还可以使用列表列表或元组元组,甚至是数组元组。NumPy 只是检查参数的长度(或者它是否没有长度)和每个项目的长度(或者它是否有长度)!

回答by MSeifert

Very simple, you create an array containing zeros using the reference shape:

非常简单,您可以使用参考形状创建一个包含零的数组:

result = np.zeros(b.shape)
# actually you can also use result = np.zeros_like(b) 
# but that also copies the dtype not only the shape

and then insert the array where you need it:

然后在需要的地方插入数组:

result[:a.shape[0],:a.shape[1]] = a

and voila you have padded it:

瞧,你已经填充了它:

print(result)
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])


You can also make it a bit more general if you define where your upper left element should be inserted

如果您定义应插入左上角元素的位置,您也可以使其更通用

result = np.zeros_like(b)
x_offset = 1  # 0 would be what you wanted
y_offset = 1  # 0 in your case
result[x_offset:a.shape[0]+x_offset,y_offset:a.shape[1]+y_offset] = a
result

array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.]])

but then be careful that you don't have offsets bigger than allowed. For x_offset = 2for example this will fail.

但要小心,你没有比允许的偏移量大。例如x_offset = 2,这将失败。



If you have an arbitary number of dimensions you can define a list of slices to insert the original array. I've found it interesting to play around a bit and created a padding function that can pad (with offset) an arbitary shaped array as long as the array and reference have the same number of dimensions and the offsets are not too big.

如果您有任意数量的维度,您可以定义一个切片列表来插入原始数组。我发现稍微玩一下很有趣,并创建了一个填充函数,只要数组和引用具有相同的维数并且偏移量不是太大,它就可以填充(带有偏移量)任意形状的数组。

def pad(array, reference, offsets):
    """
    array: Array to be padded
    reference: Reference array with the desired shape
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference.shape)
    # Create a list of slices from offset to offset + shape in each dimension
    insertHere = [slice(offset[dim], offset[dim] + array.shape[dim]) for dim in range(a.ndim)]
    # Insert the array in the result at the specified offsets
    result[insertHere] = a
    return result

And some test cases:

还有一些测试用例:

import numpy as np

# 1 Dimension
a = np.ones(2)
b = np.ones(5)
offset = [3]
pad(a, b, offset)

# 3 Dimensions

a = np.ones((3,3,3))
b = np.ones((5,4,3))
offset = [1,0,0]
pad(a, b, offset)

回答by Juan Leni

I understand that your main problem is that you need to calculate d=b-abut your arrays have different sizes. There is no need for an intermediate padded c

我知道您的主要问题是您需要计算d=b-a但您的数组有不同的大小。不需要中间垫c

You can solve this without padding:

您无需填充即可解决此问题:

import numpy as np

a = np.array([[ 1.,  1.,  1.,  1.,  1.],
              [ 1.,  1.,  1.,  1.,  1.],
              [ 1.,  1.,  1.,  1.,  1.]])

b = np.array([[ 3.,  3.,  3.,  3.,  3.,  3.],
              [ 3.,  3.,  3.,  3.,  3.,  3.],
              [ 3.,  3.,  3.,  3.,  3.,  3.],
              [ 3.,  3.,  3.,  3.,  3.,  3.]])

d = b.copy()
d[:a.shape[0],:a.shape[1]] -=  a

print d

Output:

输出:

[[ 2.  2.  2.  2.  2.  3.]
 [ 2.  2.  2.  2.  2.  3.]
 [ 2.  2.  2.  2.  2.  3.]
 [ 3.  3.  3.  3.  3.  3.]]

回答by Juan Leni

In case you need to add a fence of 1s to an array:

如果您需要向数组添加 1s 的栅栏:

>>> mat = np.zeros((4,4), np.int32)
>>> mat
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
>>> mat[0,:] = mat[:,0] = mat[:,-1] =  mat[-1,:] = 1
>>> mat
array([[1, 1, 1, 1],
       [1, 0, 0, 1],
       [1, 0, 0, 1],
       [1, 1, 1, 1]])