Python: how to perform max/mean pooling on a 2D array using numpy
Original question: http://stackoverflow.com/questions/42463172/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by rapidclock
Given a 2D (M x N) matrix and a 2D kernel (K x L), how do I return a matrix that is the result of max or mean pooling over the image using the given kernel?
I'd like to use numpy if possible.
Note: M, N, K and L can each be even or odd, and they need not be evenly divisible by each other, e.g. a 7x5 matrix and a 2x2 kernel.
Example of max pooling:
matrix:
array([[  20,  200,   -5,   23],
       [ -13,  134,  119,  100],
       [ 120,   32,   49,   25],
       [-120,   12,    9,   23]])
kernel: 2 x 2
soln:
array([[200, 119],
       [120,  49]])
Answered by mdh
You could use block_reduce from scikit-image:
import numpy as np
import skimage.measure

a = np.array([
    [  20,  200,   -5,   23],
    [ -13,  134,  119,  100],
    [ 120,   32,   49,   25],
    [-120,   12,    9,   23]
])
skimage.measure.block_reduce(a, (2, 2), np.max)
Gives:
array([[200, 119],
       [120,  49]])
Answered by Elliot
If the image size is evenly divisible by the kernel size, you can reshape the array and use max or mean as you see fit:
import numpy as np

mat = np.array([[  20,  200,   -5,   23],
                [ -13,  134,  119,  100],
                [ 120,   32,   49,   25],
                [-120,   12,    9,   23]])
M, N = mat.shape
K = 2
L = 2
MK = M // K
NL = N // L
print(mat[:MK*K, :NL*L].reshape(MK, K, NL, L).max(axis=(1, 3)))
# [[200, 119], [120, 49]]
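The reshape trick works because splitting each axis into (number of blocks, block size) puts the elements of every K x L block on axes 1 and 3, so any reduction over those two axes pools the blocks. Below is a minimal sketch of that idea wrapped in a function; the name block_pool and its reducer parameter are hypothetical and not part of the original answer. Rows and columns that do not fill a whole block are dropped, as in the snippet above.

```python
import numpy as np

def block_pool(mat, K, L, reducer=np.max):
    """Pool non-overlapping K x L blocks of a 2D array.

    Hypothetical helper: trailing rows/columns that do not fill a
    whole block are dropped.
    """
    M, N = mat.shape
    MK, NL = M // K, N // L
    # Axis 1 runs over rows inside a block, axis 3 over columns inside a block.
    blocks = mat[:MK * K, :NL * L].reshape(MK, K, NL, L)
    return reducer(blocks, axis=(1, 3))

mat = np.array([[  20, 200,  -5,  23],
                [ -13, 134, 119, 100],
                [ 120,  32,  49,  25],
                [-120,  12,   9,  23]])

max_pooled = block_pool(mat, 2, 2, np.max)
mean_pooled = block_pool(mat, 2, 2, np.mean)
print(max_pooled)
print(mean_pooled)
```

Passing np.mean instead of np.max is all it takes to switch to mean pooling, since both accept a tuple of axes.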
If the image size is not evenly divisible by the kernel size, you'll have to handle the boundaries separately. (As pointed out in the comments, this results in the matrix being copied, which will affect performance.)
mat = np.array([[  20,  200,   -5,   23,    7],
                [ -13,  134,  119,  100,    8],
                [ 120,   32,   49,   25,   12],
                [-120,   12,    9,   23,   15],
                [ -57,   84,   19,   17,   82],
                ])
# soln
# [200, 119, 8]
# [120, 49, 15]
# [84, 19, 82]
M, N = mat.shape
K = 2
L = 2
MK = M // K
NL = N // L
# split the matrix into 'quadrants'
# (this version assumes exactly one leftover row and one leftover column,
# as happens with a 2x2 kernel on an odd-sized input)
Q1 = mat[:MK * K, :NL * L].reshape(MK, K, NL, L).max(axis=(1, 3))
Q2 = mat[MK * K:, :NL * L].reshape(-1, NL, L).max(axis=2)
Q3 = mat[:MK * K, NL * L:].reshape(MK, K, -1).max(axis=1)
Q4 = mat[MK * K:, NL * L:].max()
# compose the individual quadrants into one new matrix
soln = np.vstack([np.c_[Q1, Q3], np.c_[Q2, Q4]])
print(soln)
# [[200 119   8]
#  [120  49  15]
#  [ 84  19  82]]
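To cross-check the quadrant result, or to handle remainders of any size, a plain loop over block origins is a useful reference. The sketch below is not from the original answer and the name pool_reference is hypothetical. It relies on the fact that NumPy slices are clipped at the array boundary, so partial blocks at the bottom and right edges are pooled over whatever elements they contain. It is slower than the reshape approach and meant only as a correctness baseline.

```python
import numpy as np

def pool_reference(mat, K, L, reducer=np.max):
    """Loop-based pooling that keeps partial blocks at the edges.

    Hypothetical reference helper: slow, but valid for any M, N, K, L.
    """
    M, N = mat.shape
    rows = []
    for i in range(0, M, K):
        # Slices past the end of the array are clipped automatically.
        rows.append([reducer(mat[i:i + K, j:j + L]) for j in range(0, N, L)])
    return np.array(rows)

mat = np.array([[  20, 200,  -5,  23,  7],
                [ -13, 134, 119, 100,  8],
                [ 120,  32,  49,  25, 12],
                [-120,  12,   9,  23, 15],
                [ -57,  84,  19,  17, 82]])

ref = pool_reference(mat, 2, 2)
print(ref)
```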
Answered by Jason
Instead of making "quadrants" as in Elliot's answer, we could pad the array to make it evenly divisible, then perform either max or mean pooling.
As pooling is often used in CNNs, the input array is usually 3D, so I made a function that works on either 2D or 3D arrays.
import numpy

def pooling(mat, ksize, method='max', pad=False):
    '''Non-overlapping pooling on 2D or 3D data.

    <mat>: ndarray, input array to pool.
    <ksize>: tuple of 2, kernel size in (ky, kx).
    <method>: str, 'max' for max-pooling,
                   'mean' for mean-pooling.
    <pad>: bool, pad <mat> or not. If no pad, output has size
           n//f, n being <mat> size, f being kernel size.
           if pad, output has size ceil(n/f).

    Return <result>: pooled matrix.
    '''
    m, n = mat.shape[:2]
    ky, kx = ksize
    _ceil = lambda x, y: int(numpy.ceil(x / float(y)))
    if pad:
        # pad with NaN at the bottom/right so the shape divides evenly
        ny = _ceil(m, ky)
        nx = _ceil(n, kx)
        size = (ny*ky, nx*kx) + mat.shape[2:]
        mat_pad = numpy.full(size, numpy.nan)
        mat_pad[:m, :n, ...] = mat
    else:
        # drop the rows/columns that do not fill a whole kernel
        ny = m // ky
        nx = n // kx
        mat_pad = mat[:ny*ky, :nx*kx, ...]
    new_shape = (ny, ky, nx, kx) + mat.shape[2:]
    if method == 'max':
        result = numpy.nanmax(mat_pad.reshape(new_shape), axis=(1, 3))
    else:
        result = numpy.nanmean(mat_pad.reshape(new_shape), axis=(1, 3))
    return result
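The padding idea can be seen in isolation on the 5x5 example from the previous answer. The sketch below is a 2D-only illustration, not the function above: it swaps numpy.full for numpy.pad to build the NaN border, which is an assumption of this example. Because nanmax and nanmean ignore NaN, edge blocks are pooled over their real elements only, so the padding does not bias the mean.

```python
import numpy as np

mat = np.array([[  20, 200,  -5,  23,  7],
                [ -13, 134, 119, 100,  8],
                [ 120,  32,  49,  25, 12],
                [-120,  12,   9,  23, 15],
                [ -57,  84,  19,  17, 82]])
ky, kx = 2, 2
m, n = mat.shape
ny, nx = -(-m // ky), -(-n // kx)   # ceil(m / ky), ceil(n / kx)

# NaN needs a float array; pad only at the bottom and right edges.
padded = np.pad(mat.astype(float),
                ((0, ny * ky - m), (0, nx * kx - n)),
                mode="constant", constant_values=np.nan)
blocks = padded.reshape(ny, ky, nx, kx)

max_pooled = np.nanmax(blocks, axis=(1, 3))
mean_pooled = np.nanmean(blocks, axis=(1, 3))
print(max_pooled)
print(mean_pooled)
```

Note that the result is a float array even for integer input, since NaN has no integer representation.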
Sometimes you may want to perform overlapping pooling, with a stride not equal to the kernel size. Here is a function that does that, with or without padding:
import numpy

def asStride(arr, sub_shape, stride):
    '''Get a strided sub-matrices view of an ndarray.
    See also skimage.util.shape.view_as_windows()
    '''
    s0, s1 = arr.strides[:2]
    m1, n1 = arr.shape[:2]
    m2, n2 = sub_shape
    view_shape = (1+(m1-m2)//stride[0], 1+(n1-n2)//stride[1], m2, n2) + arr.shape[2:]
    strides = (stride[0]*s0, stride[1]*s1, s0, s1) + arr.strides[2:]
    subs = numpy.lib.stride_tricks.as_strided(arr, view_shape, strides=strides)
    return subs

def poolingOverlap(mat, ksize, stride=None, method='max', pad=False):
    '''Overlapping pooling on 2D or 3D data.

    <mat>: ndarray, input array to pool.
    <ksize>: tuple of 2, kernel size in (ky, kx).
    <stride>: tuple of 2 or None, stride of pooling window.
              If None, same as <ksize> (non-overlapping pooling).
    <method>: str, 'max' for max-pooling,
                   'mean' for mean-pooling.
    <pad>: bool, pad <mat> or not. If no pad, output has size
           (n-f)//s+1, n being <mat> size, f being kernel size, s stride.
           if pad, output has size ceil(n/s).

    Return <result>: pooled matrix.
    '''
    m, n = mat.shape[:2]
    ky, kx = ksize
    if stride is None:
        stride = (ky, kx)
    sy, sx = stride
    _ceil = lambda x, y: int(numpy.ceil(x / float(y)))
    if pad:
        # pad with NaN so that ceil(n/s) windows fit along each axis
        ny = _ceil(m, sy)
        nx = _ceil(n, sx)
        size = ((ny-1)*sy+ky, (nx-1)*sx+kx) + mat.shape[2:]
        mat_pad = numpy.full(size, numpy.nan)
        mat_pad[:m, :n, ...] = mat
    else:
        # crop to the largest region fully covered by windows
        mat_pad = mat[:(m-ky)//sy*sy+ky, :(n-kx)//sx*sx+kx, ...]
    view = asStride(mat_pad, ksize, stride)
    if method == 'max':
        result = numpy.nanmax(view, axis=(2, 3))
    else:
        result = numpy.nanmean(view, axis=(2, 3))
    return result
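For the unpadded 2D case, the same windowed view can be obtained from numpy.lib.stride_tricks.sliding_window_view, which returns a read-only view and checks the window shape for you. This is a sketch under the assumption that NumPy 1.20 or newer is available (the function was added in that release); pool_overlap is a hypothetical name and not part of the answer above. The stride is applied by slicing the window grid, which yields the same (n-f)//s+1 output size as the unpadded branch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pool_overlap(mat, ksize, stride, reducer=np.max):
    """Overlapping pooling on a 2D array without padding (hypothetical helper)."""
    ky, kx = ksize
    sy, sx = stride
    # All windows at stride 1: shape (M - ky + 1, N - kx + 1, ky, kx).
    windows = sliding_window_view(mat, (ky, kx))
    # Keep every sy-th / sx-th window to apply the stride.
    windows = windows[::sy, ::sx]
    return reducer(windows, axis=(2, 3))

mat = np.array([[  20, 200,  -5,  23],
                [ -13, 134, 119, 100],
                [ 120,  32,  49,  25],
                [-120,  12,   9,  23]])

overlapping = pool_overlap(mat, (2, 2), (1, 1))
non_overlapping = pool_overlap(mat, (2, 2), (2, 2))
print(overlapping)
print(non_overlapping)
```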
Answered by gebbissimo
Since the numpy documentation says to use "numpy.lib.stride_tricks.as_strided" with "extreme care", here is another solution for 2D/3D pooling without it.
If strides=1, it results in using same padding. For strides>1, I am not 100% sure about how same padding is defined...
import numpy as np

def pool3D(arr,
           kernel=(2, 2, 2),
           stride=(1, 1, 1),
           func=np.nanmax,
           ):
    # check inputs
    assert arr.ndim == 3
    assert len(kernel) == 3
    # create array with lots of padding around it, from which we grab stuff (could be more efficient, yes)
    arr_padded_shape = arr.shape + 2 * np.array(kernel)
    arr_padded = np.zeros(arr_padded_shape, dtype=arr.dtype) * np.nan
    arr_padded[
        kernel[0]:kernel[0] + arr.shape[0],
        kernel[1]:kernel[1] + arr.shape[1],
        kernel[2]:kernel[2] + arr.shape[2],
    ] = arr
    # create temporary array, which aggregates kernel elements in last axis
    size_x = 1 + (arr.shape[0]-1) // stride[0]
    size_y = 1 + (arr.shape[1]-1) // stride[1]
    size_z = 1 + (arr.shape[2]-1) // stride[2]
    size_kernel = np.prod(kernel)
    # use the padded (floating-point) dtype so NaN padding survives for integer input
    arr_tmp = np.empty((size_x, size_y, size_z, size_kernel), dtype=arr_padded.dtype)
    # fill temporary array
    kx_center = (kernel[0] - 1) // 2
    ky_center = (kernel[1] - 1) // 2
    kz_center = (kernel[2] - 1) // 2
    idx_kernel = 0
    for kx in range(kernel[0]):
        dx = kernel[0] + kx - kx_center
        for ky in range(kernel[1]):
            dy = kernel[1] + ky - ky_center
            for kz in range(kernel[2]):
                dz = kernel[2] + kz - kz_center
                arr_tmp[:, :, :, idx_kernel] = arr_padded[
                    dx:dx + arr.shape[0]:stride[0],
                    dy:dy + arr.shape[1]:stride[1],
                    dz:dz + arr.shape[2]:stride[2],
                ]
                idx_kernel += 1
    # perform pool function
    arr_final = func(arr_tmp, axis=-1)
    return arr_final

def pool2D(arr,
           kernel=(2, 2),
           stride=(1, 1),
           func=np.nanmax,
           ):
    # check inputs
    assert arr.ndim == 2
    assert len(kernel) == 2
    # transform into 3D array with an extra axis of length 1
    arr3D = arr[..., np.newaxis]
    kernel3D = kernel + (1,)
    stride3D = stride + (1,)
    arr3D_final = pool3D(arr3D, kernel3D, stride3D, func)
    arr2D_final = arr3D_final[:, :, 0]
    return arr2D_final
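On the open question of how same padding is defined for strides>1: one common convention, the one TensorFlow documents for padding='SAME', fixes the output length at ceil(n/s) and splits the required padding between both ends, with any odd element going to the end. The sketch below computes those amounts along one axis; same_padding_1d is a hypothetical helper, and this convention is not necessarily what pool3D above implements (for example, for n=4, kernel 3, stride 2 the window placement differs).

```python
import math

def same_padding_1d(n, k, s):
    """Return (output_length, pad_before, pad_after) for one axis.

    Hypothetical helper following the TensorFlow 'SAME' convention:
    n is the input length, k the kernel size, s the stride.
    """
    out = math.ceil(n / s)
    total = max((out - 1) * s + k - n, 0)
    before = total // 2
    after = total - before   # the extra element, if any, goes at the end
    return out, before, after

print(same_padding_1d(5, 2, 1))
print(same_padding_1d(5, 3, 2))
print(same_padding_1d(7, 2, 2))
```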