What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of Python TensorFlow?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/37674306/
What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?
Asked by karl_TUM
What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?
In my opinion, 'VALID' means there will be no zero padding outside the edges when we do max pool.
According to A guide to convolution arithmetic for deep learning, there will be no padding in the pool operator, i.e. just use 'VALID' in tensorflow. But what is 'SAME' padding for max pool in tensorflow?
Accepted answer by Olivier Moindrot
I'll give an example to make it clearer:
- x: input image of shape [2, 3], 1 channel
- valid_pad: max pool with 2x2 kernel, stride 2 and VALID padding
- same_pad: max pool with 2x2 kernel, stride 2 and SAME padding (this is the classic way to go)
The output shapes are:
- valid_pad: here, no padding, so the output shape is [1, 1]
- same_pad: here, we pad the image to the shape [2, 4] (with -inf and then apply max pool), so the output shape is [1, 2]
x = tf.constant([[1., 2., 3.],
[4., 5., 6.]])
x = tf.reshape(x, [1, 2, 3, 1]) # give a shape accepted by tf.nn.max_pool
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
valid_pad.get_shape() == [1, 1, 1, 1] # valid_pad is [5.]
same_pad.get_shape() == [1, 1, 2, 1] # same_pad is [5., 6.]
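To see the -inf padding concretely, the SAME result can be reproduced by hand with tf.pad followed by a VALID pool. A minimal sketch, reusing the x from above:

import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])
x = tf.reshape(x, [1, 2, 3, 1])

# Pad one extra column on the right with -inf (a value that can never
# win a max), then run a plain VALID pool over the padded input.
padded = tf.pad(x, [[0, 0], [0, 0], [0, 1], [0, 0]],
                constant_values=float('-inf'))
manual_same = tf.nn.max_pool(padded, [1, 2, 2, 1], [1, 2, 2, 1],
                             padding='VALID')
# manual_same has shape [1, 1, 2, 1] with values [5., 6.],
# matching same_pad above.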
Answered by MiniQuark
If you like ASCII art:

"VALID" = without padding:

   inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                  |________________|                dropped
                                 |_________________|

"SAME" = with zero padding:

               pad|                                      |pad
   inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
               |________________|
                              |_________________|
                                             |________________|
In this example:
- Input width = 13
- Filter width = 6
- Stride = 5
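A quick check of the resulting window counts, using the output-size formulas (plain Python, no TensorFlow needed):

import math

in_width, filter_width, stride = 13, 6, 5

# VALID: only full windows inside the input count; 12 and 13 are dropped.
valid_out = (in_width - filter_width) // stride + 1   # -> 2

# SAME: enough padding is added so that every input is covered.
same_out = math.ceil(in_width / stride)               # -> 3
print(valid_out, same_out)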
Notes:
- "VALID" only ever drops the right-most columns (or bottom-most rows).
- "SAME" tries to pad evenly left and right, but if the number of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
Edit:
About the name:
- With "SAME" padding, if you use a stride of 1, the layer's outputs will have the same spatial dimensions as its inputs.
- With "VALID" padding, there are no "made-up" padding inputs. The layer only uses valid input data.
Answered by YvesgereY
When stride is 1 (more typical with convolution than pooling), we can think of the following distinction:
- "SAME": output size is the same as the input size. This requires the filter window to slip outside the input map, hence the need to pad.
- "VALID": the filter window stays at valid positions inside the input map, so the output size shrinks by filter_size - 1. No padding occurs.
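A small sketch illustrating the stride-1 case with the max-pool API used elsewhere on this page (the 5x5 input size is an arbitrary choice):

import tensorflow as tf

x = tf.reshape(tf.range(25, dtype=tf.float32), [1, 5, 5, 1])

same = tf.nn.max_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], padding='SAME')
valid = tf.nn.max_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], padding='VALID')

print(same.get_shape())   # (1, 5, 5, 1) -- same size as the input
print(valid.get_shape())  # (1, 3, 3, 1) -- shrinks by filter_size - 1 = 2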
Answered by RoyaumeIX
The TensorFlow Convolution example gives an overview of the difference between SAME and VALID:
For the SAME padding, the output height and width are computed as:

out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
And
For the VALID padding, the output height and width are computed as:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
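Those two formulas translate directly into a couple of helper functions; a sketch, where the names out_size_same and out_size_valid are mine, not TensorFlow's:

import math

def out_size_same(in_size, stride):
    # SAME: the output size depends only on input size and stride.
    return math.ceil(in_size / stride)

def out_size_valid(in_size, filter_size, stride):
    # VALID: the window must fit entirely inside the input.
    return math.ceil((in_size - filter_size + 1) / stride)

# Width 3, 2x2 kernel, stride 2 (the accepted answer's example):
print(out_size_same(3, 2))      # 2
print(out_size_valid(3, 2, 2))  # 1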
Answered by Salvador Dali
Padding is an operation to increase the size of the input data. In the case of 1-dimensional data you just append/prepend the array with a constant; in 2 dimensions you surround the matrix with these constants. In n dimensions you surround your n-dimensional hypercube with the constant. In most cases this constant is zero and it is called zero-padding.
Here is an example of zero-padding with p=1 applied to a 2-D tensor:
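For example, in numpy:

import numpy as np

x = np.array([[1, 2],
              [3, 4]])

# p = 1: one row/column of the constant (zero) on every side.
padded = np.pad(x, pad_width=1, mode='constant', constant_values=0)
print(padded)
# [[0 0 0 0]
#  [0 1 2 0]
#  [0 3 4 0]
#  [0 0 0 0]]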
You can use arbitrary padding for your kernel, but some padding values are used more frequently than others:
- VALID padding. The easiest case: it means no padding at all. Just leave your data the same as it was.
- SAME padding, sometimes called HALF padding. It is called SAME because for a convolution with stride=1 (or for pooling) it should produce output of the same size as the input. It is called HALF because for a kernel of size k the padding is k/2 (rounded down) on each side.
- FULL padding is the maximum padding which does not result in a convolution over just padded elements. For a kernel of size k, this padding is equal to k - 1.
To use arbitrary padding in TF, you can use tf.pad()
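For instance, a sketch of FULL padding for a 2x2 kernel built with tf.pad (p = k - 1 = 1 on each side of height and width):

import tensorflow as tf

x = tf.reshape(tf.range(9, dtype=tf.float32), [1, 3, 3, 1])

# paddings lists [before, after] per dimension (batch, height, width, channel).
full = tf.pad(x, [[0, 0], [1, 1], [1, 1], [0, 0]], constant_values=0.)
print(full.get_shape())  # (1, 5, 5, 1)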
Answered by Shital Shah
Quick Explanation
VALID: Don't apply any padding, i.e., assume that all dimensions are valid so that the input image fully gets covered by the filter and stride you specified.
SAME: Apply padding to the input (if needed) so that the input image gets fully covered by the filter and stride you specified. For stride 1, this will ensure that the output image size is the same as the input.
Notes
- This applies to conv layers as well as max pool layers in the same way.
- The term "valid" is a bit of a misnomer, because things don't become "invalid" if you drop part of the image. Sometimes you might even want that. This should probably have been called NO_PADDING instead.
- The term "same" is a misnomer too, because it only makes sense for a stride of 1, when the output dimensions are the same as the input dimensions. For a stride of 2, output dimensions will be half, for example (see the snippet after these notes). This should probably have been called AUTO_PADDING instead.
- In SAME (i.e. auto-pad mode), TensorFlow will try to spread the padding evenly on both left and right.
- In VALID (i.e. no padding mode), TensorFlow will drop right and/or bottom cells if your filter and stride don't fully cover the input image.
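A sketch showing that SAME with stride 2 halves the spatial dimensions rather than keeping them the same (the 10x10 input is an assumed example):

import tensorflow as tf

x = tf.zeros([1, 10, 10, 1])

out = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
print(out.get_shape())  # (1, 5, 5, 1) -- ceil(10 / 2) = 5, not 10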
Answered by Vaibhav Dixit
I am quoting this answer from the official tensorflow docs https://www.tensorflow.org/api_guides/python/nn#Convolution. For the 'SAME' padding, the output height and width are computed as:
out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
and the padding on the top and left are computed as:
pad_along_height = max((out_height - 1) * strides[1] +
filter_height - in_height, 0)
pad_along_width = max((out_width - 1) * strides[2] +
filter_width - in_width, 0)
pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left
For the 'VALID' padding, the output height and width are computed as:
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
and the padding values are always zero.
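Putting those formulas together as a plain-Python helper (a sketch; same_padding is my name for it, and strides follows the NHWC layout [1, s_h, s_w, 1] used above):

import math

def same_padding(in_height, in_width, filter_height, filter_width, strides):
    out_height = math.ceil(in_height / strides[1])
    out_width = math.ceil(in_width / strides[2])
    pad_along_height = max((out_height - 1) * strides[1] +
                           filter_height - in_height, 0)
    pad_along_width = max((out_width - 1) * strides[2] +
                          filter_width - in_width, 0)
    pad_top = pad_along_height // 2
    pad_bottom = pad_along_height - pad_top
    pad_left = pad_along_width // 2
    pad_right = pad_along_width - pad_left
    return pad_top, pad_bottom, pad_left, pad_right

# 4x3 input, 2x2 kernel, stride 2 (the example used in other answers here):
print(same_padding(4, 3, 2, 2, [1, 2, 2, 1]))  # (0, 0, 0, 1)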
Answered by Change-the-world
There are three choices of padding: valid (no padding), same (or half), full. You can find explanations (in Theano) here: http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html
- Valid or no padding:
The valid padding involves no zero padding, so it covers only the valid input, not including artificially generated zeros. The length of the output is ((length of input) - (k - 1)) for kernel size k if the stride is s=1.
- Same or half padding:
The same padding makes the size of the outputs the same as that of the inputs when s=1. If s=1, the number of zeros padded is (k-1).
- Full padding:
The full padding means that the kernel runs over the whole input, so near the ends the kernel may meet just one input element, with zeros everywhere else. The number of zeros padded is 2(k-1) if s=1. The length of the output is ((length of input) + (k-1)) if s=1.
Therefore, the number of paddings: (valid) <= (same) <= (full)
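With kernel size k and stride 1, the three options give output lengths that are easy to check by hand (a sketch with an assumed input length of 10):

n, k = 10, 3  # input length and kernel size, stride 1

valid_out = n - (k - 1)  # 8:  no padding
same_out  = n            # 10: k - 1 zeros padded in total
full_out  = n + (k - 1)  # 12: k - 1 zeros padded on each side
print(valid_out, same_out, full_out)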
Answered by GPrathap
VALID padding: this is with zero amount of padding, i.e. no padding at all. Hope there is no confusion.
x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
print (valid_pad.get_shape()) # output-->(1, 2, 1, 1)
SAME padding: this is kind of tricky to understand in the first place, because we have to consider two conditions separately, as mentioned in the official docs.
Let's take the input as n_i, the output as n_o, the padding as p, the stride as s and the kernel size as k (only a single dimension is considered).
Case 01: n_i mod s = 0: p = max(k - s, 0)

Case 02: n_i mod s ≠ 0: p = max(k - (n_i mod s), 0)
Here p is calculated as the minimum total value which can be taken for padding. Since the value of p is known, the value of n_o can be found using the formula n_o = (n_i - k + p) / s + 1.
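The two cases as a small helper (a sketch; same_pad_1d is my name, not a TF function):

def same_pad_1d(n_i, k, s):
    # Minimum total 'SAME' padding along one dimension.
    if n_i % s == 0:
        return max(k - s, 0)
    return max(k - (n_i % s), 0)

print(same_pad_1d(3, 2, 2))  # 1 (horizontal direction of the example below)
print(same_pad_1d(4, 2, 2))  # 0 (vertical direction)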
Let's work out this example:
x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
print (same_pad.get_shape()) # --> output (1, 2, 2, 1)
Here the dimension of x is (3, 4). If the horizontal direction (3) is taken: n_i = 3, k = 2, s = 2. Since 3 mod 2 = 1 ≠ 0, p = max(2 - 1, 0) = 1 and n_o = (3 - 2 + 1) / 2 + 1 = 2.
If the vertical direction (4) is taken: n_i = 4, k = 2, s = 2. Since 4 mod 2 = 0, p = max(2 - 2, 0) = 0 and n_o = (4 - 2 + 0) / 2 + 1 = 2.
Hope this helps to understand how SAME padding actually works in TF.
Answered by Laine Mikael
Padding on/off. Determines the effective size of your input.
VALID: No padding. Convolution and other ops are only performed at locations that are "valid", i.e. not too close to the borders of your tensor. With a kernel of 3x3 and an image of 10x10, you would be performing convolution on the 8x8 area inside the borders.
SAME: Padding is provided. Whenever your operation references a neighborhood (no matter how big), zero values are supplied when that neighborhood extends outside the original tensor, allowing the operation to work on border values as well. With a kernel of 3x3 and an image of 10x10, you would be performing convolution on the full 10x10 area.
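A sketch of those two 10x10 cases with tf.nn.conv2d (the all-zeros tensors are placeholders, just to show the shapes):

import tensorflow as tf

image = tf.zeros([1, 10, 10, 1])   # batch, height, width, channels
kernel = tf.zeros([3, 3, 1, 1])    # height, width, in-channels, out-channels

valid = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding='VALID')
same = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding='SAME')

print(valid.get_shape())  # (1, 8, 8, 1)   -- the inner 8x8 area
print(same.get_shape())   # (1, 10, 10, 1) -- the full 10x10 area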