What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of Python TensorFlow?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/37674306/
What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?
Asked by karl_TUM
What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?
In my opinion, 'VALID' means there will be no zero padding outside the edges when we do max pool.
According to A guide to convolution arithmetic for deep learning, there will be no padding in the pool operator, i.e. just use 'VALID' in tensorflow. But what is 'SAME' padding for max pool in tensorflow?
Accepted answer by Olivier Moindrot
I'll give an example to make it clearer:
- x: input image of shape [2, 3], 1 channel
- valid_pad: max pool with 2x2 kernel, stride 2 and VALID padding
- same_pad: max pool with 2x2 kernel, stride 2 and SAME padding (this is the classic way to go)
The output shapes are:
- valid_pad: here, no padding, so the output shape is [1, 1]
- same_pad: here, we pad the image to the shape [2, 4] (with -inf and then apply max pool), so the output shape is [1, 2]
x = tf.constant([[1., 2., 3.],
[4., 5., 6.]])
x = tf.reshape(x, [1, 2, 3, 1]) # give a shape accepted by tf.nn.max_pool
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
valid_pad.get_shape() == [1, 1, 1, 1] # valid_pad is [5.]
same_pad.get_shape() == [1, 1, 2, 1] # same_pad is [5., 6.]
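To see the -inf padding concretely, the SAME result can be reproduced by hand with tf.pad followed by a VALID pool. A minimal sketch, reusing the x from above:

import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])
x = tf.reshape(x, [1, 2, 3, 1])

# Pad one extra column on the right with -inf (a value that can never
# win a max), then run a plain VALID pool over the padded input.
padded = tf.pad(x, [[0, 0], [0, 0], [0, 1], [0, 0]],
                constant_values=float('-inf'))
manual_same = tf.nn.max_pool(padded, [1, 2, 2, 1], [1, 2, 2, 1],
                             padding='VALID')
# manual_same has shape [1, 1, 2, 1] with values [5., 6.],
# matching same_pad above.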
Answered by MiniQuark
If you like ASCII art:

"VALID" = without padding:

   inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                  |________________|                dropped
                                 |_________________|

"SAME" = with zero padding:

               pad|                                      |pad
   inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
               |________________|
                              |_________________|
                                             |________________|
In this example:
- Input width = 13
- Filter width = 6
- Stride = 5
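A quick check of the resulting window counts, using the output-size formulas (plain Python, no TensorFlow needed):

import math

in_width, filter_width, stride = 13, 6, 5

# VALID: only full windows inside the input count; 12 and 13 are dropped.
valid_out = (in_width - filter_width) // stride + 1   # -> 2

# SAME: enough padding is added so that every input is covered.
same_out = math.ceil(in_width / stride)               # -> 3
print(valid_out, same_out)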
Notes:
- "VALID" only ever drops the right-most columns (or bottom-most rows).
- "SAME" tries to pad evenly left and right, but if the number of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
Edit:
About the name:
- With "SAME" padding, if you use a stride of 1, the layer's outputs will have the same spatial dimensions as its inputs.
- With "VALID" padding, there are no "made-up" padding inputs. The layer only uses valid input data.
Answered by YvesgereY
When stride is 1 (more typical with convolution than pooling), we can think of the following distinction:
- "SAME": output size is the same as the input size. This requires the filter window to slip outside the input map, hence the need to pad.
- "VALID": the filter window stays at valid positions inside the input map, so the output size shrinks by filter_size - 1. No padding occurs.
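A small sketch illustrating the stride-1 case with the max-pool API used elsewhere on this page (the 5x5 input size is an arbitrary choice):

import tensorflow as tf

x = tf.reshape(tf.range(25, dtype=tf.float32), [1, 5, 5, 1])

same = tf.nn.max_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], padding='SAME')
valid = tf.nn.max_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], padding='VALID')

print(same.get_shape())   # (1, 5, 5, 1) -- same size as the input
print(valid.get_shape())  # (1, 3, 3, 1) -- shrinks by filter_size - 1 = 2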
Answered by RoyaumeIX
The TensorFlow Convolution example gives an overview of the difference between SAME and VALID:
For the SAME padding, the output height and width are computed as:

out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
And
For the VALID padding, the output height and width are computed as:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
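Those two formulas translate directly into a couple of helper functions; a sketch, where the names out_size_same and out_size_valid are mine, not TensorFlow's:

import math

def out_size_same(in_size, stride):
    # SAME: the output size depends only on input size and stride.
    return math.ceil(in_size / stride)

def out_size_valid(in_size, filter_size, stride):
    # VALID: the window must fit entirely inside the input.
    return math.ceil((in_size - filter_size + 1) / stride)

# Width 3, 2x2 kernel, stride 2 (the accepted answer's example):
print(out_size_same(3, 2))      # 2
print(out_size_valid(3, 2, 2))  # 1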
Answered by Salvador Dali
Padding is an operation to increase the size of the input data. In the case of 1-dimensional data you just append/prepend the array with a constant; in 2 dimensions you surround the matrix with these constants. In n dimensions you surround your n-dimensional hypercube with the constant. In most cases this constant is zero and it is called zero-padding.
Here is an example of zero-padding with p=1 applied to a 2-D tensor:
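For example, in numpy:

import numpy as np

x = np.array([[1, 2],
              [3, 4]])

# p = 1: one row/column of the constant (zero) on every side.
padded = np.pad(x, pad_width=1, mode='constant', constant_values=0)
print(padded)
# [[0 0 0 0]
#  [0 1 2 0]
#  [0 3 4 0]
#  [0 0 0 0]]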
You can use arbitrary padding for your kernel, but some padding values are used more frequently than others:
- VALID padding. The easiest case: it means no padding at all. Just leave your data the same as it was.
- SAME padding, sometimes called HALF padding. It is called SAME because for a convolution with stride=1 (or for pooling) it should produce output of the same size as the input. It is called HALF because for a kernel of size k the padding is k/2 (rounded down) on each side.
- FULL padding is the maximum padding which does not result in a convolution over just padded elements. For a kernel of size k, this padding is equal to k - 1.
To use arbitrary padding in TF, you can use tf.pad()
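For instance, a sketch of FULL padding for a 2x2 kernel built with tf.pad (p = k - 1 = 1 on each side of height and width):

import tensorflow as tf

x = tf.reshape(tf.range(9, dtype=tf.float32), [1, 3, 3, 1])

# paddings lists [before, after] per dimension (batch, height, width, channel).
full = tf.pad(x, [[0, 0], [1, 1], [1, 1], [0, 0]], constant_values=0.)
print(full.get_shape())  # (1, 5, 5, 1)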
Answered by Shital Shah
Quick Explanation
VALID: Don't apply any padding, i.e., assume that all dimensions are valid so that the input image fully gets covered by the filter and stride you specified.
SAME: Apply padding to the input (if needed) so that the input image gets fully covered by the filter and stride you specified. For stride 1, this will ensure that the output image size is the same as the input.
Notes
- This applies to conv layers as well as max pool layers in the same way.
- The term "valid" is a bit of a misnomer, because things don't become "invalid" if you drop part of the image. Sometimes you might even want that. This should probably have been called NO_PADDING instead.
- The term "same" is a misnomer too, because it only makes sense for a stride of 1, when the output dimensions are the same as the input dimensions. For a stride of 2, output dimensions will be half, for example (see the snippet after these notes). This should probably have been called AUTO_PADDING instead.
- In SAME (i.e. auto-pad mode), TensorFlow will try to spread the padding evenly on both left and right.
- In VALID (i.e. no padding mode), TensorFlow will drop right and/or bottom cells if your filter and stride don't fully cover the input image.
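A sketch showing that SAME with stride 2 halves the spatial dimensions rather than keeping them the same (the 10x10 input is an assumed example):

import tensorflow as tf

x = tf.zeros([1, 10, 10, 1])

out = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
print(out.get_shape())  # (1, 5, 5, 1) -- ceil(10 / 2) = 5, not 10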
Answered by Vaibhav Dixit
I am quoting this answer from the official tensorflow docs https://www.tensorflow.org/api_guides/python/nn#Convolution. For the 'SAME' padding, the output height and width are computed as:
out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
and the padding on the top and left are computed as:
pad_along_height = max((out_height - 1) * strides[1] +
filter_height - in_height, 0)
pad_along_width = max((out_width - 1) * strides[2] +
filter_width - in_width, 0)
pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left
For the 'VALID' padding, the output height and width are computed as:
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
and the padding values are always zero.
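Putting those formulas together as a plain-Python helper (a sketch; same_padding is my name for it, and strides follows the NHWC layout [1, s_h, s_w, 1] used above):

import math

def same_padding(in_height, in_width, filter_height, filter_width, strides):
    out_height = math.ceil(in_height / strides[1])
    out_width = math.ceil(in_width / strides[2])
    pad_along_height = max((out_height - 1) * strides[1] +
                           filter_height - in_height, 0)
    pad_along_width = max((out_width - 1) * strides[2] +
                          filter_width - in_width, 0)
    pad_top = pad_along_height // 2
    pad_bottom = pad_along_height - pad_top
    pad_left = pad_along_width // 2
    pad_right = pad_along_width - pad_left
    return pad_top, pad_bottom, pad_left, pad_right

# 4x3 input, 2x2 kernel, stride 2 (the example used in other answers here):
print(same_padding(4, 3, 2, 2, [1, 2, 2, 1]))  # (0, 0, 0, 1)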
Answered by Change-the-world
There are three choices of padding: valid (no padding), same (or half), full. You can find explanations (in Theano) here: http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html
- Valid or no padding:
The valid padding involves no zero padding, so it covers only the valid input, not including artificially generated zeros. The length of the output is ((length of input) - (k - 1)) for kernel size k if the stride is s=1.
- Same or half padding:
The same padding makes the size of the outputs the same as that of the inputs when s=1. If s=1, the number of zeros padded is (k-1).
- Full padding:
The full padding means that the kernel runs over the whole input, so near the ends the kernel may meet just one input element, with zeros everywhere else. The number of zeros padded is 2(k-1) if s=1. The length of the output is ((length of input) + (k-1)) if s=1.
Therefore, the number of paddings: (valid) <= (same) <= (full)
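With kernel size k and stride 1, the three options give output lengths that are easy to check by hand (a sketch with an assumed input length of 10):

n, k = 10, 3  # input length and kernel size, stride 1

valid_out = n - (k - 1)  # 8:  no padding
same_out  = n            # 10: k - 1 zeros padded in total
full_out  = n + (k - 1)  # 12: k - 1 zeros padded on each side
print(valid_out, same_out, full_out)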
Answered by GPrathap
VALID padding: this is with zero amount of padding, i.e. no padding at all. Hope there is no confusion.
x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
print (valid_pad.get_shape()) # output-->(1, 2, 1, 1)
SAME padding: this is kind of tricky to understand in the first place, because we have to consider two conditions separately, as mentioned in the official docs.
Let's take the input as n_i, the output as n_o, the padding as p, the stride as s and the kernel size as k (only a single dimension is considered).
Case 01: n_i mod s = 0: p = max(k - s, 0)

Case 02: n_i mod s ≠ 0: p = max(k - (n_i mod s), 0)
Here p is calculated as the minimum total value which can be taken for padding. Since the value of p is known, the value of n_o can be found using the formula n_o = (n_i - k + p) / s + 1.
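The two cases as a small helper (a sketch; same_pad_1d is my name, not a TF function):

def same_pad_1d(n_i, k, s):
    # Minimum total 'SAME' padding along one dimension.
    if n_i % s == 0:
        return max(k - s, 0)
    return max(k - (n_i % s), 0)

print(same_pad_1d(3, 2, 2))  # 1 (horizontal direction of the example below)
print(same_pad_1d(4, 2, 2))  # 0 (vertical direction)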
Let's work out this example:
x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
print (same_pad.get_shape()) # --> output (1, 2, 2, 1)
Here the dimension of x is (3, 4). If the horizontal direction (3) is taken: n_i = 3, k = 2, s = 2. Since 3 mod 2 = 1 ≠ 0, p = max(2 - 1, 0) = 1 and n_o = (3 - 2 + 1) / 2 + 1 = 2.
If the vertical direction (4) is taken: n_i = 4, k = 2, s = 2. Since 4 mod 2 = 0, p = max(2 - 2, 0) = 0 and n_o = (4 - 2 + 0) / 2 + 1 = 2.
Hope this helps to understand how SAME padding actually works in TF.
Answered by Laine Mikael
Padding on/off. Determines the effective size of your input.
VALID: No padding. Convolution and other ops are only performed at locations that are "valid", i.e. not too close to the borders of your tensor. With a kernel of 3x3 and an image of 10x10, you would be performing convolution on the 8x8 area inside the borders.
SAME: Padding is provided. Whenever your operation references a neighborhood (no matter how big), zero values are supplied when that neighborhood extends outside the original tensor, allowing the operation to work on border values as well. With a kernel of 3x3 and an image of 10x10, you would be performing convolution on the full 10x10 area.
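A sketch of those two 10x10 cases with tf.nn.conv2d (the all-zeros tensors are placeholders, just to show the shapes):

import tensorflow as tf

image = tf.zeros([1, 10, 10, 1])   # batch, height, width, channels
kernel = tf.zeros([3, 3, 1, 1])    # height, width, in-channels, out-channels

valid = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding='VALID')
same = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding='SAME')

print(valid.get_shape())  # (1, 8, 8, 1)   -- the inner 8x8 area
print(same.get_shape())   # (1, 10, 10, 1) -- the full 10x10 area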