
Note: this page is a mirror of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me) on Stack Overflow. Original question: http://stackoverflow.com/questions/42479902/


How does the "view" method work in PyTorch?

Tags: python, memory, pytorch, torch, tensor

Asked by Wasi Ahmad

I am confused about the view() method in the following code snippet.


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool  = nn.MaxPool2d(2,2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1   = nn.Linear(16*5*5, 120)
        self.fc2   = nn.Linear(120, 84)
        self.fc3   = nn.Linear(84, 10)

    def forward(self, x):
        # with a 3 x 32 x 32 input (e.g. CIFAR-10):
        x = self.pool(F.relu(self.conv1(x)))  # -> (N, 6, 14, 14)
        x = self.pool(F.relu(self.conv2(x)))  # -> (N, 16, 5, 5)
        x = x.view(-1, 16*5*5)                # flatten to (N, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

My confusion is regarding the following line.


x = x.view(-1, 16*5*5)

What does the tensor.view() function do? I have seen its usage in many places, but I can't understand how it interprets its parameters.


What happens if I give negative values as parameters to the view() function? For example, what happens if I call tensor_variable.view(1, 1, -1)?


Can anyone explain the main principle of the view() function with some examples?


Answered by Kashyap

The view function is meant to reshape the tensor.


Say you have a tensor


import torch
a = torch.arange(1., 17.)  # torch.range(1, 16) is deprecated; this also gives 16 values, 1.0 to 16.0

a is a tensor that has 16 elements, from 1 to 16 (inclusive). If you want to reshape this tensor to make it a 4 x 4 tensor, you can use


a = a.view(4, 4)

Now a will be a 4 x 4 tensor. Note that after the reshape, the total number of elements must remain the same. Reshaping the tensor a to a 3 x 5 tensor would not be appropriate.


What is the meaning of the parameter -1?


If there is any situation where you don't know how many rows you want but are sure of the number of columns, you can specify this with a -1. (Note that you can extend this to tensors with more dimensions. Only one of the axis values can be -1.) This is a way of telling the library: "give me a tensor that has this many columns, and you compute the appropriate number of rows necessary to make this happen".


This can be seen in the neural network code given above. After the line x = self.pool(F.relu(self.conv2(x))) in the forward function, you will have a feature map with 16 channels, each 5 x 5. You have to flatten this to give it to the fully connected layer, so you tell PyTorch to reshape the tensor you obtained to have a specific number of columns, and to decide the number of rows by itself (see the sketch below).

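For instance, a minimal sketch with a made-up batch of four pooled feature maps:

import torch

x = torch.rand(4, 16, 5, 5)  # hypothetical batch of 4 feature maps of shape 16 x 5 x 5
x = x.view(-1, 16 * 5 * 5)   # PyTorch infers the row count: 4*16*5*5 / 400 = 4
print(x.shape)               # torch.Size([4, 400])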

Drawing a similarity between numpy and pytorch, view is similar to numpy's reshape function.

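A quick side-by-side sketch of that analogy:

import numpy as np
import torch

n = np.arange(16).reshape(4, 4)  # numpy's reshape
t = torch.arange(16).view(4, 4)  # the PyTorch counterpart
print(n.shape, tuple(t.shape))   # (4, 4) (4, 4)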

Answered by Jadiel de Armas

Let's do some examples, from simpler to more difficult.


  1. The view method returns a tensor with the same data as the self tensor (which means that the returned tensor has the same number of elements), but with a different shape. For example:

    a = torch.arange(1., 17.)  # a's shape is (16,); floats, matching the FloatTensor output below
    
    a.view(4, 4) # output below
      1   2   3   4
      5   6   7   8
      9  10  11  12
     13  14  15  16
    [torch.FloatTensor of size 4x4]
    
    a.view(2, 2, 4) # output below
    (0 ,.,.) = 
    1   2   3   4
    5   6   7   8
    
    (1 ,.,.) = 
     9  10  11  12
    13  14  15  16
    [torch.FloatTensor of size 2x2x4]
    
  2. Assuming that -1 is not one of the parameters, when you multiply them together, the result must equal the number of elements in the tensor. If you do a.view(3, 3), it will raise a RuntimeError, because a shape of (3 x 3) is invalid for an input with 16 elements. In other words: 3 x 3 equals 9, not 16.

  3. You can use -1 as one of the parameters that you pass to the function, but only once. All that happens is that the method will do the math for you on how to fill that dimension. For example, a.view(2, -1, 4) is equivalent to a.view(2, 2, 4). [16 / (2 x 4) = 2]

  4. Notice that the returned tensor shares the same data. If you make a change in the "view", you are changing the original tensor's data:

    b = a.view(4, 4)
    b[0, 2] = 2   # writes through to a's storage (flat index 2)
    a[2] == 3.0   # a[2] was 3.0 but is now 2.0
    False
    
  5. Now, for a more complex use case. The documentation says that each new view dimension must either be a subspace of an original dimension, or only span dimensions d, d + 1, ..., d + k that satisfy the following contiguity-like condition: for all i = 0, ..., k - 1, stride[i] = stride[i + 1] x size[i + 1]. Otherwise, contiguous() needs to be called before the tensor can be viewed. For example:

    a = torch.rand(5, 4, 3, 2) # size (5, 4, 3, 2)
    a_t = a.permute(0, 2, 3, 1) # size (5, 3, 2, 4)
    
    # The commented line below will raise a RuntimeError, because one dimension
    # spans across two contiguous subspaces
    # a_t.view(-1, 4)
    
    # instead do:
    a_t.contiguous().view(-1, 4)
    
    # To see why the first one does not work and the second does,
    # compare a.stride() and a_t.stride()
    a.stride() # (24, 6, 2, 1)
    a_t.stride() # (24, 2, 1, 6)
    

    Notice that for a_t, stride[0] != stride[1] x size[1], since 24 != 2 x 3.
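
A small helper sketch of that condition (can_view_flat is a made-up name) which checks whether all dimensions can be merged into one without copying:

import torch

def can_view_flat(t):
    # the documented condition: stride[i] == stride[i + 1] * size[i + 1]
    # for every adjacent pair of dimensions
    s, z = t.stride(), t.size()
    return all(s[i] == s[i + 1] * z[i + 1] for i in range(t.dim() - 1))

a = torch.rand(5, 4, 3, 2)
print(can_view_flat(a))                      # True
print(can_view_flat(a.permute(0, 2, 3, 1)))  # False -> call .contiguous() first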


Answered by kmario23

torch.Tensor.view()


Simply put, torch.Tensor.view(), which is inspired by numpy.ndarray.reshape() (or numpy.reshape()), creates a new view of the tensor, as long as the new shape is compatible with the shape of the original tensor.


Let's understand this in detail using a concrete example.


In [43]: t = torch.arange(18) 

In [44]: t 
Out[44]: 
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17])

With this tensor t of shape (18,), new views can only be created for the following shapes:


(1, 18) or equivalently (1, -1) or (-1, 18)
(2, 9)  or equivalently (2, -1) or (-1, 9)
(3, 6)  or equivalently (3, -1) or (-1, 6)
(6, 3)  or equivalently (6, -1) or (-1, 3)
(9, 2)  or equivalently (9, -1) or (-1, 2)
(18, 1) or equivalently (18, -1) or (-1, 1)


As we can already observe from the above shape tuples, the product of the elements of the shape tuple (e.g. 2*9, 3*6, etc.) must always equal the total number of elements in the original tensor (18 in our example).


Another thing to observe is that we used a -1 in one of the places in each of the shape tuples. By using a -1, we avoid doing the computation ourselves and instead delegate the task to PyTorch, which calculates that value for the shape when it creates the new view. One important thing to note is that we can only use a single -1 in the shape tuple. The remaining values must be supplied explicitly; otherwise PyTorch will complain by throwing a RuntimeError:


RuntimeError: only one dimension can be inferred


So, with all of the above-mentioned shapes, PyTorch will always return a new view of the original tensor t. This basically means that it just changes the stride information of the tensor for each of the new views that are requested.


Below are some examples illustrating how the strides of the tensors are changed with each new view.


# stride of our original tensor `t`
In [53]: t.stride() 
Out[53]: (1,)

Now, we will see the strides for the new views:


# shape (1, 18)
In [54]: t1 = t.view(1, -1)
# stride tensor `t1` with shape (1, 18)
In [55]: t1.stride() 
Out[55]: (18, 1)

# shape (2, 9)
In [56]: t2 = t.view(2, -1)
# stride of tensor `t2` with shape (2, 9)
In [57]: t2.stride()       
Out[57]: (9, 1)

# shape (3, 6)
In [59]: t3 = t.view(3, -1) 
# stride of tensor `t3` with shape (3, 6)
In [60]: t3.stride() 
Out[60]: (6, 1)

# shape (6, 3)
In [62]: t4 = t.view(6,-1)
# stride of tensor `t4` with shape (6, 3)
In [63]: t4.stride() 
Out[63]: (3, 1)

# shape (9, 2)
In [65]: t5 = t.view(9, -1) 
# stride of tensor `t5` with shape (9, 2)
In [66]: t5.stride()
Out[66]: (2, 1)

# shape (18, 1)
In [68]: t6 = t.view(18, -1)
# stride of tensor `t6` with shape (18, 1)
In [69]: t6.stride()
Out[69]: (1, 1)

So that's the magic of the view() function. It just changes the strides of the (original) tensor for each of the new views, as long as the shape of the new view is compatible with the original shape.

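One quick way to confirm that a view shares storage with the original tensor (a minimal sketch; data_ptr() reports where the underlying storage starts):

import torch

t = torch.arange(18)
t2 = t.view(2, 9)
print(t.data_ptr() == t2.data_ptr())  # True: same storage, only the metadata differs
print(t2.shape, t2.stride())          # torch.Size([2, 9]) (9, 1)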

Another interesting thing one might observe from the stride tuples is that the value of the element in the 0th position of the stride is equal to the value of the element in the 1st position of the shape tuple.


In [74]: t3.shape 
Out[74]: torch.Size([3, 6])
                        |
In [75]: t3.stride()    |
Out[75]: (6, 1)         |
          |_____________|

This is because:


In [76]: t3 
Out[76]: 
tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11],
        [12, 13, 14, 15, 16, 17]])

the stride (6, 1) says that to go from one element to the next element along the 0th dimension, we have to jump, or take 6 steps (i.e. to go from 0 to 6, one has to take 6 steps). But to go from one element to the next element along the 1st dimension, we need only one step (e.g. to go from 2 to 3).


Thus, the stride information is at the heart of how elements are accessed from memory when performing the computation.

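As a small sketch of that addressing rule, the flat memory offset of element [i, j] is i * stride[0] + j * stride[1]:

import torch

t3 = torch.arange(18).view(3, 6)
i, j = 2, 4
offset = i * t3.stride(0) + j * t3.stride(1)  # 2*6 + 4*1 = 16
print(t3[i, j] == t3.flatten()[offset])       # tensor(True)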



torch.reshape()


This function returns a view, and is exactly the same as using torch.Tensor.view(), as long as the new shape is compatible with the shape of the original tensor. Otherwise, it will return a copy.


However, the notes for torch.reshape() warn that:


contiguous inputs and inputs with compatible strides can be reshaped without copying, but one should not depend on the copying vs. viewing behavior.

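A small sketch of that caveat: for a non-contiguous tensor, reshape() succeeds where view() would raise a RuntimeError, but it does so by silently copying:

import torch

a = torch.arange(6).view(2, 3)
b = a.t()                            # transpose: non-contiguous, strides (1, 3)
c = b.reshape(-1)                    # b.view(-1) would fail here
print(c)                             # tensor([0, 3, 1, 4, 2, 5])
print(c.data_ptr() == b.data_ptr())  # False: reshape had to copy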

Answered by Jibin Mathew

weights.reshape(a, b) will return a new tensor with the same data as weights, with size (a, b); it may copy the data to another part of memory (it returns a view when it can and a copy otherwise).


weights.resize_(a, b) returns the same tensor with a different shape. However, if the new shape results in fewer elements than the original tensor, some elements will be removed from the tensor (but not from memory). If the new shape results in more elements than the original tensor, the new elements will be uninitialized in memory.


weights.view(a, b) will return a new tensor with the same data as weights, with size (a, b).

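A minimal sketch contrasting the three (resize_ is in-place and can silently drop elements, so use it with care):

import torch

weights = torch.arange(6.)  # tensor([0., 1., 2., 3., 4., 5.])
v = weights.view(2, 3)      # shares storage; the shape must match the element count
r = weights.reshape(2, 3)   # a view here; would copy if weights were non-contiguous

w = torch.arange(6.)
w.resize_(2, 2)             # in-place: the last two elements are dropped
print(w)                    # tensor([[0., 1.], [2., 3.]])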

Answered by prosti

What is the meaning of the parameter -1?


You can read -1 as a dynamically inferred dimension, or "anything". Because of that, there can be only one -1 parameter in view().


If you ask for x.view(-1, 1), this will output a tensor of shape [anything, 1], depending on the number of elements in x. For example:


import torch
x = torch.tensor([1, 2, 3, 4])
print(x, x.shape)
print("...")
print(x.view(-1, 1), x.view(-1, 1).shape)
print(x.view(1, -1), x.view(1, -1).shape)

Will output:


tensor([1, 2, 3, 4]) torch.Size([4])
...
tensor([[1],
        [2],
        [3],
        [4]]) torch.Size([4, 1])
tensor([[1, 2, 3, 4]]) torch.Size([1, 4])

Answered by FENGSHI ZHENG

I figured out that x.view(-1, 16 * 5 * 5) is equivalent to x.flatten(1), where the parameter 1 indicates that the flattening starts from the 1st dimension (so the 'sample' dimension is not flattened). As you can see, the latter usage is semantically clearer and easier to use, so I prefer flatten().

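A quick check of that equivalence (a minimal sketch with a made-up batch of feature maps):

import torch

x = torch.rand(8, 16, 5, 5)  # hypothetical batch of 8 pooled feature maps
a = x.view(-1, 16 * 5 * 5)   # shape (8, 400)
b = x.flatten(1)             # flatten everything after dim 0
print(torch.equal(a, b))     # True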

Answered by ychnh

I really liked @Jadiel de Armas' examples.


I would like to add a small insight into how elements are ordered for .view(...):


  • For a Tensor with shape (a, b, c), the order of its elements is determined by a numbering system: the first digit has a numbers, the second digit has b numbers, and the third digit has c numbers.
  • The mapping of the elements in the new Tensor returned by .view(...) preserves this order of the original Tensor (a small sketch of this follows the list).
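
A minimal sketch of that ordering guarantee:

import torch

t = torch.arange(24).view(2, 3, 4)
v = t.view(4, 6)

# flattening both shows that the element order is identical
print(torch.equal(t.flatten(), v.flatten()))  # True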