Python PyTorch: how to convert data into a tensor

Note: this page is a side-by-side Chinese/English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/47272971/

Date: 2020-08-19 18:07:15  Source: igfitidea

PyTorch: how to convert data into a tensor

Tags: python, machine-learning, deep-learning, pytorch

Asked by soshi shimada

I am a beginner with PyTorch. I was trying to write CNN code by following the PyTorch tutorial. Below is part of the code, but it raises the error "RuntimeError: Variable data has to be a tensor, but got list". I tried to cast the input data to a tensor, but it didn't work. If anybody knows the solution, please help me out.

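    import os

    import cv2
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from sklearn.model_selection import train_test_split
    from torch.autograd import Variable

    def read_labels(file):
        """Map the first CSV column (image name) to the second (label)."""
        dic = {}
        with open(file) as f:
            for row in f:
                fields = row.split(",")
                dic[fields[0]] = fields[1].rstrip()  # rstrip() drops the trailing "\n"
        return dic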

    image_names = os.listdir("./train_mini")
    label_dic = read_labels("labels.csv")

    names = []
    labels = []
    images = []

    for name in image_names[1:]:
        images.append(cv2.imread("./train_mini/"+name))
        labels.append(label_dic[os.path.splitext(name)[0]])

    """
    Data distribution
    """
    N = len(images)
    N_train = int(N * 0.7)
    N_test = int(N*0.2)

    X_train, X_tmp, Y_train, Y_tmp = train_test_split(images, labels, train_size=N_train)
    X_validation, X_test, Y_validation, Y_test = train_test_split(X_tmp, Y_tmp, test_size=N_test)

    """
    Model Definition
    """

    class CNN(nn.Module):
        def __init__(self):
            super(CNN, self).__init__()
            self.head = nn.Sequential(
                nn.Conv2d(in_channels=1, out_channels=10,
                          kernel_size=5, stride=1),
                nn.MaxPool2d(kernel_size=2),
                nn.ReLU(),
                nn.Conv2d(10, 20, kernel_size=5),
                nn.MaxPool2d(kernel_size=2),
                nn.ReLU())
            self.tail = nn.Sequential(
                nn.Linear(320, 50),
                nn.ReLU(),
                nn.Linear(50, 10))

        def forward(self, x):
            x = self.head(x)
            x = x.view(-1, 320)
            x = self.tail(x)
            return F.log_softmax(x)

    CNN = CNN()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(CNN.parameters(), lr=0.001, momentum=0.9)


    """
    Training
    """
    batch_size = 50
    for epoch in range(2):  # loop over the dataset multiple times
        running_loss = 0.0
        for i in range(N_train // batch_size):  # integer division over the training set only
        #for i, data in enumerate(trainloader, 0):
            batch = batch_size * i

            # get the inputs
            images_batch = X_train[batch:batch + batch_size]
            labels_batch = Y_train[batch:batch + batch_size]

            # wrap them in Variable
            images_batch, labels_batch = Variable(images_batch), Variable(labels_batch)

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward + backward + optimize
            outputs = CNN(images_batch)
            loss = criterion(outputs, labels_batch)
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.data[0]
            if i % 2000 == 1999:    # print every 2000 mini-batches
                print('[%d, %5d] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0

    print('Finished Training')

The error occurs here:

    # wrap them in Variable
    images_batch, labels_batch = Variable(images_batch), Variable(labels_batch)

Answered by Wasi Ahmad

If my guess is correct, you are probably getting the error on the following line.

    # wrap them in Variable
    images_batch, labels_batch = Variable(images_batch), Variable(labels_batch)

This means that images_batch and/or labels_batch are lists. You can simply convert them to NumPy arrays and then to tensors, as follows.

    # convert the lists to NumPy arrays, then to tensors
    images_batch = torch.from_numpy(numpy.array(images_batch))
    labels_batch = torch.from_numpy(numpy.array(labels_batch))

This should solve your problem.
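
As a minimal, self-contained sketch of this conversion, the snippet below uses synthetic arrays in place of the images returned by `cv2.imread`. The 28x28 image size, the array contents, and the integer labels are assumptions made for illustration, not values from the question. Note that `numpy.array` can only stack the list into one regular array when every image has the same shape.

```python
import numpy
import torch

# Hypothetical stand-ins for images loaded with cv2.imread:
# three 28x28 three-channel images stored as uint8 arrays.
images_batch = [numpy.zeros((28, 28, 3), dtype=numpy.uint8) for _ in range(3)]
# Hypothetical integer class labels (assumed, not from the question).
labels_batch = [0, 2, 1]

# Stack each list into a single NumPy array, then convert it to a tensor.
images_tensor = torch.from_numpy(numpy.array(images_batch))
labels_tensor = torch.from_numpy(numpy.array(labels_batch, dtype=numpy.int64))

print(images_tensor.shape, images_tensor.dtype)
print(labels_tensor.shape, labels_tensor.dtype)
```

`torch.from_numpy` shares memory with the source array, so the resulting tensor keeps the array's dtype (here `uint8` for the images).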



Edit: If you get the following error while running the above snippet:

"RuntimeError: can't convert a given np.ndarray to a tensor - it has an invalid type. The only supported types are: double, float, int64, int32, and uint8."


You can create the NumPy array with an explicit data type. For example,

    images_batch = torch.from_numpy(numpy.array(images_batch, dtype='int32'))

I am assuming images_batch contains the pixel information of the images, so I used int32. For more information, see the official documentation.
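
Beyond the dtype error, two further points may matter for the code in the question. Convolution layers work on floating-point input laid out as (batch, channels, height, width), and `nn.CrossEntropyLoss` expects int64 class indices, whereas the labels read from the CSV file are strings. The sketch below shows one possible way to handle both. The helper name `to_batch_tensors`, the `class_to_index` mapping, the synthetic 28x28 images, and the label names are all hypothetical. Averaging the colour channels is also an assumption, chosen only to match `in_channels=1` in the question's model.

```python
import numpy
import torch

def to_batch_tensors(images, labels, class_to_index):
    """Hypothetical helper: turn a list of HxWx3 uint8 images and a list of
    string labels into a float32 NCHW tensor and an int64 label tensor."""
    # Stack into shape (N, H, W, 3) as float32 and scale pixels to [0, 1].
    array = numpy.array(images, dtype=numpy.float32) / 255.0
    # Average the colour channels into one channel (an assumption made to
    # match in_channels=1), giving shape (N, H, W).
    gray = array.mean(axis=3)
    # Add the channel axis: (N, 1, H, W).
    images_tensor = torch.from_numpy(gray).unsqueeze(1)
    # Map each string label to its integer class index.
    indices = [class_to_index[label] for label in labels]
    labels_tensor = torch.tensor(indices, dtype=torch.int64)
    return images_tensor, labels_tensor

# Synthetic data standing in for cv2.imread output and labels.csv entries.
images = [numpy.full((28, 28, 3), 255, dtype=numpy.uint8) for _ in range(4)]
labels = ["cat", "dog", "cat", "bird"]
class_to_index = {name: i for i, name in enumerate(sorted(set(labels)))}

x, y = to_batch_tensors(images, labels, class_to_index)
print(x.shape, x.dtype)
print(y.tolist())
```

With 28x28 inputs, the two convolution and pooling stages of the question's model produce 20 feature maps of size 4x4, which is consistent with the `x.view(-1, 320)` call in its `forward` method.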