Python PyTorch: How to convert data to a tensor

Question

Asked by soshi shimada

I am a beginner with PyTorch. I was trying to write CNN code by following the PyTorch tutorial. Below is part of the code, but it raises the error "RuntimeError: Variable data has to be a tensor, but got list". I tried to cast the input data to a tensor, but it didn't work. If anybody knows the solution, please help me out.

    def read_labels(file):
      dic = {}
      with open(file) as f:
        reader = f
        for row in reader:
            dic[row.split(",")[0]]  = row.split(",")[1].rstrip() #rstrip(): eliminate "\n"
      return dic

    image_names= os.listdir("./train_mini")
    label_dic = read_labels("labels.csv")


    names =[]
    labels = []
    images =[]

    for name in image_names[1:]:
        images.append(cv2.imread("./train_mini/"+name))
        labels.append(label_dic[os.path.splitext(name)[0]])

    """
    Data distribution
    """
    N = len(images)
    N_train = int(N * 0.7)
    N_test = int(N*0.2)

    X_train, X_tmp, Y_train, Y_tmp = train_test_split(images, labels, train_size=N_train)
    X_validation, X_test, Y_validation, Y_test = train_test_split(X_tmp, Y_tmp, test_size=N_test)

    """
    Model Definition
    """

    class CNN(nn.Module):
        def __init__(self):
            super(CNN, self).__init__()
            self.head = nn.Sequential(
                nn.Conv2d(in_channels=1, out_channels=10,
                          kernel_size=5, stride=1),
                nn.MaxPool2d(kernel_size=2),
                nn.ReLU(),
                nn.Conv2d(10, 20, kernel_size=5),
                nn.MaxPool2d(kernel_size=2),
                nn.ReLU())
            self.tail = nn.Sequential(
                nn.Linear(320, 50),
                nn.ReLU(),
                nn.Linear(50, 10))

        def forward(self, x):
            x = self.head(x)
            x = x.view(-1, 320)
            x = self.tail(x)
            return F.log_softmax(x)

    CNN = CNN()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(CNN.parameters(), lr=0.001, momentum=0.9)


    """
    Training
    """
    batch_size = 50
    for epoch in range(2):  # loop over the dataset multiple times
        running_loss = 0.0
        for i in range(N / batch_size):
        #for i, data in enumerate(trainloader, 0):
            batch = batch_size * i

            # get the inputs
            images_batch = X_train[batch:batch + batch_size]
            labels_batch = Y_train[batch:batch + batch_size]

            # wrap them in Variable
            images_batch, labels_batch = Variable(images_batch), Variable(labels_batch)

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward + backward + optimize
            outputs = CNN(images_batch)
            loss = criterion(outputs, labels_batch)
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.data[0]
            if i % 2000 == 1999:    # print every 2000 mini-batches
                print('[%d, %5d] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0

    print('Finished Training')

The error occurs here:

    # wrap them in Variable
    images_batch, labels_batch = Variable(images_batch), Variable(labels_batch)

Answer 1

Answered by Wasi Ahmad

If my guess is correct, you are probably getting the error on the following line.

    # wrap them in Variable
    images_batch, labels_batch = Variable(images_batch), Variable(labels_batch)

This means that images_batch and/or labels_batch are lists. You can simply convert them to NumPy arrays and then to tensors as follows.

    # convert the lists to NumPy arrays, then to tensors
    images_batch = torch.from_numpy(numpy.array(images_batch))
    labels_batch = torch.from_numpy(numpy.array(labels_batch))

This should solve your problem.
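As a rough illustration of the conversion above, here is a minimal sketch for the image side. It assumes every image is a uint8 array of the same height x width x channels shape (the layout cv2.imread returns) and that the model expects float input in N x C x H x W order. The helper name images_to_tensor and the synthetic 28 x 28 images are hypothetical and only stand in for the asker's data.

```python
import numpy as np
import torch


def images_to_tensor(images):
    """Stack equally sized H x W x C uint8 arrays into an N x C x H x W float tensor."""
    # np.stack fails loudly if the images differ in shape, which a plain
    # numpy.array(...) call would hide behind an object-dtype array.
    batch = np.stack(images).astype(np.float32) / 255.0  # (N, H, W, C), scaled to [0, 1]
    # Conv2d layers expect channels first, so move C in front of H and W.
    return torch.from_numpy(batch).permute(0, 3, 1, 2).contiguous()


# Synthetic stand-ins for images loaded with cv2.imread (assumption: 28 x 28, 3 channels).
fake_images = [np.zeros((28, 28, 3), dtype=np.uint8) for _ in range(4)]
images_batch = images_to_tensor(fake_images)
print(images_batch.shape, images_batch.dtype)
```

Note that the CNN in the question declares in_channels=1, while cv2.imread returns three channels by default, so the images would probably need to be loaded or converted as grayscale before this step.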

Edit: If you get the following error while running the above snippet:

"RuntimeError: can't convert a given np.ndarray to a tensor - it has an invalid type. The only supported types are: double, float, int64, int32, and uint8."

You can specify a data type when creating the NumPy array. For example:

    images_batch = torch.from_numpy(numpy.array(images_batch, dtype='int32'))

I am assuming images_batch contains the pixel information of the images, so I used int32. For more information, see the official documentation.
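The labels may need separate handling. In the question they are read from a CSV file, so they are probably strings, and a NumPy array of strings is one of the types torch.from_numpy rejects. A minimal sketch of one way around this, assuming the labels are class names, is to map each name to an integer index first. The helper encode_labels and the sample class names are hypothetical.

```python
import torch


def encode_labels(labels):
    """Map string class names to integer indices and return them as an int64 tensor."""
    classes = sorted(set(labels))  # fixed, reproducible class order
    class_to_index = {name: idx for idx, name in enumerate(classes)}
    indices = [class_to_index[name] for name in labels]
    # nn.CrossEntropyLoss expects integer class indices of dtype int64 (torch.long).
    return torch.tensor(indices, dtype=torch.long), class_to_index


# Hypothetical labels standing in for the values read from labels.csv.
labels_batch, class_to_index = encode_labels(["cat", "dog", "cat", "bird"])
print(labels_batch)    # tensor([1, 2, 1, 0])
print(class_to_index)  # {'bird': 0, 'cat': 1, 'dog': 2}
```

In practice the mapping should be built once from all training labels and reused for every batch, so that the same class always gets the same index.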
