Python: How to create an image dataset just like the MNIST dataset?

Note: this page is based on a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/39289285/

Date: 2020-08-19 22:05:14  Source: igfitidea

How to create an image dataset just like the MNIST dataset?

Tags: python, image-processing, dataset, neural-network, keras

Asked by Md Shopon

I have 10,000 BMP images of handwritten digits. What do I need to do to feed this data to a neural network? For the MNIST dataset, I only had to write:

from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

I am using the Keras library in Python. How can I create such a dataset?

Accepted answer by Mikael Rousson

You can either write a function that loads all your images and stacks them into a NumPy array, provided everything fits in RAM, or use the Keras ImageDataGenerator (https://keras.io/preprocessing/image/), which provides a flow_from_directory method. You can find an example here: https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d.
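
As a rough illustration of the first option (loading everything into one NumPy array), here is a minimal sketch. It assumes a hypothetical directory layout with one sub-directory per digit ("0" to "9"), each containing .bmp files, and it uses Pillow to read the images. The function name load_image_dataset and its parameters (image_size, test_fraction, seed) are made up for this example and are not part of Keras.

```python
import os

import numpy as np
from PIL import Image


def load_image_dataset(root_dir, image_size=(28, 28), test_fraction=0.2, seed=0):
    """Load root_dir/<label>/*.bmp into MNIST-style uint8 arrays."""
    images, labels = [], []
    # Assumed layout: one sub-directory per class, named after the digit.
    for label_name in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, label_name)
        if not os.path.isdir(class_dir):
            continue
        for file_name in sorted(os.listdir(class_dir)):
            if not file_name.lower().endswith(".bmp"):
                continue
            with Image.open(os.path.join(class_dir, file_name)) as src:
                # Convert to grayscale, then resize; Pillow expects (width, height).
                gray = src.convert("L").resize((image_size[1], image_size[0]))
                images.append(np.asarray(gray, dtype=np.uint8))
            labels.append(int(label_name))

    X = np.stack(images)                  # shape: (n_samples, height, width)
    y = np.array(labels, dtype=np.uint8)  # shape: (n_samples,)

    # Shuffle once, then hold out the last test_fraction of the samples.
    order = np.random.default_rng(seed).permutation(len(X))
    X, y = X[order], y[order]
    n_train = len(X) - int(len(X) * test_fraction)
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])
```

With such a function, the call site looks much like the MNIST one: (X_train, y_train), (X_test, y_test) = load_image_dataset("digits"), where "digits" is a placeholder path. 10,000 images of 28x28 pixels take under 8 MB as uint8, so holding them in RAM should not be a problem at that size.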

Answer by azharimran

You should write your own function to load all the images, or do something like this:

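import os

import cv2
import numpy as np
from imutils import paths
# in newer Keras releases: from keras.utils import img_to_array
from keras.preprocessing.image import img_to_array

# args["testset"] (dataset directory) and IMAGE_DIMS (height, width, depth)
# are assumed to be defined elsewhere, e.g. by an argument parser
data = []
labels = []
imagePaths = sorted(list(paths.list_images(args["testset"])))

# loop over the input images
for imagePath in imagePaths:
    # load the image, pre-process it, and store it in the data list
    image = cv2.imread(imagePath)
    image = cv2.resize(image, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
    image = img_to_array(image)
    data.append(image)
    # extract the class label from the image path and update the labels
    # list (assumption: each image sits in a directory named after its class)
    labels.append(imagePath.split(os.path.sep)[-2])

# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)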
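
Labels taken from file or directory names are usually strings, while a network typically expects integer class ids or one-hot vectors. The following is a minimal sketch of that conversion using only NumPy; the helper name encode_labels is hypothetical and not part of Keras or OpenCV.

```python
import numpy as np


def encode_labels(labels):
    """Map arbitrary labels to integer ids 0..n_classes-1 and to one-hot rows."""
    # np.unique returns the sorted distinct labels and, for every input
    # element, the index of its label within that sorted array.
    classes, ids = np.unique(np.asarray(labels), return_inverse=True)
    # Row i of the identity matrix is the one-hot vector for class id i.
    one_hot = np.eye(len(classes), dtype=np.float32)[ids]
    return classes, ids, one_hot
```

Here classes keeps the mapping back from integer id to the original label, which is useful when interpreting predictions later.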

Answer by yucui xiao

NumPy can save an array to a binary file with numpy.save:

import numpy as np

def save_data():
  # read_data() is assumed to be your own function that returns the raw
  # images (each a flat sequence of pixel values) and their labels
  [images, labels] = read_data()
  # convert in one step instead of calling np.append in a loop, which
  # copies the whole array on every iteration
  npimages = np.asarray(images, dtype=np.int32)
  nplabels = np.asarray(labels, dtype=np.int32)

  # np.save appends the .npy extension to the file name
  np.save('images', npimages)
  np.save('labels', nplabels)


def load_data():
  # load the arrays written by save_data()
  return [np.load('images.npy'), np.load('labels.npy')]
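
To get closer to the mnist.load_data() call from the question, the train and test arrays can also be kept in a single .npz archive. The sketch below is one possible way to do that with numpy.savez_compressed; the function names save_dataset and load_dataset, and the key names inside the archive, are choices made for this example.

```python
import numpy as np


def save_dataset(path, x_train, y_train, x_test, y_test):
    """Write all four arrays into one compressed .npz archive."""
    np.savez_compressed(path, x_train=x_train, y_train=y_train,
                        x_test=x_test, y_test=y_test)


def load_dataset(path):
    """Return (x_train, y_train), (x_test, y_test), like mnist.load_data()."""
    # Indexing the archive reads each array into memory, so the file can be
    # closed before the arrays are returned.
    with np.load(path) as archive:
        return ((archive["x_train"], archive["y_train"]),
                (archive["x_test"], archive["y_test"]))
```

Usage then mirrors the MNIST one-liner: (X_train, y_train), (X_test, y_test) = load_dataset("digits.npz"), where "digits.npz" is a placeholder file name.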