Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/52270177/

Retrieved: 2020-09-14 06:01:17. Source: igfitidea

How to use .predict_generator() on new Images - Keras

Tags: python, pandas, image-processing, keras

Asked by Debadri Dutta

I've used ImageDataGenerator and flow_from_directory for training and validation.

These are my directories:

train_dir = Path('D:/Datasets/Trell/images/new_images/training')
test_dir = Path('D:/Datasets/Trell/images/new_images/validation')
pred_dir = Path('D:/Datasets/Trell/images/new_images/testing')

ImageDataGenerator code:

img_width, img_height = 28, 28
batch_size=32
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

Found 1852 images belonging to 4 classes

Found 115 images belonging to 4 classes

This is my model training code:

history = cnn.fit_generator(
        train_generator,
        steps_per_epoch=1852 // batch_size,
        epochs=20,
        validation_data=validation_generator,
        validation_steps=115 // batch_size)

Now I have some new images in a test folder (all images sit directly in the same folder), on which I want to predict. But when I use .predict_generator I get:

Found 0 images belonging to 0 class

So I tried these solutions:

1) "Keras: How to use predict_generator with ImageDataGenerator?" This didn't work out, because it only predicts on the validation set.

2) "How to predict the new image by using model.predict?" This gave me: module image not found.

3) "How to get predictions with predict_generator on streaming test data in Keras?" This also didn't work out.

My training data is stored in 4 separate folders, one per class. The validation data is stored the same way, and that works well.

In my test folder I have around 300 images, on which I want to predict and build a dataframe like this:

image_name    class
gghh.jpg       1
rrtq.png       2
1113.jpg       1
44rf.jpg       4
tyug.png       1
ssgh.jpg       3

I have also used this following code:

img = image.load_img(pred_dir, target_size=(28, 28))
img_tensor = image.img_to_array(img)
img_tensor = np.expand_dims(img_tensor, axis=0)
img_tensor /= 255.

cnn.predict(img_tensor)

But I get this error: [Errno 13] Permission denied: 'D:\\Datasets\\Trell\\images\\new_images\\testing'

I haven't been able to use predict_generator on my test images. So how can I predict on my new images using Keras? I have googled a lot and also searched Kaggle Kernels, but haven't found a solution.

Answered by Debadri Dutta

First of all, the test images should be placed in a separate folder inside the test folder. In my case I made another folder inside the test folder and named it all_classes. Then I ran the following code:

test_generator = test_datagen.flow_from_directory(
    directory=pred_dir,
    target_size=(28, 28),
    color_mode="rgb",
    batch_size=32,
    class_mode=None,
    shuffle=False
)

The above code gives me an output:

Found 306 images belonging to 1 class

Most importantly, you have to run the following code:

test_generator.reset()

Otherwise you will get weird outputs. Then use the .predict_generator() function:

# steps is rounded up so that the last partial batch of the 306 images is included
pred=cnn.predict_generator(test_generator,verbose=1,steps=int(np.ceil(306/batch_size)))

Running the above code outputs probabilities, so I first need to convert them to class numbers. In my case there were 4 classes, so the class numbers were 0, 1, 2 and 3.

Code written:

predicted_class_indices=np.argmax(pred,axis=1)

Next, I want the names of the classes:

labels = (train_generator.class_indices)
labels = dict((v,k) for k,v in labels.items())
predictions = [labels[k] for k in predicted_class_indices]

This replaces the class numbers with the class names. As a final step, if you want to save the results to a CSV file, arrange them in a dataframe that pairs each image name with its predicted class.

filenames=test_generator.filenames
results=pd.DataFrame({"Filename":filenames,
                      "Predictions":predictions})

Display your dataframe. Everything is done now. You have the predicted class for each of your images.
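
The post-processing steps above can be tried without a trained model. The following is a minimal sketch that assumes made-up values: `pred`, `filenames` and `class_indices` are illustrative stand-ins for the output of `predict_generator`, `test_generator.filenames` and `train_generator.class_indices`, and the class names are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real objects (assumptions, not real data)
pred = np.array([[0.10, 0.70, 0.10, 0.10],
                 [0.60, 0.20, 0.10, 0.10],
                 [0.05, 0.05, 0.10, 0.80]])
filenames = ["all_classes/a.jpg", "all_classes/b.png", "all_classes/c.jpg"]
class_indices = {"cat": 0, "dog": 1, "bird": 2, "fish": 3}

# Index of the highest probability in each row
predicted_class_indices = np.argmax(pred, axis=1)

# Invert the name -> index mapping so indices can be looked up
labels = {index: name for name, index in class_indices.items()}
predictions = [labels[int(k)] for k in predicted_class_indices]

# One row per image: file name next to its predicted class name
results = pd.DataFrame({"Filename": filenames, "Predictions": predictions})
```

Calling `results.to_csv("results.csv", index=False)` would then write the table to disk; the file name is only an example.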

Answered by Peter

I had some trouble with predict_generator(). Some posts here helped a lot. I post my solution here as well and hope it will help others. What I do:

  • Make predictions on new images using predict_generator()
  • Get filename for each prediction
  • Store results in a data frame

I make binary predictions à la "cats and dogs" as documented here. However, the logic can be generalised to multiclass cases. In this case the outcome of the prediction has one column per class.

First, I load my stored model and set up the data generator:

import numpy as np
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator
from keras.models import load_model

# Load model
model = load_model('my_model_01.hdf5')

test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
        "C:/kerasimages/pred/",
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary',
        shuffle=False)

Note: it is important to specify shuffle=False in order to preserve the order of filenames and predictions.

Images are stored in C:/kerasimages/pred/images/. The data generator will only look for images in subfolders of C:/kerasimages/pred/ (the path given to test_generator). It is important to respect the logic of the data generator, so the subfolder /images/ is required. Each subfolder in C:/kerasimages/pred/ is interpreted as one class by the generator. Here, the generator will report "Found x images belonging to 1 classes" (since there is only one subfolder). When we only make predictions, the classes detected by the generator are not relevant.

Now, I can make predictions using the generator:

# Predict from generator (returns probabilities)
pred=model.predict_generator(test_generator, steps=len(test_generator), verbose=1)

Resetting the generator is not required in this case, but if the generator has already been used before, it may be necessary to reset it using test_generator.reset().
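
For reference, `len(test_generator)` is the number of batches needed to cover every image once, that is, the image count divided by the batch size, rounded up. A small sketch of that arithmetic follows. The helper name `steps_for` is made up for illustration; 306 and 32 are the figures from the first answer, and 20 is the batch size used in this answer.

```python
import math

def steps_for(num_images: int, batch_size: int) -> int:
    """Number of batches needed so that every image is predicted once."""
    return math.ceil(num_images / batch_size)

steps_batch_20 = steps_for(306, 20)  # batch size used in this answer
steps_batch_32 = steps_for(306, 32)  # batch size used in the first answer
floor_steps = 306 // 32              # floor division drops the last partial batch
images_covered_by_floor = floor_steps * 32
```

With floor division some images would get no prediction, and the prediction array would no longer line up with `test_generator.filenames`.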

Next I round probabilities to get classes and I retrieve filenames:

# Get classes by np.round
cl = np.round(pred)
# Get filenames (set shuffle=false in generator is important)
filenames=test_generator.filenames

Finally, results can be stored in a data frame:

# Data frame
results=pd.DataFrame({"file":filenames,"pr":pred[:,0], "class":cl[:,0]})
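
One detail worth knowing: `np.round` rounds exact halves to the nearest even number, so a probability of exactly 0.5 becomes 0. If an explicit cut-off is preferred, a comparison can be used instead of rounding. This is a minimal sketch with made-up probabilities; the 0.5 threshold and the values in `pred` are assumptions for illustration, not part of the answer above.

```python
import numpy as np

# Made-up sigmoid outputs with shape (n_images, 1)
pred = np.array([[0.12], [0.50], [0.93]])

# Rounding: exact halves go to the nearest even value, so 0.5 -> 0.0
rounded = np.round(pred)[:, 0]

# Explicit cut-off: here 0.5 itself is counted as class 1
thresholded = (pred[:, 0] >= 0.5).astype(int)
```

Which side of the cut-off the value 0.5 falls on is a choice; the comparison simply makes that choice visible.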

Answered by Mariusz Ch.

Most probably you are making a mistake in how you use flow_from_directory. From the docs:

flow_from_directory(directory, ...)

Where:

directory: Path to the target directory. It should contain one subdirectory per class. Any PNG, JPG, BMP, PPM or TIF images inside each of the subdirectories directory tree will be included in the generator.

That means that inside the directory you pass to this function, you have to create subdirectories and place your images inside these subdirectories. Otherwise, when the images sit directly in the directory you pass (not in subdirectories), the generator indeed finds 0 images and 0 classes.
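
The required layout can be produced with a few lines of standard-library Python. This is a minimal sketch, assuming all images currently sit directly in one folder; the helper name `move_into_subfolder` is made up, the subfolder name `all_classes` is borrowed from the first answer, and the demonstration runs on a temporary directory with empty placeholder files rather than real images.

```python
import shutil
import tempfile
from pathlib import Path

def move_into_subfolder(folder: Path, subfolder_name: str = "all_classes") -> Path:
    """Move every file directly inside `folder` into `folder/subfolder_name`."""
    target = folder / subfolder_name
    target.mkdir(exist_ok=True)
    # Take a snapshot of the listing first, because the folder changes while we move files
    for item in list(folder.iterdir()):
        if item.is_file():
            shutil.move(str(item), str(target / item.name))
    return target

# Demonstration on a throwaway directory with empty placeholder files
demo_dir = Path(tempfile.mkdtemp())
for name in ("img1.png", "img2.png"):
    (demo_dir / name).touch()

subfolder = move_into_subfolder(demo_dir)
moved = sorted(p.name for p in subfolder.iterdir())
remaining = [p.name for p in demo_dir.iterdir()]
```

After this, the parent folder (here `demo_dir`) is the path to pass to `flow_from_directory`, not the subfolder.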

EDIT

Okay, for the prediction you want to perform, I believe you want to use the predict function as follows (note that you have to feed data to the network in exactly the same format as you did during training):

import numpy as np
from keras.preprocessing.image import img_to_array, load_img
from skimage.color import rgb2lab  # assumption: rgb2lab comes from scikit-image

# load a single image file into an array
color_me = img_to_array(load_img(f"{directory}/{foldername}/{filename}"))
# here you prepare the input data, for example here we take the gray image
# lightness (gray scale) is the 1st channel in the Lab color space
color_me = rgb2lab((1.0 / 255) * color_me)[:, :, 0]
# add a trailing channel axis: (height, width) -> (height, width, 1)
color_me = color_me.reshape(color_me.shape + (1,))
# here data is in the format which is accepted by, in this case, my model
# for your model you have to do the same preparation as during training
output = model.predict(np.array([color_me]))
# and here you have your predicted output
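
Note that `load_img` expects the path of a single image file. Passing a folder, as in the question, is most likely what triggers the Permission denied error, so the files have to be enumerated first. This is a minimal sketch, assuming the images sit directly in one folder; the helper name `list_image_files` and the exact set of extensions are assumptions for illustration, and the demonstration uses empty placeholder files.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to the files actually present
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".ppm", ".tif", ".tiff"}

def list_image_files(folder):
    """Return the image files directly inside `folder`, sorted by path."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )

# Demonstration on a throwaway directory with empty placeholder files
demo_dir = Path(tempfile.mkdtemp())
for name in ("b.jpg", "a.PNG", "notes.txt"):
    (demo_dir / name).touch()

found = [p.name for p in list_image_files(demo_dir)]
```

Each returned path can then be passed to `load_img` one at a time, keeping the file name next to its prediction.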

Answered by B. Kanani

I strongly recommend creating a parent folder for the test folder, then moving the test folder into that parent folder.

That is, if your test folder looks like this:

/root/test/img1.png
/root/test/img2.png
/root/test/img3.png
/root/test/img4.png

then this is the wrong layout for predict_generator. Update your test folder like this:

/root/test_parent/test/img1.png
/root/test_parent/test/img2.png
/root/test_parent/test/img3.png
/root/test_parent/test/img4.png

Use this command to update:

mkdir -p /root/test_parent && mv /root/test /root/test_parent/test

Also, don't forget to give the parent folder as the path, like this:

"/root/test_parent/"

This method works for me.