
Warning: this page is a translated mirror of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/47555829/


preprocess_input() method in keras

python · keras

Asked by AKSHAYAA VAIDYANATHAN

I am trying out sample Keras code from the Keras documentation page below, https://keras.io/applications/


What does the preprocess_input(x) function of the keras module do in the code below? Why do we have to call expand_dims(x, axis=0) before the array is passed to the preprocess_input() method?


from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

Is there any documentation with a good explanation of these functions?


Thanks!


Answered by Daniel Möller

Keras works with batches of images. So, the first dimension is used for the number of samples (or images) you have.


When you load a single image, you get the shape of one image, which is (size1, size2, channels).


In order to create a batch of images, you need an additional dimension: (samples, size1, size2, channels)

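A quick NumPy sketch of this batch dimension (the 224×224 size is just ResNet50's default input size, and the random pixels stand in for a real photo):

```python
import numpy as np

# A single RGB "image": (size1, size2, channels)
img = np.random.rand(224, 224, 3)

# Add the leading samples axis so the model sees a batch of one image:
# (samples, size1, size2, channels)
batch = np.expand_dims(img, axis=0)

print(img.shape)    # (224, 224, 3)
print(batch.shape)  # (1, 224, 224, 3)
```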

The preprocess_input function is meant to adapt your image to the format the model requires.


Some models use images with values ranging from 0 to 1, others from -1 to +1. Still others use the "caffe" style, which is not normalized but is centered.


From the source code, Resnet is using the caffe style.

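To make "caffe style" concrete, here is a rough NumPy sketch of what ResNet50's preprocess_input does in this mode: flip the channel order from RGB to BGR, then subtract the per-channel ImageNet means, with no scaling to a fixed range.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by Keras' caffe mode.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68])

def caffe_style_preprocess(x):
    """Sketch of caffe-style preprocessing: center, don't normalize."""
    x = x[..., ::-1]               # RGB -> BGR (flip the channel axis)
    return x - IMAGENET_BGR_MEANS  # subtract per-channel means

# A fake batch of one uniform gray image with pixel value 128.
batch = np.ones((1, 224, 224, 3)) * 128.0
out = caffe_style_preprocess(batch)
print(out.shape)  # (1, 224, 224, 3) -- shape is unchanged
```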

You don't need to worry about the internal details of preprocess_input. But ideally, you should load images with the keras functions for that (so you guarantee that the images you load are compatible with preprocess_input).


Answered by DK250

This loads an image and resizes the image to (224, 224):


 img = image.load_img(img_path, target_size=(224, 224))

The img_to_array() function adds the channels: x.shape = (224, 224, 3) for an RGB image and (224, 224, 1) for a grayscale image


 x = image.img_to_array(img) 

expand_dims() is used to add the number of images: x.shape = (1, 224, 224, 3):


x = np.expand_dims(x, axis=0)

preprocess_input subtracts the mean RGB channels of the imagenet dataset. This is because the model you are using has been trained on a different dataset: x.shape is still (1, 224, 224, 3)


x = preprocess_input(x)

If you add x to an array images, then at the end of the loop you need to add images = np.vstack(images), so that you get (n, 224, 224, 3) as the shape of images, where n is the number of images processed

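A minimal sketch of that loop pattern, with random arrays standing in for loaded and preprocessed images:

```python
import numpy as np

# Hypothetical loop collecting several preprocessed images into one batch.
images = []
for _ in range(4):
    x = np.random.rand(224, 224, 3)  # stand-in for a loaded, preprocessed image
    x = np.expand_dims(x, axis=0)    # (1, 224, 224, 3)
    images.append(x)

# Stack all (1, 224, 224, 3) arrays along the samples axis.
batch = np.vstack(images)
print(batch.shape)  # (4, 224, 224, 3)
```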

Answered by Th? Sinh

I found that when your dataset is very different from the pre-trained model's dataset, using the pre-trained preprocessing may somehow harm your accuracy. If you do transfer learning and freeze some layers of a pre-trained model (and their weights), simply dividing your original dataset by 255.0 does the job just fine, at least for a large half-million-sample food dataset. Ideally you should know the std/mean of your own dataset and use that, instead of using the std/mean of the pre-trained model's preprocessing.

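A sketch of the two alternatives mentioned above, using made-up random data in place of a real dataset:

```python
import numpy as np

# Fake dataset of 10 RGB images with pixel values in [0, 255).
x = np.random.rand(10, 224, 224, 3) * 255.0

# Option 1: plain rescaling to [0, 1], as suggested for transfer learning.
x_scaled = x / 255.0

# Option 2: center and scale with your *own* dataset's statistics.
mean = x.mean(axis=(0, 1, 2))  # per-channel mean of your data
std = x.std(axis=(0, 1, 2))    # per-channel std of your data
x_norm = (x - mean) / std      # roughly zero-mean, unit-variance per channel
```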

My 2 cents.


Steve
