Python: Get class labels from a Keras functional model
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/38971293/
Get class labels from Keras functional model
Asked by Ledzz
I have a functional model in Keras (Resnet50 from the repo examples). I trained it with ImageDataGenerator and flow_from_directory data and saved the model to an .h5 file. When I call model.predict I get an array of class probabilities, but I want to associate them with class labels (in my case, folder names). How can I get them? I found that I could use model.predict_classes and model.predict_proba, but these functions are not available on the Functional model, only on Sequential.
Answer by Emilia Apostolova
Answer by Lokesh Kumar
When you use flow_from_directory, the difficulty is interpreting the probability outputs: the order in which flow_from_directory assigns classes to positions in the one-hot vectors is not known in advance, so it is unclear which output index corresponds to which class label.
The generator exposes a dictionary that maps each class label to its index in the prediction vector:
generator = train_datagen.flow_from_directory("train", batch_size=batch_size)
label_map = generator.class_indices
The label_map variable is a dictionary like this:
{'class_14': 5, 'class_10': 1, 'class_11': 2, 'class_12': 3, 'class_13': 4, 'class_2': 6, 'class_3': 7, 'class_1': 0, 'class_6': 10, 'class_7': 11, 'class_4': 8, 'class_5': 9, 'class_8': 12, 'class_9': 13}
From this, the relation between the probability scores and the class names can be derived.
You can also build the same dictionary yourself with this code:
from glob import glob
# Run from inside the training directory, where each class has its own folder
class_names = glob("*")  # Lists the folders in which the images are stored
class_names = sorted(class_names)  # Sort them alphabetically
name_id_map = dict(zip(class_names, range(len(class_names))))
The variable name_id_map in the code above contains the same dictionary as the one obtained from the class_indices attribute of the generator returned by flow_from_directory.
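To turn a prediction back into a folder name, the dictionary has to be read in the other direction (index to label). The following is only a minimal sketch of that lookup: the label_map and predictions values are made-up stand-ins for generator.class_indices and the output of model.predict, and the helper predicted_label is a hypothetical name, not part of Keras. It uses plain Python lists so that it runs without any dependencies; on a NumPy array, np.argmax(predictions, axis=1) gives the same indices.

```python
# Stand-in for generator.class_indices (label -> column index); assumed values.
label_map = {"cats": 0, "dogs": 1, "horses": 2}

# Invert the mapping so a column index can be looked up directly.
index_to_label = {index: label for label, index in label_map.items()}

# Stand-in for model.predict(...): one row of class probabilities per image.
predictions = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
]


def predicted_label(row):
    """Return the label whose column holds the highest probability."""
    best_index = max(range(len(row)), key=row.__getitem__)
    return index_to_label[best_index]


predicted_labels = [predicted_label(row) for row in predictions]
print(predicted_labels)  # ['dogs', 'cats']
```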
Hope this helps!
Answer by Bohumir Zamecnik
UPDATE: This is no longer valid for newer Keras versions. Please use argmax() as in the answer from Emilia Apostolova.
Functional API models have only the predict() function, which for classification returns the class probabilities. You can then select the most probable classes using the probas_to_classes() utility function. Example:
y_proba = model.predict(x)
# Old Keras only: the helper lived in keras.utils.np_utils and has since been removed
y_classes = keras.utils.np_utils.probas_to_classes(y_proba)
This is equivalent to model.predict_classes(x) on the Sequential model.
The reason is that the functional API supports a more general class of tasks, for which predict_classes() would not make sense.
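Since the update above points to argmax() for newer Keras versions, here is a rough sketch of what such a replacement could look like. The function name probas_to_class_indices and the 0.5 threshold for a single sigmoid output are assumptions made for illustration, not a Keras API. Plain lists are used so the sketch runs on its own; with NumPy the multi-class branch is simply np.argmax(y_proba, axis=1).

```python
def probas_to_class_indices(y_proba, threshold=0.5):
    """Convert rows of class probabilities into class indices.

    Rows with several columns (softmax output) use the index of the largest
    value. Rows with one column (sigmoid output) are thresholded; the 0.5
    default is an assumption of this sketch.
    """
    classes = []
    for row in y_proba:
        if len(row) > 1:
            # Multi-class: position of the highest probability in the row.
            classes.append(max(range(len(row)), key=row.__getitem__))
        else:
            # Binary with a single output unit: 1 if above the threshold.
            classes.append(int(row[0] > threshold))
    return classes


# Made-up stand-ins for model.predict(x) output.
softmax_output = [[0.2, 0.5, 0.3], [0.9, 0.05, 0.05]]
sigmoid_output = [[0.8], [0.3]]

print(probas_to_class_indices(softmax_output))  # [1, 0]
print(probas_to_class_indices(sigmoid_output))  # [1, 0]
```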
Answer by Hemerson Tacon
In addition to @Emilia Apostolova's answer, to get the ground-truth labels from
generator = train_datagen.flow_from_directory("train", batch_size=batch_size)
just call
y_true_labels = generator.classes
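Once the ground-truth indices are available, they can be compared with the predicted indices. The sketch below uses made-up stand-in lists for generator.classes and for the argmax of the predictions. It assumes the generator was created with shuffle=False, since otherwise the order of the predictions would not line up with generator.classes.

```python
from collections import Counter

# Stand-ins (assumed values): ground truth from generator.classes and
# predicted indices from an argmax over the model's probabilities.
y_true_labels = [0, 0, 1, 1, 2]
y_pred_labels = [0, 1, 1, 1, 2]

# Fraction of samples whose predicted index matches the true index.
correct = sum(t == p for t, p in zip(y_true_labels, y_pred_labels))
accuracy = correct / len(y_true_labels)

# Count each (true, predicted) pair; off-diagonal pairs are the mistakes.
confusion = Counter(zip(y_true_labels, y_pred_labels))

print(accuracy)           # 0.8
print(confusion[(0, 1)])  # 1
```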
Answer by Thomas Decaux
You must use the label index you already have. Here is what I do for text classification:
# data labels = [1, 2, 1...]
labels_index = { "website" : 0, "money" : 1 ....}
# to feed model
label_categories = to_categorical(np.asarray(labels))
Then, for predictions:
texts = ["hello, rejoins moi sur skype", "bonjour comment ça va ?", "tu me donnes de l'argent"]
sequences = tokenizer.texts_to_sequences(texts)
data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
predictions = model.predict(data)
for t, text in enumerate(texts):
    print("Prediction for \"%s\": " % (text))
    # Look up each score by the label's stored index rather than by dict order
    for label, i in labels_index.items():
        print("\t%s ==> %f" % (label, predictions[t][i]))
This gives:
Prediction for "hello, rejoins moi sur skype":
    website ==> 0.759483
    money ==> 0.037091
    under ==> 0.010587
    camsite ==> 0.114436
    email ==> 0.075975
    abuse ==> 0.002428
Prediction for "bonjour comment ça va ?":
    website ==> 0.433079
    money ==> 0.084878
    under ==> 0.048375
    camsite ==> 0.036674
    email ==> 0.369197
    abuse ==> 0.027798
Prediction for "tu me donnes de l'argent":
    website ==> 0.006223
    money ==> 0.095308
    under ==> 0.003586
    camsite ==> 0.003115
    email ==> 0.884112
    abuse ==> 0.007655
Answer by Fedor Petrov
It is possible to save a list of labels directly in the Keras model. This way, a user who uses the model for predictions and has no other source of information can perform the lookup themselves. Here is a dummy example of how to perform such an "injection" of labels:
import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense, Lambda

# assume we get the labels as a list
labels = ["cat", "dog", "horse", "tomato"]
# build a model with a 299x299 input image and one output layer
xx = Input(shape=(299, 299, 3))
flat = Flatten()(xx)
output = Dense(4)(flat)
# inject the labels: a 1 x 4 string constant, repeated once per sample in the batch
tf_labels = tf.constant([labels], dtype=tf.string)
output_labels = Lambda(lambda x: tf.tile(tf_labels, [tf.shape(x)[0], 1]), name="label_injection")(xx)
# and finally create the model
model = tf.keras.Model(xx, [output, output_labels])
When used for prediction, this model returns a tensor of scores and a tensor of string labels. A model like this can be saved to h5, in which case the file contains the labels. The model can also be exported to saved_model and used for serving in the cloud.
Answer by Peter
To map predicted classes to filenames using ImageDataGenerator, I use:
# Data generator and prediction
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    inputpath,
    target_size=(150, 150),
    batch_size=20,
    class_mode='categorical',
    shuffle=False)
# Note: recent Keras versions deprecate predict_generator; model.predict accepts the generator directly
pred = model.predict_generator(test_generator, steps=len(test_generator), verbose=0)
# Get classes by max element in np (as a list)
classes = list(np.argmax(pred, axis=1))
# Get filenames (setting shuffle=False in the generator is important)
filenames = test_generator.filenames
I can loop over the predicted classes and the associated filenames using:
for f in zip(classes, filenames):
    ...
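One possible loop body pairs each file with its predicted label name and flags mismatches. This is only a sketch with made-up stand-in data: class_indices, filenames and classes imitate test_generator.class_indices, test_generator.filenames and the argmax result above. It also assumes that the first path component of each filename is the class folder, which is how paths relative to the input directory look when each class has its own subfolder.

```python
# Stand-ins (assumed values) for the generator attributes and predictions.
class_indices = {"cats": 0, "dogs": 1}
filenames = ["cats/001.jpg", "cats/002.jpg", "dogs/001.jpg"]
classes = [0, 1, 1]

# Invert label -> index into index -> label for the lookup.
index_to_label = {index: label for label, index in class_indices.items()}

report = []
for class_index, filename in zip(classes, filenames):
    # The class folder is the first path component (handles both separators).
    true_label = filename.replace("\\", "/").split("/")[0]
    predicted = index_to_label[class_index]
    report.append((filename, predicted, predicted == true_label))

for filename, predicted, is_correct in report:
    print(filename, predicted, "ok" if is_correct else "MISMATCH")
```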