Python: getting class labels from a Keras functional model

Question

Asked by Ledzz

I have a functional model in Keras (ResNet50 from the repo examples). I trained it with ImageDataGenerator and flow_from_directory data and saved the model to an .h5 file. When I call model.predict I get an array of class probabilities, but I want to associate them with class labels (in my case, folder names). How can I get them? I found that I could use model.predict_classes and model.predict_proba, but these functions exist only on the Sequential model, not on the functional one.

Answer 1

Answered by Emilia Apostolova

y_prob = model.predict(x) 
y_classes = y_prob.argmax(axis=-1)

As suggested here.

Answer 2

Answered by Lokesh Kumar

When one uses flow_from_directory, the problem is how to interpret the probability outputs: that is, how to map each probability to a class label, since the way flow_from_directory assigns the one-hot indices is not known in advance.

We can get a dictionary that maps each class label to its index in the prediction vector from the generator's class_indices attribute:

generator = train_datagen.flow_from_directory("train", batch_size=batch_size)
label_map = generator.class_indices

The label_map variable is a dictionary like this:

{'class_14': 5, 'class_10': 1, 'class_11': 2, 'class_12': 3, 'class_13': 4, 'class_2': 6, 'class_3': 7, 'class_1': 0, 'class_6': 10, 'class_7': 11, 'class_4': 8, 'class_5': 9, 'class_8': 12, 'class_9': 13}

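From this dictionary, the relation between the probability scores and the class names can be derived: the score at position i of a prediction belongs to the class whose value in label_map is i.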
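
As a minimal sketch of that lookup, the dictionary can be inverted so that an index leads back to a class name. The values of label_map and y_prob below are hypothetical stand-ins; in practice they would come from generator.class_indices and model.predict(x). Plain lists are used to keep the sketch self-contained; with a NumPy array, y_prob.argmax(axis=-1) gives the same indices.

```python
# Hypothetical stand-ins: in practice label_map = generator.class_indices
# and y_prob = model.predict(x).
label_map = {"cat": 0, "dog": 1, "horse": 2}
y_prob = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
]

# Invert the mapping: index -> class name
index_to_label = {index: name for name, index in label_map.items()}

# For each row, take the index of the highest probability and look up its name
predicted_labels = [
    index_to_label[max(range(len(row)), key=row.__getitem__)] for row in y_prob
]
print(predicted_labels)
```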

Basically, you can also build this dictionary yourself with the following code:

from glob import glob
# Run this inside the directory that holds one subfolder per class
class_names = glob("*")  # lists every entry in the current directory
class_names = sorted(class_names)  # sort the names alphabetically
name_id_map = dict(zip(class_names, range(len(class_names))))

The variable name_id_map in the above code contains the same dictionary as the one obtained from the class_indices attribute of the flow_from_directory generator.

Hope this helps!

Answer 3

Answered by Bohumir Zamecnik

UPDATE: This is no longer valid for newer Keras versions. Please use argmax() as in the answer from Emilia Apostolova.

The functional API models have just the predict() function, which for classification returns the class probabilities. You can then select the most probable classes using the probas_to_classes() utility function. Example:

y_proba = model.predict(x)
y_classes = keras.utils.np_utils.probas_to_classes(y_proba)

This is equivalent to model.predict_classes(x) on the Sequential model.

The reason for this is that the functional API supports a more general class of tasks, for which predict_classes() would not make sense.

More info: https://github.com/fchollet/keras/issues/2524

Answer 4

Answered by Hemerson Tacon

In addition to @Emilia Apostolova's answer: to get the ground-truth labels from

generator = train_datagen.flow_from_directory("train", batch_size=batch_size)

just call

y_true_labels = generator.classes
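
One use of these ground-truth indices is to compare them with the predicted indices. The sketch below uses hypothetical stand-in lists; in practice y_true_labels would be generator.classes and y_pred_labels the argmax of model.predict(...). It assumes the generator was created with shuffle=False, since generator.classes follows file order and the predictions only line up with it when the batches are not shuffled.

```python
# Hypothetical stand-ins for generator.classes and the argmax of the predictions
y_true_labels = [0, 0, 1, 2, 2]
y_pred_labels = [0, 1, 1, 2, 0]

# Count positions where the predicted index equals the ground-truth index
correct = sum(t == p for t, p in zip(y_true_labels, y_pred_labels))
accuracy = correct / len(y_true_labels)
print("accuracy: %.2f" % accuracy)
```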

Answer 5

Answered by Thomas Decaux

You must use the label index you already have. Here is what I do for text classification:

import numpy as np
from keras.utils import to_categorical

# data labels = [1, 2, 1...]
labels_index = { "website" : 0, "money" : 1 ....}
# one-hot encode the labels to feed the model
label_categories = to_categorical(np.asarray(labels))

Then, for predictions:

texts = ["hello, rejoins moi sur skype", "bonjour comment ça va ?", "tu me donnes de l'argent"]

sequences = tokenizer.texts_to_sequences(texts)

data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)

predictions = model.predict(data)

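for text, probs in zip(texts, predictions):
    print("Prediction for \"%s\": " % (text))
    # look up each score by the index stored in labels_index
    # instead of relying on the dictionary's iteration order
    for label, i in labels_index.items():
        print("\t%s ==> %f" % (label, probs[i]))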

This gives:

Prediction for "hello, rejoins moi sur skype": 
    website ==> 0.759483
    money ==> 0.037091
    under ==> 0.010587
    camsite ==> 0.114436
    email ==> 0.075975
    abuse ==> 0.002428
Prediction for "bonjour comment ça va ?": 
    website ==> 0.433079
    money ==> 0.084878
    under ==> 0.048375
    camsite ==> 0.036674
    email ==> 0.369197
    abuse ==> 0.027798
Prediction for "tu me donnes de l'argent": 
    website ==> 0.006223
    money ==> 0.095308
    under ==> 0.003586
    camsite ==> 0.003115
    email ==> 0.884112
    abuse ==> 0.007655

Answer 6

Answered by Fedor Petrov

It is possible to save a "list" of labels directly in the Keras model. This way, a user who uses the model for predictions and has no other source of information can perform the lookup themselves. Here is a dummy example of how one can perform an "injection" of labels:

import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense, Lambda

# assume we get labels as a list
labels = ["cat", "dog", "horse", "tomato"]
# here we start building our model with a 299x299 input image and one output layer
xx = Input(shape=(299, 299, 3))
flat = Flatten()(xx)
output = Dense(4)(flat)

# here we perform the injection of labels:
# the row of labels is repeated once per sample in the batch
def inject_labels(x):
    tf_labels = tf.constant([labels], dtype=tf.string)
    return tf.tile(tf_labels, [tf.shape(x)[0], 1])

output_labels = Lambda(inject_labels, name="label_injection")(xx)
# and finally create the model
model = tf.keras.Model(xx, [output, output_labels])

When used for prediction, this model returns a tensor of scores and a tensor of string labels. A model like this can be saved to h5, in which case the file contains the labels. The model can also be exported to saved_model and used for serving in the cloud.

Answer 7

Answered by Peter

To map predicted classes and filenames using ImageDataGenerator, I use:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

# Data generator and prediction
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
        inputpath,
        target_size=(150, 150),
        batch_size=20,
        class_mode='categorical',
        shuffle=False)
pred = model.predict_generator(test_generator, steps=len(test_generator), verbose=0)
# Get classes as the index of the max element in each row (as a list)
classes = list(np.argmax(pred, axis=1))
# Get filenames (shuffle=False in the generator is important so they stay aligned with pred)
filenames = test_generator.filenames

I can loop over predicted classes and the associated filename using:

for f in zip(classes, filenames):
    ...
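
One possible way to fill in that loop is to turn each predicted index into its folder name through the generator's class_indices dictionary. The sketch below uses hypothetical stand-ins for test_generator.class_indices, test_generator.filenames and the classes list, so the folder and file names are assumptions for illustration only.

```python
# Hypothetical stand-ins for test_generator.class_indices,
# test_generator.filenames and the argmax of the predictions
class_indices = {"cats": 0, "dogs": 1}
filenames = ["cats/001.jpg", "cats/002.jpg", "dogs/001.jpg"]
classes = [0, 1, 1]

# Invert the mapping: index -> folder (class) name
index_to_label = {index: name for name, index in class_indices.items()}

# Pair every file with the name of its predicted class
results = [(filename, index_to_label[c]) for c, filename in zip(classes, filenames)]
for filename, label in results:
    print("%s -> %s" % (filename, label))
```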
