Python: Unbalanced data and weighted cross entropy
Original question: http://stackoverflow.com/questions/44560549/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must attribute it to the original authors (not me): StackOverflow
Unbalanced data and weighted cross entropy
Question by Sergiodiaz53
I'm trying to train a network with unbalanced data. I have A (198 samples), B (436 samples), C (710 samples), and D (272 samples). I have read about "weighted_cross_entropy_with_logits", but all the examples I found are for binary classification, so I'm not very confident about how to set those weights.
Total samples: 1616
A_weight: 198/1616 = 0.12?
The idea behind this, if I understood correctly, is to penalize the errors of the majority class and value more positively the hits in the minority one, right?
My piece of code:
weights = tf.constant([0.12, 0.26, 0.43, 0.17])
cost = tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(logits=pred, targets=y, pos_weight=weights))
I have read this one and other examples with binary classification, but it is still not very clear.
Thanks in advance.
Answer by P-Gn
Note that weighted_cross_entropy_with_logits is the weighted variant of sigmoid_cross_entropy_with_logits. Sigmoid cross entropy is typically used for binary classification. Yes, it can handle multiple labels, but sigmoid cross entropy basically makes a (binary) decision on each of them -- for example, for a face recognition net, those (not mutually exclusive) labels could be "Does the subject wear glasses?", "Is the subject female?", etc.
In binary classification(s), each output channel corresponds to a binary (soft) decision. Therefore, the weighting needs to happen within the computation of the loss. This is what weighted_cross_entropy_with_logits does, by weighting one term of the cross-entropy over the other.
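As a minimal sketch of that binary / multi-label case (not part of the original answer; the logits, targets, and per-label pos_weight values below are made up for illustration), each label channel can have the positive term of its loss up-weighted independently:
import numpy as np
import tensorflow as tf
np.random.seed(0)
# a hypothetical batch of 4 samples, each with 3 independent binary labels
logits = tf.constant(np.random.randn(4, 3), dtype=tf.float32)
targets = tf.constant(np.random.randint(0, 2, (4, 3)), dtype=tf.float32)
# one pos_weight per label: values > 1 up-weight the positive term of that label's loss
pos_weight = tf.constant([1.0, 2.0, 0.5])
# element-wise weighted sigmoid cross entropy, shape (4, 3)
losses = tf.nn.weighted_cross_entropy_with_logits(targets=targets, logits=logits,
                                                  pos_weight=pos_weight)
loss = tf.reduce_mean(losses)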
In mutually exclusive multilabel classification, we use softmax_cross_entropy_with_logits, which behaves differently: each output channel corresponds to the score of a class candidate. The decision comes afterwards, by comparing the respective outputs of each channel.
Weighting in before the final decision is therefore simply a matter of modifying the scores before comparing them, typically by multiplication with weights. For example, for a ternary classification task:
# your class weights
class_weights = tf.constant([[1.0, 2.0, 3.0]])
# deduce weights for batch samples based on their true label
weights = tf.reduce_sum(class_weights * onehot_labels, axis=1)
# compute your (unweighted) softmax cross entropy loss
unweighted_losses = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
# apply the weights, relying on broadcasting of the multiplication
weighted_losses = unweighted_losses * weights
# reduce the result to get your final loss
loss = tf.reduce_mean(weighted_losses)
You could also rely on tf.losses.softmax_cross_entropy to handle the last three steps.
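A sketch of that shortcut, assuming the same class_weights, onehot_labels, and logits as in the snippet above (not part of the original answer):
# per-sample weights, deduced from the true labels as before
weights = tf.reduce_sum(class_weights * onehot_labels, axis=1)
# computes the softmax cross entropy, applies the per-sample weights and reduces, in one call
loss = tf.losses.softmax_cross_entropy(onehot_labels, logits, weights=weights)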
In your case, where you need to tackle data imbalance, the class weights could indeed be inversely proportional to their frequency in your train data. Normalizing them so that they sum up to one or to the number of classes also makes sense.
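For instance, with the sample counts from the question, inverse-frequency weights normalized to sum to the number of classes could be computed along these lines (a sketch, not part of the original answer):
import numpy as np
import tensorflow as tf
counts = np.array([198.0, 436.0, 710.0, 272.0])              # samples per class A, B, C, D
inv_freq = counts.sum() / counts                              # inverse frequency of each class
normalized = inv_freq / inv_freq.sum() * len(counts)          # rescale so the weights sum to 4
class_weights = tf.constant([normalized], dtype=tf.float32)   # shape (1, 4), as in the snippet above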
Note that in the above, we penalized the loss based on the true label of the samples. We could also have penalized the loss based on the estimated labels by simply defining
weights = class_weights
and the rest of the code need not change thanks to broadcasting magic.
In the general case, you would want weights that depend on the kind of error you make. In other words, for each pair of labels X and Y, you could choose how to penalize choosing label X when the true label is Y. You end up with a whole prior weight matrix, which results in weights above being a full (num_samples, num_classes) tensor. This goes a bit beyond what you want, but it might be useful to know nonetheless that only your definition of the weight tensor needs to change in the code above.
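As an illustration of what such a weight tensor might look like (not part of the original answer; the cost values and labels below are made up), row y of a (num_classes, num_classes) cost matrix holds the penalty for each predicted class when the true class is y, and gathering those rows by the true labels yields the (num_samples, num_classes) weights mentioned above:
# hypothetical 3x3 cost matrix: entry [y, x] is the penalty for predicting class x when the truth is y
error_weights = tf.constant([[1.0, 2.0, 4.0],
                             [2.0, 1.0, 2.0],
                             [4.0, 2.0, 1.0]])
# integer true labels, shape (num_samples,)
labels = tf.constant([0, 2, 1, 0])
# one row of penalties per sample -> shape (num_samples, num_classes)
weights = tf.gather(error_weights, labels)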
Answer by DankMasterDan
See this answer for an alternate solution which works with sparse_softmax_cross_entropy:
import tensorflow as tf
import numpy as np
np.random.seed(123)
sess = tf.InteractiveSession()
# let's say we have the logits and labels of a batch of size 6 with 5 classes
logits = tf.constant(np.random.randint(0, 10, 30).reshape(6, 5), dtype=tf.float32)
labels = tf.constant(np.random.randint(0, 5, 6), dtype=tf.int32)
# specify some class weightings
class_weights = tf.constant([0.3, 0.1, 0.2, 0.3, 0.1])
# specify the weights for each sample in the batch (without having to compute the onehot label matrix)
weights = tf.gather(class_weights, labels)
# compute the loss
tf.losses.sparse_softmax_cross_entropy(labels, logits, weights).eval()
Answer by Tensorflow Support
Tensorflow 2.0 Compatible Answer: Migrating the Code specified in P-Gn's Answer to 2.0, for the benefit of the community.
# your class weights
class_weights = tf.compat.v2.constant([[1.0, 2.0, 3.0]])
# deduce weights for batch samples based on their true label
weights = tf.compat.v2.reduce_sum(class_weights * onehot_labels, axis=1)
# compute your (unweighted) softmax cross entropy loss
unweighted_losses = tf.compat.v2.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
# apply the weights, relying on broadcasting of the multiplication
weighted_losses = unweighted_losses * weights
# reduce the result to get your final loss
loss = tf.reduce_mean(weighted_losses)
For more information about migrating code from Tensorflow version 1.x to 2.x, please refer to this Migration Guide.