Python: "Divide by zero encountered in log" when not dividing by zero
Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/36229340/
"Divide by zero encountered in log" when not dividing by zero
Asked by Jobs
When I do:
summing += yval * np.log(sigmoid(np.dot(w.transpose(), xi.transpose()))) + (1-yval)* np.log(1-sigmoid(np.dot(w.transpose(), xi.transpose())))
where there is no division, why do I get a "divide by zero encountered in log" error? As a result, summing becomes [nan].
Answered by DevShark
That's the warning you get when you try to evaluate log with 0:
>>> import numpy as np
>>> np.log(0)
__main__:1: RuntimeWarning: divide by zero encountered in log
I agree it's not very clear.
So in your case, I would check why your input to log is 0.
PS: this is on numpy 1.10.4
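One way to act on that advice, as a minimal sketch (the sigmoid helper and the values of w and xi below are made up for illustration, they are not from the question):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical stand-ins for the question's w and xi
w = np.array([[5.0], [30.0]])
xi = np.array([[2.0, 1.0]])

p = sigmoid(np.dot(w.transpose(), xi.transpose()))
# Either condition below means one of the np.log calls will see 0
print(np.any(p == 0), np.any(1 - p == 0))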
Answered by Seth
I had this same problem. It looks like you're trying to do logistic regression. I was doing MULTI-CLASS Classification with logistic regression. But you need to solve this problem using the ONE VS ALL approach (google for details).
If you don't set your yval variable so that it only has '1' and '0' instead of yval = [1,2,3,4,...] etc., then you will get negative costs which lead to runaway theta and then lead to you reaching the limit of log(y) where y is close to zero.
The fix should be to pre-treat your yval variable so that it only has '1' and '0' for positive and negative examples.
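A minimal sketch of that pre-treatment for one classifier in a one-vs-all setup (the label values and the chosen positive class below are made up for illustration):

import numpy as np

yval = np.array([1, 2, 3, 4, 2, 1])                  # hypothetical multi-class labels
positive_class = 2
yval_binary = (yval == positive_class).astype(int)   # 1 for the chosen class, 0 for everything else
print(yval_binary)                                   # [0 1 0 0 1 0]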
Answered by Mayur
Even though it's late, this answer might help someone else.
In this part of your code:
... + (1-yval)* np.log(1-sigmoid(np.dot(w.transpose(), xi.transpose())))
maybe the np.dot(w.transpose(), xi.transpose()) call is spitting out larger values (above 40 or so), resulting in the output of sigmoid( ) being 1. And then you're basically taking np.log of 1-1, that is 0. And as DevShark has mentioned above, it causes the RuntimeWarning: divide by zero... error.
How did I come up with the number 40, you might ask? Well, it's just that for values above 40 or so, the sigmoid function in Python (numpy) returns 1.
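A quick check of that claim, as a sketch (the sigmoid helper here is assumed to be the usual 1/(1+exp(-x)); exact output formatting may differ between numpy versions):

>>> import numpy as np
>>> sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
>>> sigmoid(40)
1.0
>>> np.log(1 - sigmoid(40))
__main__:1: RuntimeWarning: divide by zero encountered in log
-inf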
Looking at your implementation, it seems you're dealing with the Logistic Regression algorithm, in which case (I'm under the impression that) feature scaling is very important.
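For reference, one common form of feature scaling, standardization, looks like this (X here is a hypothetical feature matrix, not a variable from the question):

import numpy as np

X = np.array([[2000.0, 3.0], [1500.0, 2.0], [3000.0, 4.0]])  # hypothetical raw features
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)              # zero mean, unit variance per column
# With scaled features, np.dot(w.T, x) is far less likely to reach the ~40 range
# where sigmoid saturates to exactly 1.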
Since I'm writing an answer for the first time, it is possible I may have violated some rules/regulations; if that is the case, I'd like to apologise.
Answered by u1516331
Try to add a very small value, e.g., 1e-7, to the input. For example, the sklearn library has a parameter eps for the log_loss function (a small clipping sketch follows the link below).
https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/discussion/48701
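A minimal sketch of the same clipping idea done by hand (the eps value, function name and example arrays below are illustrative, not from the original post):

import numpy as np

def safe_log_loss(yval, p, eps=1e-7):
    # Keep probabilities away from exactly 0 and 1 so np.log never sees 0
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(yval * np.log(p) + (1 - yval) * np.log(1 - p))

# A fully saturated prediction that would otherwise trigger the warning
print(safe_log_loss(np.array([1, 0]), np.array([1.0, 0.0])))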