Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/38886512/

Time: 2020-08-19 21:35:27  Source: igfitidea

How to deal with "divide by zero" with pandas dataframes when manipulating columns?

Tags: python, python-3.x, pandas, dataframe

Asked by ShanZhengYang

I'm working with hundreds of pandas dataframes. A typical dataframe is as follows:

import pandas as pd
import numpy as np
data = 'filename.csv'
df = pd.read_csv(data)  # load the CSV file into a dataframe
df

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569 -0.494929  1.071804    bar  False
....

There are certain operations where I divide the values of one column by those of another, e.g.

df['one']/df['two'] 

However, there are times when I am dividing by zero, or when both the numerator and the denominator are zero:

df['one'] = 0
df['two'] = 0

Naturally, this outputs the error:

ZeroDivisionError: division by zero

I would prefer for 0/0 to actually mean "there's nothing here", as this is often what such a zero means in a dataframe.

(a) How would I code this so that "divide by zero" yields 0?

(b) How would I code this to "pass" if divide by zero is encountered?

Accepted answer by vielmetti

Two approaches to consider:

Prepare your data so that it never has a divide-by-zero situation, by explicitly coding a "no data" value and testing for that.

Wrap each division that might result in an error in a try/except pair, as described at https://wiki.python.org/moin/HandlingExceptions (which has a divide-by-zero example to use):

(x, y) = (5, 0)
try:
    z = x / y
except ZeroDivisionError:
    print("divide by zero")

I worry about the situation where your data includes a zero that's really a zero (and not a missing value).
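
The first approach could look something like the sketch below. It is only an illustration: the sample values and the column names `ratio` and `has_ratio` are made up, and it assumes a zero in the denominator really does mean "no data". If some zeros are genuine zeros, as the caveat above warns, this recoding would hide them.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; a zero in "two" is assumed to mean "no data".
df = pd.DataFrame({"one": [1.0, 0.0, 3.0], "two": [2.0, 0.0, 0.0]})

# Recode zeros in the denominator as NaN (the explicit "no data" value)
# before dividing, so the division never sees a zero.
denominator = df["two"].replace(0, np.nan)
df["ratio"] = df["one"] / denominator

# Test for the "no data" marker afterwards instead of for inf or errors.
df["has_ratio"] = df["ratio"].notna()
print(df)
```

Rows whose denominator was zero end up with NaN in `ratio`, which pandas skips in most aggregations by default.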

Answer by Alexander

It would probably be more useful to use a dataframe that actually has zero in the denominator (see the last row of column two).

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569  0.000000  1.071804    bar  False

>>> df.one / df.two
a   -1.658442
b    0.761639
c   -0.936904
d    0.099237
e   -0.114159
f        -inf  # <<< Note division by zero
dtype: float64

When one of the values is zero, you should get inf or -inf in the result. One way to convert these values is as follows:

df['result'] = df.one.div(df.two)

df.loc[~np.isfinite(df['result']), 'result'] = np.nan  # Or = 0 per part a) of question.
# or df.loc[np.isinf(df['result']), ...

>>> df
        one       two     three   four   five    result
a  0.469112 -0.282863 -1.509059    bar   True -1.658442
b  0.932424  1.224234  7.823421    bar  False  0.761639
c -1.135632  1.212112 -0.173215    bar  False -0.936904
d  0.232424  2.342112  0.982342  unbar   True  0.099237
e  0.119209 -1.044236 -0.861849    bar   True -0.114159
f -2.104569  0.000000  1.071804    bar  False       NaN
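
As a variation on the same idea, here is a small sketch that uses `Series.replace` and `fillna` instead of boolean indexing. The sample frame and the column names `result_nan` and `result_zero` are invented for illustration; it also includes a 0/0 row, which comes out as NaN rather than inf.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: a normal row, a nonzero/0 row, and a 0/0 row.
df = pd.DataFrame({"one": [0.469112, -2.104569, 0.0],
                   "two": [-0.282863, 0.0, 0.0]})

ratio = df["one"].div(df["two"])  # nonzero/0 gives +/-inf, 0/0 gives NaN

# Treat division by zero as "nothing here": turn +/-inf into NaN.
df["result_nan"] = ratio.replace([np.inf, -np.inf], np.nan)

# Part (a) of the question: treat division by zero as 0 instead.
df["result_zero"] = df["result_nan"].fillna(0)
print(df)
```

Note that `fillna(0)` also fills NaN values that were already present in the inputs, so this only fits data where that is acceptable.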

Answer by Kartik

df['one'].divide(df['two'])


Code:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(5,2), columns=list('ab'))
df.loc[[1,3], 'b'] = 0
print(df)

print(df['a'].divide(df['b']))

Result:

    a           b
0   0.517925    0.305973
1   0.900899    0.000000
2   0.414219    0.781512
3   0.516072    0.000000
4   0.841636    0.166157

0    1.692717
1         inf
2    0.530023
3         inf
4    5.065297
dtype: float64
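
If the inf values in the output above are not wanted, one option (not part of the original answer) is to mask on the denominator with `Series.where`. A minimal sketch with made-up values, which also shows how the sign of the numerator and the 0/0 case behave:

```python
import pandas as pd

# Hypothetical data: positive/0, negative/0, 0/0, and an ordinary row.
df = pd.DataFrame({"a": [1.0, -1.0, 0.0, 3.0],
                   "b": [0.0, 0.0, 0.0, 2.0]})

raw = df["a"].divide(df["b"])  # inf, -inf, NaN, 1.5

# Keep the quotient only where the denominator is nonzero.
masked = raw.where(df["b"] != 0)       # NaN wherever b == 0
zeroed = raw.where(df["b"] != 0, 0.0)  # 0 wherever b == 0, per part (a)

print(pd.DataFrame({"raw": raw, "masked": masked, "zeroed": zeroed}))
```

Masking on the denominator, rather than on the result, handles inf, -inf, and the 0/0 NaN in one step.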

Answer by Christian

You can always use a try statement:

try:
    z = var1 / var2
except ZeroDivisionError:
    print("0")  # Python 3 requires parentheses for print

OR...

You can also do:

if var2 == 0:
    print("0")
else:
    var3 = var1 / var2

Hope this helped! Choose whichever choice you desire (they're both the same anyways).
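
For reuse, the try/except idea can be wrapped in a small helper. This is just a sketch: `safe_divide` and its `default` parameter are hypothetical names, not anything from pandas. It applies to plain Python numbers, which raise `ZeroDivisionError`; float columns in pandas give inf or NaN instead, as the outputs in the other answers show.

```python
def safe_divide(numerator, denominator, default=0.0):
    """Return numerator / denominator, or `default` if the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Plain Python ints and floats raise here when denominator == 0.
        return default


print(safe_divide(1, 4))                # ordinary division
print(safe_divide(1, 0))                # falls back to the default of 0.0
print(safe_divide(0, 0, default=None))  # caller-chosen "nothing here" marker
```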

Answer by Merlin

Try this:

df['one']/(df['two'] +.000000001)
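
A quick sketch of what this does in practice, using made-up values and a hypothetical `EPSILON` constant. The offset avoids inf, but a nonzero numerator over a zero denominator becomes a very large finite number rather than a "nothing here" marker, and every other quotient is shifted very slightly.

```python
import pandas as pd

# Hypothetical data: nonzero/0, 0/0, and an ordinary row.
df = pd.DataFrame({"one": [1.0, 0.0, 2.0], "two": [0.0, 0.0, 4.0]})

EPSILON = 1e-9  # same offset as above, written in scientific notation

result = df["one"] / (df["two"] + EPSILON)
print(result)
# Row 0 is a huge finite value (roughly 1 / EPSILON), not inf.
# Row 1 is exactly 0.0, because 0 divided by EPSILON is 0.
# Row 2 is very close to, but not exactly, 0.5.
```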