Python: How to handle "divide by zero" with pandas dataframes when operating on columns?

Question

Asked by ShanZhengYang

I'm working with hundreds of pandas dataframes. A typical dataframe is as follows:

import pandas as pd
import numpy as np
data = 'filename.csv'
df = pd.read_csv(data)  # read the CSV file into a dataframe
df

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569 -0.494929  1.071804    bar  False
....

In some operations I divide the values of one column by those of another, e.g.

df['one']/df['two']

However, there are times when I am dividing by zero, or when both values are zero:

df['one'] = 0
df['two'] = 0

Naturally, this outputs the error:

ZeroDivisionError: division by zero

I would prefer for 0/0 to actually mean "there's nothing here", as this is often what such a zero means in a dataframe.

(a) How would I code this so that a "divide by zero" yields 0?

(b) How would I code this to "pass" if divide by zero is encountered?

Answer 1

Accepted answer by vielmetti

Two approaches to consider:

Prepare your data so that it never has a divide-by-zero situation, by explicitly coding a "no data" value and testing for that.

Wrap each division that might result in an error in a try/except pair, as described at https://wiki.python.org/moin/HandlingExceptions (which has a divide-by-zero example to use):

(x, y) = (5, 0)
try:
    z = x / y
except ZeroDivisionError:
    print("divide by zero")  # print is a function in Python 3

I worry about the situation where your data includes a zero that's really a zero (and not a missing value).
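
A minimal sketch of the first approach, assuming (as the question suggests) that a zero in the denominator column means "no data". The sample values below are hypothetical; the idea is to recode those zeros as NaN before dividing, so the division never sees a zero:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the zeros in 'two' stand in for "no data".
df = pd.DataFrame({"one": [1.0, 0.0, 3.0], "two": [2.0, 0.0, 0.0]})

# Recode zero denominators as NaN so the division never divides by zero.
denominator = df["two"].replace(0, np.nan)
ratio = df["one"] / denominator

print(ratio)  # rows with a zero denominator come out as NaN
```

If some zeros are genuine measurements rather than missing values, this recoding would hide them, which is the concern raised above.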

Answer 2

Answer by Alexander

It would probably be more useful to use a dataframe that actually has zero in the denominator (see the last row of column two).

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569  0.000000  1.071804    bar  False

>>> df.one / df.two
a   -1.658442
b    0.761639
c   -0.936904
d    0.099237
e   -0.114159
f        -inf  # <<< Note division by zero
dtype: float64

When one of the denominator values is zero, you should get inf or -inf in the result. One way to convert these values is as follows:

df['result'] = df.one.div(df.two)

df.loc[~np.isfinite(df['result']), 'result'] = np.nan  # Or = 0 per part a) of question.
# or df.loc[np.isinf(df['result']), ...

>>> df
        one       two     three   four   five    result
a  0.469112 -0.282863 -1.509059    bar   True -1.658442
b  0.932424  1.224234  7.823421    bar  False  0.761639
c -1.135632  1.212112 -0.173215    bar  False -0.936904
d  0.232424  2.342112  0.982342  unbar   True  0.099237
e  0.119209 -1.044236 -0.861849    bar   True -0.114159
f -2.104569  0.000000  1.071804    bar  False       NaN
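
An alternative sketch of the same conversion, using `Series.replace` and `Series.fillna` instead of boolean indexing. The sample frame is hypothetical; it also includes a 0/0 row, which pandas evaluates to NaN rather than inf:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one ordinary row, one x/0 row, one 0/0 row.
df = pd.DataFrame({"one": [1.0, -2.0, 0.0], "two": [4.0, 0.0, 0.0]})

raw = df["one"].div(df["two"])  # 0.25, -inf, NaN

# Part (b) of the question: treat division by zero as "nothing here".
as_missing = raw.replace([np.inf, -np.inf], np.nan)

# Part (a) of the question: treat division by zero as 0.
as_zero = as_missing.fillna(0)

print(as_missing)
print(as_zero)
```

Note that `fillna(0)` also zeroes any NaN that came from genuinely missing inputs, so it may be worth applying only when that is acceptable.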

Answer 3

Answer by Kartik

df['one'].divide(df['two'])

Code:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(5,2), columns=list('ab'))
df.loc[[1,3], 'b'] = 0
print(df)

print(df['a'].divide(df['b']))

Result:

    a           b
0   0.517925    0.305973
1   0.900899    0.000000
2   0.414219    0.781512
3   0.516072    0.000000
4   0.841636    0.166157

0    1.692717
1         inf
2    0.530023
3         inf
4    5.065297
dtype: float64
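
If the inf values in the result above are not wanted, one possible sketch is to skip the zero denominators entirely with NumPy's `where=` argument. The helper name `safe_divide` and its `fill` parameter are hypothetical, and the sketch assumes both Series share the same index:

```python
import numpy as np
import pandas as pd

def safe_divide(numerator, denominator, fill=np.nan):
    """Hypothetical helper: divide element-wise, using `fill` where the denominator is 0."""
    num = numerator.to_numpy(dtype=float)
    den = denominator.to_numpy(dtype=float)
    out = np.full_like(num, fill)                  # pre-fill every slot with the fallback
    np.divide(num, den, out=out, where=den != 0)   # divide only where den is non-zero
    return pd.Series(out, index=numerator.index)

a = pd.Series([1.0, 2.0, 0.0])
b = pd.Series([2.0, 0.0, 0.0])
result = safe_divide(a, b, fill=0.0)
print(result)
```

With the default `fill=np.nan` the zero-denominator rows are marked as missing instead of 0.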

Answer 4

Answer by Christian

You can always use a try statement:

try:
    z = var1 / var2
except ZeroDivisionError:
    print("0")  # Python 3 requires parentheses with print

OR...

You can also do:

if var2 == 0:
    # The denominator is zero, so skip the division and report 0
    print("0")
else:
    var3 = var1 / var2

Hope this helped! Choose whichever choice you desire (they're both the same anyways).
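
One caveat worth checking before relying on try/except with dataframes: the exception is raised for plain Python scalars, but dividing numeric pandas Series does not raise it. A small sketch with hypothetical values:

```python
import pandas as pd

ones = pd.Series([1.0, 0.0])
zeros = pd.Series([0.0, 0.0])

try:
    result = ones / zeros   # element-wise division on float Series
    raised = False
except ZeroDivisionError:
    raised = True

print(raised)   # the except branch is not taken for float Series
print(result)   # 1/0 gives inf and 0/0 gives NaN instead of an error
```

So for column-wise division the inf and NaN values have to be handled after the fact, or the zeros masked beforehand.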

Answer 5

Answer by Merlin

Try this:

df['one']/(df['two'] +.000000001)
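
A quick sketch of what this offset does in practice, using hypothetical values. It avoids inf, but a non-zero numerator over a zero denominator becomes a very large number rather than a clearly flagged value, and every other ratio is shifted slightly:

```python
import pandas as pd

# Hypothetical data: x/0, 0/0, and an ordinary row.
df = pd.DataFrame({"one": [1.0, 0.0, 1.0], "two": [0.0, 0.0, 2.0]})

eps = 1e-9  # the small offset used above
approx = df["one"] / (df["two"] + eps)

print(approx)  # first row is huge, second is 0, third is very close to 0.5
```

Whether that trade-off is acceptable depends on how the result is used downstream; a denominator equal to `-eps` would also still divide by zero.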
