Handling zeros in pandas DataFrame column divisions in Python

Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/16244180/

python numpy pandas dataframe

Asked by Jeff

What's the best way to handle zero denominators when dividing pandas DataFrame columns by each other in Python? For example:

df = pandas.DataFrame({"a": [1, 2, 0, 1, 5], "b": [0, 10, 20, 30, 50]})
df.a / df.b  # yields error

I'd like the ratios where the denominator is zero to be registered as NA (numpy.nan). How can this be done efficiently in pandas?

Casting to float64 does not work at the column level:

In [29]: df
Out[29]: 
   a   b
0  1   0
1  2  10
2  0  20
3  1  30
4  5  50

In [30]: df["a"].astype("float64") / df["b"].astype("float64")
...

FloatingPointError: divide by zero encountered in divide

How can I do this just for particular columns rather than the entire df?

Answered by Jeff

You need to work in floats; otherwise you will get integer division, which is probably not what you want.

In [12]: df = pandas.DataFrame({"a": [1, 2, 0, 1, 5], 
                                "b": [0, 10, 20, 30, 50]}).astype('float64')

In [13]: df
Out[13]: 
   a   b
0  1   0
1  2  10
2  0  20
3  1  30
4  5  50

In [14]: df.dtypes
Out[14]: 
a    float64
b    float64
dtype: object
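
If only some columns should be converted, `astype` also accepts a mapping from column name to dtype. The following is a minimal sketch, assuming a reasonably recent pandas version; the extra column `c` is hypothetical and is only there to show that it is left untouched.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 0, 1, 5],
                   "b": [0, 10, 20, 30, 50],
                   "c": [1, 2, 3, 4, 5]})  # "c" is a hypothetical extra column

# Cast only the columns that take part in the division;
# "c" keeps its integer dtype.
df = df.astype({"a": "float64", "b": "float64"})
print(df.dtypes)
```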

Here's one way:

In [15]: x = df.a/df.b

In [16]: x
Out[16]: 
0         inf
1    0.200000
2    0.000000
3    0.033333
4    0.100000
dtype: float64

In [17]: x[np.isinf(x)] = np.nan

In [18]: x
Out[18]: 
0         NaN
1    0.200000
2    0.000000
3    0.033333
4    0.100000
dtype: float64
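
The same idea can be written as a standalone script. This sketch swaps the in-place boolean assignment for `Series.replace`, which returns a new Series and also covers negative infinity; that variation is an assumption about what is wanted, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 0, 1, 5],
                   "b": [0, 10, 20, 30, 50]}).astype("float64")

# Float division by zero produces inf (or -inf) instead of raising.
x = df["a"] / df["b"]

# Turn every infinite ratio into NaN without modifying x in place.
x = x.replace([np.inf, -np.inf], np.nan)
print(x)
```

A 0/0 division already yields NaN, so only the infinite values need replacing.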

Here's another way:

In [20]: df.a/df.b.replace({ 0 : np.nan })
Out[20]: 
0         NaN
1    0.200000
2    0.000000
3    0.033333
4    0.100000
dtype: float64
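
Because the replacement is applied to a single Series, this approach works for particular columns without touching the rest of the DataFrame. A minimal sketch, in which the result column name `ratio` is a hypothetical choice for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 0, 1, 5], "b": [0, 10, 20, 30, 50]})

# Mask zero denominators in column "b" only, then divide.
# Any ratio with a zero denominator comes out as NaN.
df["ratio"] = df["a"] / df["b"].replace(0, np.nan)
print(df)
```

The original columns `a` and `b` keep their values and dtypes; only the temporary denominator Series has its zeros replaced.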