Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/50773107/
Time: 2020-09-14 05:40:23  Source: igfitidea

How to replace infinite values with the maximum value of a pandas column?

Tags: python, pandas, replace, infinite

Asked by Moses Soleman

I have a dataframe that looks like this:

City   Crime_Rate
A      10
B      20
C      inf
D      15

I want to replace the inf with the maximum value of the Crime_Rate column, so that the resulting dataframe looks like this:

City   Crime_Rate
A      10
B      20
C      20
D      15

I tried:

df['Crime_Rate'].replace([np.inf],max(df['Crime_Rate']),inplace=True)

But Python takes inf as the maximum value. Where am I going wrong here?

Answered by jezrael

Filter out the inf values first, and then get the max of the Series:

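# Maximum of the rows that are not inf
m = df.loc[df['Crime_Rate'] != np.inf, 'Crime_Rate'].max()
# Assign the result back; inplace=True on a selected column may not update df in recent pandas
df['Crime_Rate'] = df['Crime_Rate'].replace(np.inf, m)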

Another solution:

mask = df['Crime_Rate'] != np.inf
df.loc[~mask, 'Crime_Rate'] = df.loc[mask, 'Crime_Rate'].max()

print (df)
  City  Crime_Rate
0    A        10.0
1    B        20.0
2    C        20.0
3    D        15.0
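
The comparison with np.inf above leaves negative infinity and NaN untouched. If those may occur as well, a variation using np.isfinite is one option. This is a minimal sketch and not part of the original answer; the sample frame is rebuilt here from the question's data.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({'City': ['A', 'B', 'C', 'D'],
                   'Crime_Rate': [10, 20, np.inf, 15]})

# True for ordinary numbers, False for inf, -inf and NaN
finite = np.isfinite(df['Crime_Rate'])

# Largest finite value in the column
finite_max = df.loc[finite, 'Crime_Rate'].max()

# Overwrite only positive infinity; -inf and NaN are left as they are
df.loc[df['Crime_Rate'] == np.inf, 'Crime_Rate'] = finite_max

print(df)
```

Taking the maximum over the finite rows only means a stray NaN or -inf cannot influence the replacement value.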

Answered by Bharath

Set use_inf_as_na to True and then use fillna. (Use this if you want to treat both inf and NaN as missing values.) For example:

pd.options.mode.use_inf_as_na = True

# Assign the result back instead of relying on inplace=True on a selected column
df['Crime_Rate'] = df['Crime_Rate'].fillna(df['Crime_Rate'].max())

   City  Crime_Rate
0    A        10.0
1    B        20.0
2    C        20.0
3    D        15.0
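
The use_inf_as_na option is a global setting, and as far as I know it was deprecated in pandas 2.1, so it may not be available in newer versions. A sketch of the same idea without the option, assuming the question's data: convert inf to NaN explicitly, then fill with the maximum, which skips NaN by default.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({'City': ['A', 'B', 'C', 'D'],
                   'Crime_Rate': [10, 20, np.inf, 15]})

# Turn +inf and -inf into NaN so that max() and fillna() treat them as missing
cleaned = df['Crime_Rate'].replace([np.inf, -np.inf], np.nan)

# Series.max() skips NaN by default, so this is the largest finite value
df['Crime_Rate'] = cleaned.fillna(cleaned.max())

print(df)
```

As with the option-based version, any NaN that was already in the column is filled with the maximum too.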

Answered by dmeu

Here is a solution for a whole matrix/data frame:

# Largest column maximum among the columns whose own maximum is finite
highest_non_inf = df.max().loc[lambda v: v < np.inf].max()
# replace() returns a new DataFrame, so assign the result (np.Inf was removed in NumPy 2.0; use np.inf)
df = df.replace(np.inf, highest_non_inf)
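
The snippet above uses a single replacement value for the whole frame. If each column should instead receive its own maximum, a per-column variant could look like the following sketch. The column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric frame
df = pd.DataFrame({'x': [1.0, np.inf, 3.0],
                   'y': [np.inf, 50.0, 7.0]})

# Hide inf as NaN; max() then skips it and returns each column's finite maximum
without_inf = df.replace(np.inf, np.nan)
col_max = without_inf.max()

# fillna() with a Series fills each column using the value under its own label
# (any NaN that was already present would be filled as well)
result = without_inf.fillna(col_max)

print(result)
```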

Answered by Ravijeet

One way to do it is to call replace(np.inf, np.nan) on the column before taking the maximum.

This replaces inf with NaN only for the maximum computation, so the maximum comes out as the expected value rather than inf.

Example below: the maximum value is 100, and it replaces inf.

#Create dummy data frame
import pandas as pd 
import numpy as np  
a = float('Inf')
v = [1,2,5,a,10,5,a,5,100,2]  
df = pd.DataFrame({'Col_A': v})
#Data frame looks like this
In [33]: df
Out[33]: 
        Col_A
0    1.000000
1    2.000000
2    5.000000
3         inf
4   10.000000
5    5.000000
6         inf
7    5.000000
8  100.000000
9    2.000000

# Replace inf with the largest finite value (Series.max() skips NaN)
df['Col_A'] = df['Col_A'].replace(np.inf, df['Col_A'].replace(np.inf, np.nan).max())

In[35]: df
Out[35]: 
   Col_A
0    1.0
1    2.0
2    5.0
3  100.0
4   10.0
5    5.0
6  100.0
7    5.0
8  100.0
9    2.0

Hope that works!
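
One caveat worth checking: Python's built-in max() does not skip NaN the way Series.max() does, so its result can depend on where the NaN sits in the data. A small sketch with made-up values:

```python
import math
import numpy as np
import pandas as pd

# Made-up values with inf in the first position
s = pd.Series([np.inf, 1.0, 100.0]).replace(np.inf, np.nan)

# Built-in max() starts from the first element; comparisons with NaN are
# always False, so the NaN is never displaced
builtin_result = max(s)

# Series.max() skips NaN by default
method_result = s.max()

print(math.isnan(builtin_result), method_result)
```

For that reason the Series.max() method is probably the safer choice when NaN may be present.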