How to replace infinite values with the maximum value of a pandas column?
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/50773107/
Asked by Moses Soleman
I have a dataframe which looks like this:
City  Crime_Rate
A     10
B     20
C     inf
D     15
I want to replace the inf with the maximum value of the Crime_Rate column, so that my resulting dataframe looks like this:
City  Crime_Rate
A     10
B     20
C     20
D     15
I tried:
df['Crime_Rate'].replace([np.inf],max(df['Crime_Rate']),inplace=True)
But Python takes inf as the maximum value. Where am I going wrong here?
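To make the problem concrete, here is a minimal sketch that rebuilds the question's dataframe (the construction itself is an assumption; the question only shows the printed table). It shows that the column maximum is inf itself, so replacing inf with that maximum changes nothing.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the dataframe shown in the question
df = pd.DataFrame({'City': ['A', 'B', 'C', 'D'],
                   'Crime_Rate': [10, 20, np.inf, 15]})

# inf compares greater than every finite float, so it wins the max
col_max = df['Crime_Rate'].max()
print(col_max)  # inf

# Replacing inf with that maximum swaps inf for inf: nothing changes
unchanged = df['Crime_Rate'].replace(np.inf, col_max)
print(np.isinf(unchanged).sum())  # 1
```

The maximum therefore has to be computed over the finite values only, which is what the answers do.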
Answered by jezrael
Filter out the inf values first and then get the max of the Series:
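# Largest value among the rows that are not inf
m = df.loc[df['Crime_Rate'] != np.inf, 'Crime_Rate'].max()
# Assign the result back rather than calling replace(..., inplace=True) on a column selection
df['Crime_Rate'] = df['Crime_Rate'].replace(np.inf, m)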
Another solution:
mask = df['Crime_Rate'] != np.inf
df.loc[~mask, 'Crime_Rate'] = df.loc[mask, 'Crime_Rate'].max()
print(df)
  City  Crime_Rate
0    A        10.0
1    B        20.0
2    C        20.0
3    D        15.0
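The same idea can be wrapped in a small helper. This is only a sketch: the function name `replace_inf_with_finite_max` is hypothetical, and it uses `np.isfinite` as the filter, which also excludes -inf and NaN from the maximum (a slight widening of the `!= np.inf` test above).

```python
import numpy as np
import pandas as pd

def replace_inf_with_finite_max(s: pd.Series) -> pd.Series:
    """Return a copy of s with +inf replaced by the largest finite value."""
    finite_max = s[np.isfinite(s)].max()  # ignores inf, -inf and NaN
    return s.replace(np.inf, finite_max)

# Assumed reconstruction of the dataframe shown in the question
df = pd.DataFrame({'City': ['A', 'B', 'C', 'D'],
                   'Crime_Rate': [10, 20, np.inf, 15]})
df['Crime_Rate'] = replace_inf_with_finite_max(df['Crime_Rate'])
print(df['Crime_Rate'].tolist())  # [10.0, 20.0, 20.0, 15.0]
```

Returning a new Series and assigning it back avoids modifying a column selection in place.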
Answered by Bharath
Set the option use_inf_as_na to True and then use fillna. (Use this if you want to treat both inf and nan as missing values.) That is:
pd.options.mode.use_inf_as_na = True
df['Crime_Rate'] = df['Crime_Rate'].fillna(df['Crime_Rate'].max())
  City  Crime_Rate
0    A        10.0
1    B        20.0
2    C        20.0
3    D        15.0
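The global option changes how missing values are detected everywhere in the session, and to my knowledge it is deprecated in recent pandas releases. A sketch of the same "treat inf as missing" idea without the option is below; the variable name `as_missing` is my own, and, like the answer above, it also fills any NaN that was already present.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the dataframe shown in the question
df = pd.DataFrame({'City': ['A', 'B', 'C', 'D'],
                   'Crime_Rate': [10, 20, np.inf, 15]})

# Treat +inf and -inf as missing, locally rather than via a global option
as_missing = df['Crime_Rate'].replace([np.inf, -np.inf], np.nan)

# max() skips NaN by default, so this is the largest remaining value
df['Crime_Rate'] = as_missing.fillna(as_missing.max())
print(df['Crime_Rate'].tolist())  # [10.0, 20.0, 20.0, 15.0]
```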
Answered by dmeu
Here is a solution for a whole matrix/data frame:
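# Largest column maximum among the columns whose maximum is finite
highest_non_inf = df.max().loc[lambda v: v < np.inf].max()
# replace() returns a new DataFrame, so assign the result
df = df.replace(np.inf, highest_non_inf)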
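One caveat with filtering on column maxima: a column that itself contains inf has inf as its maximum and is dropped from the calculation, and non-numeric columns such as City cannot be compared with inf at all. A sketch of a variant that masks individual cells instead follows; the example frame and the name `highest_finite` are hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: both numeric columns contain inf
df = pd.DataFrame({'City': ['A', 'B', 'C'],
                   'x': [1.0, np.inf, 3.0],
                   'y': [np.inf, 50.0, 7.0]})

# Work on the numeric columns only
num = df.select_dtypes(include='number')

# Mask out every non-finite cell, then take the overall maximum
highest_finite = num.where(np.isfinite(num)).max().max()

# Replace +inf in the numeric columns and write them back
df[num.columns] = num.replace(np.inf, highest_finite)
print(df[['x', 'y']].values.tolist())
# [[1.0, 50.0], [50.0, 50.0], [3.0, 7.0]]
```

This uses a single maximum for the whole frame, as the answer above does; a per-column maximum would need the replacement applied column by column.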
Answered by Ravijeet
One way to do it is to apply an additional replace(np.inf, np.nan) within max().
This replaces inf with nan only for the computation happening inside max(), so max returns the expected maximum value rather than inf.
In the example below, the maximum value is 100, and it replaces inf.
#Create dummy data frame
import pandas as pd
import numpy as np
a = float('Inf')
v = [1,2,5,a,10,5,a,5,100,2]
df = pd.DataFrame({'Col_A': v})
#Data frame looks like this
In [33]: df
Out[33]:
Col_A
0 1.000000
1 2.000000
2 5.000000
3 inf
4 10.000000
5 5.000000
6 inf
7 5.000000
8 100.000000
9 2.000000
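# Replace inf with the maximum computed after turning inf into NaN
# (Series.max() skips NaN by default)
col_max = df['Col_A'].replace(np.inf, np.nan).max()
df['Col_A'] = df['Col_A'].replace(np.inf, col_max)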
In[35]: df
Out[35]:
Col_A
0 1.0
1 2.0
2 5.0
3 100.0
4 10.0
5 5.0
6 100.0
7 5.0
8 100.0
9 2.0
Hope that works!