Original source: http://stackoverflow.com/questions/42929997/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to replace non-integer values in a pandas DataFrame?
Asked by latish
I have a DataFrame consisting of two columns, Age and Salary:
Age  Salary
21   25000
22   30000
22   Fresher
23   2,50,000
24   25 LPA
35   400000
45   10,00,000
How can I handle the outliers in the Salary column and replace them with an integer?
Answered by jezrael
If you need to replace non-numeric values, use to_numeric with the parameter errors='coerce':
# The outer parentheses let the method chain span several lines
df['new'] = (pd.to_numeric(df.Salary.astype(str).str.replace(',', ''), errors='coerce')
               .fillna(0)
               .astype(int))
print(df)
   Age     Salary      new
0   21      25000    25000
1   22      30000    30000
2   22    Fresher        0
3   23   2,50,000   250000
4   24     25 LPA        0
5   35     400000   400000
6   45  10,00,000  1000000
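For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the question's sample rows, and holding Salary as strings is an assumption about how the data was loaded. The `regex=False` argument is an addition that makes the comma replacement explicitly literal.

```python
import pandas as pd

# Sample data rebuilt from the question; Salary stored as strings (an assumption).
df = pd.DataFrame({
    "Age": [21, 22, 22, 23, 24, 35, 45],
    "Salary": ["25000", "30000", "Fresher", "2,50,000", "25 LPA", "400000", "10,00,000"],
})

# Strip the thousands separators as a literal (non-regex) replacement.
cleaned = df["Salary"].astype(str).str.replace(",", "", regex=False)

# Entries that cannot be parsed become NaN, then 0, so the column can be cast to int.
df["new"] = pd.to_numeric(cleaned, errors="coerce").fillna(0).astype(int)

print(df)
```

Note that "25 LPA" ends up as 0 here because the whole string fails to parse; extracting the leading number would need extra handling that this answer does not cover.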
Answered by Shenglin Chen
Use numpy.where to find values that are not made up solely of digits and replace them with '0'.
# Requires numpy imported as np; x.isdigit() only works if every Salary value is a string
df['New'] = df.Salary.apply(lambda x: np.where(x.isdigit(), x, '0'))
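As a possible variant of the same idea, the check can be done on the whole column at once. This sketch swaps the per-element np.where call for pandas' own `Series.where`, and converts to text first so that real integers in the column do not break `isdigit`. The sample column, including its one true integer, is made up for illustration.

```python
import pandas as pd

# Hypothetical sample: mostly strings, plus one real integer to show the astype(str) step.
df = pd.DataFrame({
    "Salary": [25000, "30000", "Fresher", "2,50,000", "25 LPA", "400000", "10,00,000"],
})

# Convert everything to text so .str.isdigit() is defined for every row.
as_text = df["Salary"].astype(str)

# Keep values made only of digits; everything else becomes the string '0'.
df["New"] = as_text.where(as_text.str.isdigit(), "0")

print(df)
```

With this check, comma-formatted values such as "2,50,000" also become '0', because the comma is not a digit. The result stays a column of strings, so a further cast is needed if integers are required.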
Answered by Arash
If you use Python 3, use the following. I am not sure what type(x) returns in other Python versions. That said, I would not replace missing or inconsistent values with 0; it is better to replace them with None. But suppose you do want to replace string values (outliers or inconsistent values) with 0:
df['Salary'] = df['Salary'].apply(lambda x: 0 if str(type(x)) == "<class 'str'>" else x)
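A sketch of the same check written with `isinstance`, which does not depend on how `type(x)` is printed. It also shows the alternative this answer recommends, marking the bad entries as missing (here with NaN) rather than 0. The sample column is hypothetical and assumes the numeric entries are already real integers.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed column: real integers plus leftover strings.
df = pd.DataFrame({"Salary": [25000, 30000, "Fresher", "25 LPA", 400000]})

# Replace every string entry with 0, testing the type directly.
as_zero = df["Salary"].apply(lambda x: 0 if isinstance(x, str) else x)

# Alternative: mark string entries as missing instead of inventing a 0.
as_missing = df["Salary"].apply(lambda x: np.nan if isinstance(x, str) else x)

print(as_zero)
print(as_missing)
```

Keeping the entries as missing lets later steps such as `mean()` skip them instead of being pulled toward zero.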