pandas: How to replace non-integer values in a pandas DataFrame?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/42929997/

Date: 2020-09-14 03:14:58  Source: igfitidea

How to replace non-integer values in a pandas DataFrame?

python pandas dataframe

Asked by latish

I have a DataFrame consisting of two columns, Age and Salary:

Age   Salary
21    25000
22    30000
22    Fresher
23    2,50,000
24    25 LPA
35    400000
45    10,00,000

How can I handle the outliers in the Salary column and replace them with an integer?

Answered by jezrael

If you need to replace non-numeric values, use to_numeric with the parameter errors='coerce':

# Strip thousands separators, coerce non-numeric entries to NaN, then fill them with 0
df['new'] = (pd.to_numeric(df.Salary.astype(str).str.replace(',', ''), errors='coerce')
               .fillna(0)
               .astype(int))
print(df)
   Age     Salary      new
0   21      25000    25000
1   22      30000    30000
2   22    Fresher        0
3   23   2,50,000   250000
4   24     25 LPA        0
5   35     400000   400000
6   45  10,00,000  1000000
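
For readers who want to run the to_numeric approach end to end, here is a minimal sketch. It assumes the Salary column was read in as strings, and it rebuilds the sample data from the question by hand. The `new_nullable` column is an addition that is not part of the answer: it keeps unparseable entries as missing values using pandas' nullable `Int64` dtype instead of filling them with 0.

```python
import pandas as pd

# Sample data copied from the question; Salary is assumed to be stored as strings.
df = pd.DataFrame({
    "Age": [21, 22, 22, 23, 24, 35, 45],
    "Salary": ["25000", "30000", "Fresher", "2,50,000", "25 LPA", "400000", "10,00,000"],
})

# Strip thousands separators, then coerce anything non-numeric to NaN.
cleaned = pd.to_numeric(
    df["Salary"].astype(str).str.replace(",", "", regex=False),
    errors="coerce",
)

# Option A (as in the answer): treat unparseable entries as 0.
df["new"] = cleaned.fillna(0).astype(int)

# Option B (an assumption, not from the answer): keep them missing
# by converting to the nullable integer dtype.
df["new_nullable"] = cleaned.astype("Int64")

print(df)
```

Option B may be preferable when a salary of 0 would distort later statistics such as the mean, since missing values are skipped by default in pandas aggregations.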

Answered by Shenglin Chen

Use numpy.where to find the values that are not made up entirely of digits and replace them with '0'.

# Vectorized over the whole column; astype(str) guards against non-string entries
df['New'] = np.where(df.Salary.astype(str).str.isdigit(), df.Salary, '0')
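
One caveat with a digit check is worth illustrating: `str.isdigit()` is true only when every character is a digit, so comma-formatted values such as 2,50,000 are also replaced with '0' unless the commas are removed first. The sketch below assumes the salaries are strings and uses the sample values from the question; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

salary = pd.Series(
    ["25000", "30000", "Fresher", "2,50,000", "25 LPA", "400000", "10,00,000"]
)

# Digit check on the raw strings: comma-formatted numbers fail it.
raw_mask = salary.str.isdigit()

# Digit check after removing the thousands separators.
stripped = salary.str.replace(",", "", regex=False)
clean_mask = stripped.str.isdigit()

# np.where(condition, value_if_true, value_if_false), applied to the whole column.
without_strip = np.where(raw_mask, salary, "0")
with_strip = np.where(clean_mask, stripped, "0")

print(without_strip.tolist())
print(with_strip.tolist())
```

Both results still hold strings, so a final conversion such as `pd.Series(with_strip).astype(int)` would be needed to get actual integers.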

Answered by Arash

If you use Python 3, use the following. I am not sure how other Python versions represent type(x). However, I would not replace missing or inconsistent values with 0; it is better to replace them with None. But let's say you want to replace string values (outliers or inconsistent values) with 0:

df['Salary'] = df['Salary'].apply(lambda x: 0 if str(type(x)) == "<class 'str'>" else x)
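
The alternative this answer recommends, marking inconsistent values as missing rather than 0, could look like the sketch below. It assumes a mixed-type column in which valid salaries are already integers, uses `isinstance(x, str)` as a version-independent string check, and substitutes NaN for None because NaN is the missing-value marker pandas uses for numeric columns. The `Salary_clean` column name is hypothetical.

```python
import numpy as np
import pandas as pd

# Mixed-type column: some entries are already ints, others are strings.
df = pd.DataFrame({
    "Age": [21, 22, 22, 23, 24, 35, 45],
    "Salary": [25000, 30000, "Fresher", "2,50,000", "25 LPA", 400000, "10,00,000"],
})

# Every string becomes NaN here, including "2,50,000", which is really a number.
df["Salary_clean"] = df["Salary"].apply(
    lambda x: np.nan if isinstance(x, str) else x
)

print(df)
```

Note that this type-based check discards comma-formatted numbers along with genuine text, and if the whole column was loaded as strings it would blank out every row.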