Python Pandas: replace zeros with NaN in multiple columns

Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/45416684/

Date: 2020-08-19 17:00:28  Source: igfitidea

Tags: python, pandas, dataframe, data-cleaning

Asked by Wouter Dunnes

A list of people and their attributes is loaded into a pandas DataFrame, df2. For cleanup, I want to replace zero values (0 or '0') with np.nan.

df2.dtypes

ID                   object
Name                 object
Weight              float64
Height              float64
BootSize             object
SuitSize             object
Type                 object
dtype: object

The following code works and sets zero values to np.nan:

df2.loc[df2['Weight'] == 0,'Weight'] = np.nan
df2.loc[df2['Height'] == 0,'Height'] = np.nan
df2.loc[df2['BootSize'] == '0','BootSize'] = np.nan
df2.loc[df2['SuitSize'] == '0','SuitSize'] = np.nan

I believe this can be done in a similar but shorter way:

df2[["Weight","Height","BootSize","SuitSize"]].astype(str).replace('0',np.nan)

However, the above does not work: the zeros remain in df2. How can I fix this?

Answered by jezrael

I think you need replace with a dict:

cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})
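As a minimal sketch of why the attempt in the question leaves df2 untouched while the dict form works, the example below uses a small invented DataFrame (the names and values are hypothetical, and the "Type" column is left out). It assumes the columns have the dtypes shown in the question: float64 for Weight and Height, object for BootSize and SuitSize.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df2 = pd.DataFrame({
    "Name": pd.Series(["Ann", "Bob", "Cy"], dtype=object),
    "Weight": [70.5, 0.0, 82.0],
    "Height": [0.0, 180.0, 175.0],
    "BootSize": pd.Series(["42", "0", "44"], dtype=object),
    "SuitSize": pd.Series(["0", "52", "54"], dtype=object),
})

cols = ["Weight", "Height", "BootSize", "SuitSize"]

# The attempt from the question: replace() returns a new DataFrame that is
# never assigned back, so df2 itself keeps its zeros.
attempt = df2[cols].astype(str).replace("0", np.nan)
zeros_left_in_df2 = int((df2["Weight"] == 0).sum())

# astype(str) also renders the float 0.0 as the text "0.0", which is not "0",
# so even `attempt` still holds the float zeros as text.
weight_as_text = attempt.loc[1, "Weight"]

# Dict form from the answer, assigned back to the selected columns.
df2[cols] = df2[cols].replace({"0": np.nan, 0: np.nan})

print(df2)
```

Two things seem to go wrong in the original attempt: the result is discarded, and the string conversion changes what a zero looks like. Assigning the result of replace back to df2[cols], without astype(str), avoids both.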

Answered by christk

You could use the replace method, passing the values you want to replace as a list in the first parameter and the replacement value as the second parameter:

cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace(['0', 0], np.nan)
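The list form and the dict form should give the same result here. The sketch below checks that on a small invented DataFrame (the data and the reduced column list are hypothetical, chosen only to keep the example short).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration.
df2 = pd.DataFrame({
    "Weight": [70.5, 0.0, 82.0],
    "BootSize": pd.Series(["42", "0", "44"], dtype=object),
})
cols = ["Weight", "BootSize"]

# List form: every value in the list is replaced by the single scalar.
list_result = df2[cols].replace(["0", 0], np.nan)

# Dict form: each key is mapped to its own replacement value.
dict_result = df2[cols].replace({"0": np.nan, 0: np.nan})

# DataFrame.equals treats NaN values in the same positions as equal.
same = list_result.equals(dict_result)
missing_per_column = list_result.isna().sum().to_dict()

print(same)
print(missing_per_column)
```

The list form is a little shorter when every value maps to the same replacement. The dict form is the more flexible choice if different values ever need different replacements.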

Answered by Ayyasamy

data['amount']=data['amount'].replace(0, np.nan)
data['duration']=data['duration'].replace(0, np.nan)
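The per-column assignments above can also be written as one call, since DataFrame.replace accepts a dict whose keys are column names and whose values are the value to look for in that column. The sketch below assumes a small invented DataFrame. The "amount" and "duration" names come from the answer, while "visits" is a hypothetical extra column added to show that unlisted columns keep their zeros.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for illustration.
data = pd.DataFrame({
    "amount": [10, 0, 30],
    "duration": [0.0, 2.5, 0.0],
    "visits": [0, 1, 2],
})

# Keys are column names, values are what to look for in that column;
# the second argument is the replacement used for every match.
cleaned = data.replace({"amount": 0, "duration": 0}, np.nan)

print(cleaned)
print(cleaned.dtypes)
```

Note that an integer column such as "amount" is converted to float64 once it contains NaN, because NaN is a floating-point value. The original data is left unchanged, as replace returns a new DataFrame.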