pandas: How to use a `None` value in fillna

Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/46283312/

Time: 2020-09-14 04:28:53  Source: igfitidea

How to proceed with `None` value in pandas fillna

Tags: python, pandas

Asked by Andrii Furmanets

I have the following dictionary:

fillna(value={'first_name':'Andrii', 'last_name':'Furmanets', 'created_at':None})

When I pass that dictionary to `fillna`, I see:

raise ValueError('must specify a fill method or value')
ValueError: must specify a fill method or value

It seems to me that it fails on the `None` value.

I use pandas version 0.20.3.

Accepted answer by atwalsh

What type of data structure are you using? This works for a pandas Series:

import pandas as pd

d = pd.Series({'first_name': 'Andrii', 'last_name':'Furmanets', 'created_at':None})
d = d.fillna('DATE')

Answer by piRSquared

Setup
Consider the sample dataframe df

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

df

     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D

I can confirm the error

df.fillna(dict(A=1, B=None, C=4))
ValueError: must specify a fill method or value

This happens because pandas cycles through the keys in the dictionary and executes a `fillna` for each relevant column. If you look at the signature of the `pd.Series.fillna` method

Series.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)

You'll see the default value is None. So we can replicate this error with

df.A.fillna(None)

Or equivalently

df.A.fillna()

I'll add that I'm not terribly surprised considering that you are attempting to fill a null value with a null value.
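
The per-column behavior described above can be sketched in plain Python. This is a simplified model for illustration only, not the pandas source: the helper name `fillna_per_column` is hypothetical, and the explicit `raise` stands in for the check that the Series-level `fillna` performs when it receives no fill value.

```python
import pandas as pd

def fillna_per_column(df, value):
    """Simplified model of dict-based filling; not the pandas source."""
    result = df.copy()
    for col, fill in value.items():
        if col not in result.columns:
            continue  # keys without a matching column are ignored
        if fill is None:
            # Stand-in for the Series-level check that rejects a missing fill value.
            raise ValueError('must specify a fill method or value')
        result[col] = result[col].fillna(fill)
    return result

df = pd.DataFrame(dict(A=[1, None], B=[None, 2]))

# Every key has a real fill value, so each column is filled in turn.
ok = fillna_per_column(df, dict(A=1, B=0))

# A single None entry is enough to make the whole call fail.
try:
    fillna_per_column(df, dict(A=1, B=None))
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```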

What you need is a workaround.

Solution
Use `pd.DataFrame.fillna` over the columns that you want to fill with non-null values. Then follow that up with a `pd.DataFrame.replace` on the specific columns where you want to swap one null value for another.

import numpy as np

df.fillna(dict(A=1, C=2)).replace(dict(B={np.nan: None}))

     A     B  C
0  1.0  None  2
1  1.0     2  D
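
The same two-step idea can be wrapped in a small helper that accepts a single dictionary. This is only a sketch: `fill_with_optional_none` is a hypothetical name, it assumes a `None` entry means "store `None` instead of NaN in that column", and it swaps the `replace` step for an explicit object-dtype rebuild so that it does not depend on how a given pandas version treats `None` in `replace`.

```python
import pandas as pd

def fill_with_optional_none(df, fill_values):
    """Fill non-None entries with fillna; turn nulls into None elsewhere."""
    non_null = {c: v for c, v in fill_values.items() if v is not None}
    none_cols = [c for c, v in fill_values.items()
                 if v is None and c in df.columns]

    # Step 1: ordinary fillna for columns that have a real fill value.
    result = df.fillna(non_null) if non_null else df.copy()

    # Step 2: rebuild the remaining columns as object dtype holding None.
    for col in none_cols:
        result[col] = pd.Series(
            [None if pd.isna(v) else v for v in result[col]],
            index=result.index,
            dtype=object,
        )
    return result

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))
out = fill_with_optional_none(df, dict(A=1, B=None, C='missing'))
print(out)
```

Column `B` ends up with object dtype, which is the price of holding a Python `None` next to numbers.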

Answer by addicted

An alternative way to fill with `None`. I am on pandas 0.24.0 and I do this to insert NULL values into a Postgres database.

import numpy as np
import pandas as pd

# Borrowing @piRSquared's dataframe
df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

df

     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D

# Fill NaN with None: wherever a value is null, substitute None.
df['A'] = np.where(df['A'].isnull(), None, df['A'])
df['B'] = np.where(df['B'].isnull(), None, df['B'])

# Result
df

      A     B     C
0   1.0  None  None
1  None   2.0     D
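
For the database use case, one possible follow-up is to build the row tuples directly, mapping every null to `None`. This is a sketch under the assumption that the rows will be handed to a database driver that maps Python `None` to SQL NULL; no driver call is shown, and the name `records` is illustrative.

```python
import pandas as pd

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

# Walk the rows as plain tuples and replace any null (NaN, None, NaT)
# with None, leaving the dataframe itself untouched.
records = [
    tuple(None if pd.isna(value) else value for value in row)
    for row in df.itertuples(index=False, name=None)
]
print(records)
```

Because the conversion happens outside the dataframe, the numeric columns keep their float dtype for any further pandas work.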

Answer by smci

It's a bad idea to try to fill a datetime with `None`; this is exactly what pandas `NaT` (Not a Time) is for: missing datetimes.
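
A short illustration of that point, assuming a `created_at` column like the one in the question (the sample date and the placeholder timestamp are made up for the example):

```python
import pandas as pd

# Hypothetical 'created_at' column with one missing entry.
created_at = pd.to_datetime(pd.Series(['2017-09-18', None], name='created_at'))

# The column keeps a datetime dtype; the missing entry is stored as NaT.
is_datetime = pd.api.types.is_datetime64_any_dtype(created_at)
missing_mask = created_at.isna()

# NaT can later be filled like any other missing value, here with a
# placeholder timestamp chosen only for illustration.
filled = created_at.fillna(pd.Timestamp('1970-01-01'))
print(created_at)
print(filled)
```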
