Original question: http://stackoverflow.com/questions/46283312/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How to proceed with `None` value in pandas fillna
Asked by Andrii Furmanets
I have the following dictionary:
fillna(value={'first_name':'Andrii', 'last_name':'Furmanets', 'created_at':None})
When I pass that dictionary to fillna, I see:
raise ValueError('must specify a fill method or value')
ValueError: must specify a fill method or value
It seems to me that it fails on the None value.
I use pandas version 0.20.3.
Accepted answer by atwalsh
What type of data structure are you using? This works for a pandas Series:
import pandas as pd
d = pd.Series({'first_name': 'Andrii', 'last_name':'Furmanets', 'created_at':None})
d = d.fillna('DATE')
Answer by piRSquared
Setup
Consider the sample dataframe df
import numpy as np
import pandas as pd
df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))
df
     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D
I can confirm the error:
df.fillna(dict(A=1, B=None, C=4))
ValueError: must specify a fill method or value
This happens because pandas cycles through the keys in the dictionary and executes a fillna for each relevant column. If you look at the signature of the pd.Series.fillna method:
Series.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
You'll see the default value is None. So we can replicate this error with
df.A.fillna(None)
Or equivalently
df.A.fillna()
I'll add that I'm not terribly surprised considering that you are attempting to fill a null value with a null value.
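To check this on your own installation, here is a minimal sketch. The helper name try_fillna is hypothetical, and the behavior described in this answer was observed on the 0.20.x series; newer pandas releases may treat an explicit None differently, so the sketch reports what happens rather than assuming an error.

```python
import numpy as np
import pandas as pd


def try_fillna(series, value):
    """Hypothetical helper: return (filled, None) on success or (None, message) on ValueError."""
    try:
        return series.fillna(value), None
    except ValueError as exc:
        return None, str(exc)


s = pd.Series([1.0, np.nan])

# A real fill value always works.
filled, error = try_fillna(s, 0.0)
print(filled.tolist(), error)

# With None, older versions cannot tell "no value given" from "fill with None".
none_filled, none_error = try_fillna(s, None)
print("raised:", none_error is not None, "| message:", none_error)
```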
What you need is a workaround.
Solution
Use pd.DataFrame.fillna over the columns that you want to fill with non-null values. Then follow that up with a pd.DataFrame.replace on the specific columns where you want to swap one null value for another.
df.fillna(dict(A=1, C=2)).replace(dict(B={np.nan: None}))
     A     B  C
0  1.0  None  2
1  1.0     2  D
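If you need this in more than one place, the two steps can be wrapped in a small helper. The following is only a sketch and not part of the original answer: the name fillna_allowing_none is hypothetical, and instead of replace it rebuilds each None-filled column as an object-dtype Series, on the assumption that explicit construction is the least version-sensitive way to store a Python None.

```python
import pandas as pd


def fillna_allowing_none(df, values):
    """Fill nulls per column; a fill value of None stores Python None.

    Hypothetical helper for illustration, not a pandas API.
    """
    # Columns with a real fill value go through the normal fillna path.
    non_null = {col: val for col, val in values.items() if val is not None}
    out = df.fillna(non_null) if non_null else df.copy()

    # Columns mapped to None are rebuilt as object dtype so None is kept as-is.
    for col, val in values.items():
        if val is None and col in out.columns:
            out[col] = pd.Series(
                [None if pd.isna(item) else item for item in out[col]],
                index=out.index,
                dtype=object,
            )
    return out


df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))
result = fillna_allowing_none(df, dict(A=1, B=None, C='filled'))
print(result)
```

Note that a column holding None ends up with object dtype, which is slower than a numeric dtype, so it is probably best applied right before handing the data to something that needs None (a database driver, for example).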
Answer by addicted
An alternative way to fill with None. I am on pandas 0.24.0, and I am doing this to insert NULL values into a Postgres database.
import numpy as np
import pandas as pd
# Borrowing @piRSquared's dataframe
df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))
df
     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D
# fill NaN with None. Basically it says, fill with None whenever you see NULL value.
df['A'] = np.where(df['A'].isnull(), None, df['A'])
df['B'] = np.where(df['B'].isnull(), None, df['B'])
# Result
df
      A     B     C
0   1.0  None  None
1  None   2.0     D
Answer by smci
It's a bad idea to try to fill a datetime with None; this is exactly what pandas NaT (Not a Time) is for: missing datetimes.
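As a small illustration of that point, the sketch below converts a column containing None to a datetime dtype. The sample timestamp and the variable names are made up for the example; the question does not show the actual created_at data.

```python
import pandas as pd

# Hypothetical raw values: one timestamp string and one missing entry.
raw = pd.Series(['2017-09-18 10:30:00', None])

# to_datetime turns the missing entry into NaT instead of None.
created_at = pd.to_datetime(raw)

print(created_at.dtype)
print(created_at.isna().tolist())  # [False, True]
print(created_at.iloc[1] is pd.NaT)
```

Because NaT is already the missing-value marker for datetime columns, no fillna call is needed to represent "no date".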