pandas: How to fill DataFrame NaN values with an empty list [] in pandas?

Question

Asked by ALH

This is my dataframe:

          date                          ids
0     2011-04-23  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
1     2011-04-24  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
2     2011-04-25  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
3     2011-04-26  NaN
4     2011-04-27  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
5     2011-04-28  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...

I want to replace NaN with []. How can I do that? fillna([]) did not work. I even tried replace(np.nan, []), but it gives an error:

 TypeError('Invalid "to_replace" type: \'float\'',)
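For anyone reproducing this, here is a minimal sketch of a frame with the same shape. The sample values are made up for illustration (the lists in the question are truncated), and the exact exception raised by fillna([]) may differ between pandas versions, so the code catches it broadly instead of assuming one message.

```python
import numpy as np
import pandas as pd

# Illustrative data only: short lists stand in for the truncated ones in the question.
df = pd.DataFrame({
    'date': ['2011-04-23', '2011-04-24', '2011-04-25', '2011-04-26'],
    'ids': [[0, 1, 2], [0, 1, 2], [0, 1, 2], np.nan],
})

try:
    df['ids'].fillna([])
    fillna_rejected_list = False
except (TypeError, ValueError) as exc:
    # Reached only if pandas refuses the list as a fill value
    fillna_rejected_list = True
    print(type(exc).__name__, exc)

# Number of missing cells that still need an empty list
print(df['ids'].isnull().sum())
```

The column holds a mix of lists and a float NaN, so it has object dtype, which is why the usual scalar-oriented fill methods do not apply cleanly.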

Answer 1

Accepted answer by Alexander

You can first use loc to locate all rows that have a NaN in the ids column, and then loop through these rows using at to set their values to an empty list:

for row in df.loc[df.ids.isnull(), 'ids'].index:
    df.at[row, 'ids'] = []

>>> df
        date                                             ids
0 2011-04-23  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
1 2011-04-24  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
2 2011-04-25  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
3 2011-04-26                                              []
4 2011-04-27  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
5 2011-04-28  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
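A self-contained variant of the same idea, as a sketch: the small frame and the name missing_rows are illustrative assumptions, not part of the answer. It relies on the ids column having object dtype, which is the case when it mixes lists and NaN.

```python
import numpy as np
import pandas as pd

# Illustrative frame: one missing 'ids' cell between two list cells
df = pd.DataFrame({
    'date': ['2011-04-25', '2011-04-26', '2011-04-27'],
    'ids': [[0, 1, 2], np.nan, [0, 1, 2]],
})

# Index labels of the rows whose 'ids' value is missing
missing_rows = df.index[df['ids'].isnull()]

for row in missing_rows:
    # A new list is created on every iteration, so filled cells do not share one object
    df.at[row, 'ids'] = []

print(df['ids'].tolist())
```

Because at sets a single cell, the list is stored as one object instead of being interpreted as a sequence of values to spread over several cells.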

Answer 2

Answered by Nick Edgar

My approach is similar to @hellpanderrr's, but it tests for list-ness rather than using isnan:

df['ids'] = df['ids'].apply(lambda d: d if isinstance(d, list) else [])

I originally tried using pd.isnull (or pd.notnull), but when given a list, that returns the null-ness of each element.
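To make the difference concrete, here is a small sketch on a standalone Series (the names ids and filled are illustrative). Note that the isinstance test replaces anything that is not a list, including None or strings, not only NaN.

```python
import numpy as np
import pandas as pd

# pd.isnull on a list checks each element, so the result is an array, not a single bool
elementwise = pd.isnull([0, 1, 2])
print(elementwise)

ids = pd.Series([[0, 1, 2], np.nan, [3, 4]])

# Keep real lists, replace every non-list value with a new empty list
filled = ids.apply(lambda d: d if isinstance(d, list) else [])
print(filled.tolist())
```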

Answer 3

Answered by PlasmaBinturong

After a lot of head-scratching, I found this method, which should be the most efficient (no looping, no apply); it just assigns to a slice:

isnull = df.ids.isnull()

df.loc[isnull, 'ids'] = [ [[]] * isnull.sum() ]

The trick was to construct a list of [] of the right size (isnull.sum()), and then enclose it in a list: the value you are assigning is a 2D array (1 column, isnull.sum() rows) containing empty lists as elements.
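One caveat if the filled cells are later mutated in place: [[]] * n repeats a reference to a single list rather than creating n separate lists. The sketch below shows this in plain Python, independent of pandas. Whether the shared reference survives the assignment above is not something the answer states, so treat it as something to check rather than a given.

```python
n = 3

# n references to ONE list object
shared = [[]] * n
shared[0].append(99)
print(shared)        # [[99], [99], [99]]

# n distinct list objects
independent = [[] for _ in range(n)]
independent[0].append(99)
print(independent)   # [[99], [], []]
```

If the cells are only ever replaced, never appended to, the shared reference is harmless.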

Answer 4

Answered by hellpanderr

Without assignments:

1) Assuming we have only floats and integers in our dataframe

import math
df.apply(lambda x:x.apply(lambda x:[] if math.isnan(x) else x))

2) For any dataframe

import math
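def isnan(x):
    # `long` no longer exists in Python 3, and math.isnan rejects complex numbers,
    # so test floats only (np.float64 is a float subclass) and always return a bool.
    return isinstance(x, float) and math.isnan(x)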

df.apply(lambda x:x.apply(lambda x:[] if isnan(x) else x))

Answer 5

Answered by Allen

Another solution using numpy:

df.ids = np.where(df.ids.isnull(), pd.Series([[]]*len(df)), df.ids)

Or using combine_first:

df.ids = df.ids.combine_first(pd.Series([[]]*len(df)))

Answer 6

Answered by botivegh

This is probably faster, a one-liner solution:

df['ids'].fillna('DELETE').apply(lambda x : [] if x=='DELETE' else x)
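The one-liner returns a new Series without changing df, so the result has to be assigned back. A sketch of that, with SENTINEL as a hypothetical marker name and made-up sample data; the approach only works if the marker can never occur as a real value in the column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ids': [[0, 1, 2], np.nan, [3, 4]]})

SENTINEL = 'DELETE'  # hypothetical marker; it must never occur as a real value

# Fill NaN with the marker, then swap the marker for a new empty list.
# The isinstance guard means only the string marker is compared and replaced.
df['ids'] = df['ids'].fillna(SENTINEL).apply(
    lambda x: [] if isinstance(x, str) and x == SENTINEL else x
)

print(df['ids'].tolist())
```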

Answer 7

Answered by keramat

Maybe more compact:

df['ids'] = [[] if type(x) != list else x for x in df['ids']]

Answer 8

Answered by TICH

Create a function that checks your condition; if the condition is not met, it returns an empty list, an empty set, etc.

Then apply that function to the column, assigning the result back to the same column, or to a new one if you wish.

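import numpy as np
import pandas as pd

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
aa = pd.DataFrame({'d': [1, 1, 2, 3, 3, np.nan], 'r': [3, 5, 5, 5, 5, 'e']})


def check_condition(x):
    # NaN > 0 is False, so NaN falls through to the empty list
    if x > 0:
        return x
    else:
        return list()

aa['d'] = aa.d.apply(check_condition)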
