Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/42285806/

Time: 2020-09-14 03:00:21  Source: igfitidea

How to pop rows from a dataframe?

python pandas

Asked by user5359531

I found the documentation for pandas.DataFrame.pop, but after trying it and examining the source code, it does not seem to do what I want.

If I make a dataframe like this:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10,6))
# Make a few areas have NaN values
df.iloc[1:3,1] = np.nan
df.iloc[5,3] = np.nan
df.iloc[7:9,5] = np.nan


>>> df
          0         1         2         3         4         5
0  0.772762 -0.442657  1.245988  1.102018 -0.740836  1.685598
1 -0.387922       NaN -1.215723 -0.106875  0.499110  0.338759
2  0.567631       NaN -0.353032 -0.099011 -0.698925 -1.348966
3  1.320849  1.084405 -1.296177  0.681111 -1.941855 -0.950346
4 -0.026818 -1.933629 -0.693964  1.116673  0.392217  1.280808
5 -1.249192 -0.035932 -1.330916       NaN -0.135720 -0.506016
6  0.406344  1.416579  0.122019  0.648851 -0.305359 -1.253580
7 -0.092440 -0.243593  0.468463 -1.689485  0.667804       NaN
8 -0.110819 -0.627777 -0.302116  0.630068  2.567923       NaN
9  1.884069 -0.393420 -0.950275  0.151182 -1.122764  0.502117

If I want to remove selected rows and assign them to a separate object in one step, I would want a pop behavior, like this:

# rows in column 5 which have NaN values
>>> df[df[5].isnull()].index
Int64Index([7, 8], dtype='int64')

# remove them from the dataframe, assign them to a separate object
>>> nan_rows = df.pop(df[df[5].isnull()].index)

However, this does not appear to be supported. Instead, it seems like I am forced to do this in two separate steps, which seems a bit inelegant.

# get the NaN rows
>>> nan_rows = df[df[5].isnull()]

>>> nan_rows
          0         1         2         3         4   5
7 -0.092440 -0.243593  0.468463 -1.689485  0.667804 NaN
8 -0.110819 -0.627777 -0.302116  0.630068  2.567923 NaN

# remove from original df
>>> df = df.drop(nan_rows.index)

>>> df
          0         1         2         3         4         5
0  0.772762 -0.442657  1.245988  1.102018 -0.740836  1.685598
1 -0.387922       NaN -1.215723 -0.106875  0.499110  0.338759
2  0.567631       NaN -0.353032 -0.099011 -0.698925 -1.348966
3  1.320849  1.084405 -1.296177  0.681111 -1.941855 -0.950346
4 -0.026818 -1.933629 -0.693964  1.116673  0.392217  1.280808
5 -1.249192 -0.035932 -1.330916       NaN -0.135720 -0.506016
6  0.406344  1.416579  0.122019  0.648851 -0.305359 -1.253580
9  1.884069 -0.393420 -0.950275  0.151182 -1.122764  0.502117

Is there a one-step method built-in? Or is this the way you're 'supposed' to do it?

Answered by Boud

pop source code:

    def pop(self, item):
        """
        Return item and drop from frame. Raise KeyError if not found.
        """
        result = self[item]
        del self[item]
        try:
            result._reset_cacher()
        except AttributeError:
            pass

        return result
File:      c:\python\lib\site-packages\pandas\core\generic.py
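
As the source above suggests, pop reads self[item] and then deletes it, so it is meant for a single column label. The following is a minimal sketch of that column-wise behavior; the frame and its column names "a" and "b" are made up for illustration.

```python
import pandas as pd

# Small illustrative frame; the column names are assumptions for this example
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# pop takes one column label, returns that column as a Series,
# and removes it from the frame in place
b = df.pop("b")

print(list(df.columns))  # ['a']
print(b.tolist())        # [4.0, 5.0, 6.0]
```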

del definitely won't work if item is not a simple column name. Pass a simple column name, or do it in two steps.
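
If the two steps are needed often, they can be wrapped in a small function. The sketch below is one possible way to do that: pop_rows is a hypothetical helper, not part of pandas, and it assumes the frame has unique index labels (with duplicate labels, drop would remove every row sharing a selected label).

```python
import numpy as np
import pandas as pd


def pop_rows(df, mask):
    """Remove the rows where `mask` is True from `df` in place and return them.

    Hypothetical helper, not a pandas API. Assumes unique index labels.
    """
    # Step 1: copy out the selected rows
    popped = df.loc[mask].copy()
    # Step 2: drop those same index labels from the original frame
    df.drop(index=popped.index, inplace=True)
    return popped


df = pd.DataFrame(np.random.randn(10, 6))
df.iloc[7:9, 5] = np.nan

nan_rows = pop_rows(df, df[5].isnull())
print(nan_rows.index.tolist())  # [7, 8]
print(len(df))                  # 8
```

The helper mutates the frame it is given, which mirrors how pop behaves for columns. If mutation is not wanted, splitting on the mask with df[mask] and df[~mask] gives both pieces as new objects instead.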
