pandas subset and drop rows based on column value

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/37912487/

Time: 2020-09-14 01:25:14  Source: igfitidea

Tags: python, pandas

Asked by ramesh

My df:

import pandas as pd

dframe = pd.DataFrame({"A":list("aaaabbbbccc"), "C":range(1,12)},  index=range(1,12))

Out[9]: 
    A   C
1   a   1
2   a   2
3   a   3
4   a   4
5   b   5
6   b   6
7   b   7
8   b   8
9   c   9
10  c  10
11  c  11

To subset based on column value:

In[11]: first = dframe.loc[dframe["A"] == 'a']
In[12]: first
Out[12]: 
   A  C
1  a  1
2  a  2
3  a  3
4  a  4

To drop based on column value:

In[16]: dframe = dframe[dframe["A"] != 'a']
In[17]: dframe
Out[17]: 
    A   C
5   b   5
6   b   6
7   b   7
8   b   8
9   c   9
10  c  10
11  c  11

Is there any way to do both in one shot, that is, subset rows based on a column value and delete those same rows from the original df?

Answered by chrisb

It's not really one shot, but the typical way to do this is to reuse a boolean mask, like this:

In [28]: mask = dframe['A'] == 'a'

In [29]: first, dframe = dframe[mask], dframe[~mask]

In [30]: first
Out[30]:
   A  C
1  a  1
2  a  2
3  a  3
4  a  4

In [31]: dframe
Out[31]:
    A   C
5   b   5
6   b   6
7   b   7
8   b   8
9   c   9
10  c  10
11  c  11
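
The mask-reuse pattern above can be wrapped in a small function if it is needed in several places. This is only a sketch: `split_by_mask` is a hypothetical helper name, not part of pandas, and it assumes the mask is a boolean Series aligned with the frame's index.

```python
import pandas as pd


def split_by_mask(df, mask):
    """Return (rows where mask is True, rows where mask is False).

    `split_by_mask` is a hypothetical helper, not a pandas function.
    """
    return df[mask], df[~mask]


dframe = pd.DataFrame({"A": list("aaaabbbbccc"), "C": range(1, 12)},
                      index=range(1, 12))

# Rebind `dframe` to the remainder while keeping the extracted rows in `first`.
first, dframe = split_by_mask(dframe, dframe["A"] == "a")
```

Neither piece modifies the original object in place; the name `dframe` is simply rebound to the remainder. If you plan to modify either piece afterwards, calling `.copy()` on it is a common precaution.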

Answered by Joe T. Boka

You can also use drop():

dframe = dframe.drop(dframe.index[dframe.A == 'a'])

Output:

    A   C
5   b   5
6   b   6
7   b   7
8   b   8
9   c   9
10  c  10
11  c  11

If you want to renumber the index so it starts from 0 again, you can do this:

dframe.index = range(len(dframe))

Output:

   A   C
0  b   5
1  b   6
2  b   7
3  b   8
4  c   9
5  c  10
6  c  11
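
As a sketch of the same idea that also keeps the extracted rows, the mask can be computed once and used for both the subset and the drop(). The variable names `mask` and `rest` are illustrative, and `reset_index(drop=True)` is swapped in here as an alternative to assigning a range to the index.

```python
import pandas as pd

dframe = pd.DataFrame({"A": list("aaaabbbbccc"), "C": range(1, 12)},
                      index=range(1, 12))

mask = dframe["A"] == "a"

# Keep the matching rows before dropping them.
first = dframe[mask]

# drop() takes index labels; dframe.index[mask] selects the labels of the matching rows.
rest = dframe.drop(dframe.index[mask])

# reset_index(drop=True) renumbers the rows 0..n-1 and discards the old labels.
rest = rest.reset_index(drop=True)
```

Because drop() removes rows by label, it removes every row that shares a dropped label, which matters if the index contains duplicates.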

Answered by piRSquared

An alternate way to think about it: group the rows by the boolean condition and pull out each group.

gb = dframe.groupby(dframe.A == 'a')
isa, nota = gb.get_group(True), gb.get_group(False)
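
One caveat with this approach is that get_group() raises a KeyError when no row falls into the requested group, for example when no row has A equal to 'a'. A minimal sketch of a tolerant variant, assuming an empty frame is an acceptable fallback (the names `groups` and `empty` are illustrative):

```python
import pandas as pd

dframe = pd.DataFrame({"A": list("aaaabbbbccc"), "C": range(1, 12)},
                      index=range(1, 12))

# Grouping by a boolean Series yields at most two groups, keyed True and False.
groups = {key: group for key, group in dframe.groupby(dframe["A"] == "a")}

# An empty frame with the same columns, used when a group is missing.
empty = dframe.iloc[0:0]

# dict.get() with a default avoids the KeyError from a missing group.
isa = groups.get(True, empty)
nota = groups.get(False, empty)
```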