Python 基于 Pandas 数据框中的多列值选择行

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/29219011/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-19 04:13:57  来源:igfitidea点击:

selecting rows based on multiple column values in pandas dataframe

pythonpandas

提问by Ssank

I have a pandasDataFramedf:

我有一个pandasDataFramedf

import pandas as pd

data = {"Name": ["AAAA", "BBBB"],
        "C1": [25, 12],
        "C2": [2, 1],
        "C3": [1, 10]}

df = pd.DataFrame(data)
df.set_index("Name")

which looks like this when printed (for reference):

打印时看起来像这样(供参考):

      C1  C2  C3
Name            
AAAA  25   2   1
BBBB  12   1  10

I would like to choose rows for which C1, C2and C3have values between 0and 20.

我想为 which 选择行C1C2并且C30和之间具有值20

Can you suggest an elegant way to select those rows?

你能建议一种优雅的方式来选择这些行吗?

采纳答案by kennes

I think below should do it, but its elegance is up for debate.

我认为下面应该这样做,但它的优雅有待商榷。

new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))]

回答by EdChum

Shorter version:

较短的版本:

In [65]:

df[(df>=0)&(df<=20)].dropna()
Out[65]:
   Name  C1  C2  C3
1  BBBB  12   1  10

回答by Rob Buckley

I like to use df.query() for these kind of things

我喜欢用 df.query() 做这些事情

df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20')

回答by braham-snyder

df.query(
    "0 < C1 < 20 and 0 < C2 < 20 and 0 < C3 < 20"
)

or

或者

df.query("0 < @df < 20").dropna()

Using @fooin df.queryrefers to the variable fooin the environment.

使用@fooindf.query是指foo环境中的变量。