Python 基于 Pandas 数据框中的多列值选择行
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/29219011/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
selecting rows based on multiple column values in pandas dataframe
提问by Ssank
I have a pandas
DataFrame
df
:
我有一个pandas
DataFrame
df
:
import pandas as pd
data = {"Name": ["AAAA", "BBBB"],
"C1": [25, 12],
"C2": [2, 1],
"C3": [1, 10]}
df = pd.DataFrame(data)
df.set_index("Name")
which looks like this when printed (for reference):
打印时看起来像这样(供参考):
C1 C2 C3
Name
AAAA 25 2 1
BBBB 12 1 10
I would like to choose rows for which C1
, C2
and C3
have values between 0
and 20
.
我想为 which 选择行C1
,C2
并且C3
在0
和之间具有值20
。
Can you suggest an elegant way to select those rows?
你能建议一种优雅的方式来选择这些行吗?
采纳答案by kennes
I think below should do it, but its elegance is up for debate.
我认为下面应该这样做,但它的优雅有待商榷。
new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))]
回答by EdChum
Shorter version:
较短的版本:
In [65]:
df[(df>=0)&(df<=20)].dropna()
Out[65]:
Name C1 C2 C3
1 BBBB 12 1 10
回答by Rob Buckley
I like to use df.query() for these kind of things
我喜欢用 df.query() 做这些事情
df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20')
回答by braham-snyder
df.query(
"0 < C1 < 20 and 0 < C2 < 20 and 0 < C3 < 20"
)
or
或者
df.query("0 < @df < 20").dropna()
Using @foo
in df.query
refers to the variable foo
in the environment.
使用@foo
indf.query
是指foo
环境中的变量。