Subset of columns and filter Pandas

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/32908038/

Time: 2020-09-13 23:58:00  Source: igfitidea


python pandas

Asked by Simon

Using Pandas, how would I filter rows and take just a subset of columns from a DataFrame in one command?

I am trying to apply something like this:

frame[(frame.DESIGN_VALUE > 20) & (frame['mycol3','mycol6']))]

Thanks.


Answered by EdChum

You can use the boolean condition to generate a mask and pass a list of the columns of interest using loc:

frame.loc[frame['DESIGN_VALUE'] > 20,['mycol3', 'mycol6']]

I advise the above because a single loc call avoids chained indexing, so if you later assign through it you operate on the original frame rather than on a copy. Secondly, I also strongly suggest using [] to select your columns rather than accessing them as attributes via the dot (.) operator; this avoids ambiguities in pandas behaviour, for example when a column name collides with an existing DataFrame attribute or method.
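As a minimal sketch of both points, assuming a hypothetical frame whose column is named count (a name chosen here only because it collides with the DataFrame.count method):

```python
import pandas as pd

# Hypothetical data: the column name 'count' collides with DataFrame.count.
df = pd.DataFrame({'count': [5, 25, 40], 'mycol3': [1, 2, 3]})

# Attribute access resolves to the method, not the column.
attr_is_method = callable(df.count)

# Bracket access always returns the column as a Series.
col = df['count']

# A single loc call with an assignment writes into df itself,
# with no intermediate object from chained indexing.
df.loc[df['count'] > 20, 'mycol3'] = 0
```

After this runs, the rows where count exceeds 20 have mycol3 set to 0 in df itself, while df.count is still the method rather than the column.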

Example:


In [184]:
import numpy as np
import pandas as pd
df = pd.DataFrame(columns=list('abc'), data=np.random.randn(5, 3))
df

Out[184]:
          a         b         c
0 -0.628354  0.833663  0.658212
1  0.032443  1.062135 -0.335318
2 -0.450620 -0.906486  0.015565
3  0.280459 -0.375468 -1.603993
4  0.463750 -0.638107 -1.598261

In [187]:
df.loc[df['a']>0, ['b','c']]

Out[187]:
          b         c
1  1.062135 -0.335318
3 -0.375468 -1.603993
4 -0.638107 -1.598261

This:


frame[(frame.DESIGN_VALUE > 20) & (frame['mycol3','mycol6'])]

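Won't work, because the column sub-selection frame['mycol3','mycol6'] is being combined into the row condition with &. The condition should be a boolean mask over the rows only; the list of columns belongs in the second position of loc.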
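To make the failure concrete, here is a small sketch. The data values are made up for illustration; only the column names come from the question. With ordinary (non-MultiIndex) columns, frame['mycol3','mycol6'] is a lookup of a single label that happens to be a tuple, so it raises KeyError before & is ever evaluated.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
frame = pd.DataFrame({
    'DESIGN_VALUE': [10, 25, 30],
    'mycol3': ['a', 'b', 'c'],
    'mycol6': [1.0, 2.0, 3.0],
})

# frame['mycol3', 'mycol6'] asks for ONE column labelled by the tuple
# ('mycol3', 'mycol6'); no such column exists here, so pandas raises KeyError.
try:
    frame['mycol3', 'mycol6']
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# Row mask in the first position of loc, list of columns in the second.
result = frame.loc[frame['DESIGN_VALUE'] > 20, ['mycol3', 'mycol6']]
```

Here result holds only the rows whose DESIGN_VALUE exceeds 20 and only the two requested columns.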

Answered by Prasanna Raj

This won't work because there is one extra closing parenthesis ')' in

frame[(frame.DESIGN_VALUE > 20) & (frame['mycol3','mycol6']))]

With the parentheses balanced, the expression is:

frame[(frame.DESIGN_VALUE > 20) & (frame['mycol3','mycol6'])]
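Balancing the parentheses fixes only the syntax. As a rough check, the sketch below feeds both expression strings to Python's built-in compile, which parses them without evaluating anything, so no DataFrame named frame is required. The helper name parses is made up for this illustration.

```python
# Both strings are copied from the expressions above; they are parsed, not run.
unbalanced = "frame[(frame.DESIGN_VALUE > 20) & (frame['mycol3','mycol6']))]"
balanced = "frame[(frame.DESIGN_VALUE > 20) & (frame['mycol3','mycol6'])]"


def parses(expr):
    """Return True if expr is syntactically valid Python (hypothetical helper)."""
    try:
        compile(expr, '<expr>', 'eval')
        return True
    except SyntaxError:
        return False


unbalanced_ok = parses(unbalanced)  # the extra ')' makes this a SyntaxError
balanced_ok = parses(balanced)      # parses, but is not guaranteed to run
```

Even though the balanced form parses, on a frame with ordinary columns it would still fail at run time on the frame['mycol3','mycol6'] lookup, so the loc form from the first answer remains the one to use.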