Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors. Original: http://stackoverflow.com/questions/31756340/

Date: 2020-09-13 23:42:58. Source: igfitidea

Selecting rows from a Dataframe based on values in multiple columns in pandas

Tags: python, pandas

Asked by Shane

This question is closely related to another one, and I'll reuse the example from that question's very helpful accepted answer (credit to unutbu):

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
print(df)
#      A      B  C   D
# 0  foo    one  0   0
# 1  bar    one  1   2
# 2  foo    two  2   4
# 3  bar  three  3   6
# 4  foo    two  4   8
# 5  bar    two  5  10
# 6  foo    one  6  12
# 7  foo  three  7  14

print(df.loc[df['A'] == 'foo'])

yields

     A      B  C   D
0  foo    one  0   0
2  foo    two  2   4
4  foo    two  4   8
6  foo    one  6  12
7  foo  three  7  14

But what if I want to pick out all rows that contain both 'foo' and 'one'? Here that would be rows 0 and 6. My attempt was:

print(df.loc[df['A'] == 'foo' and df['B'] == 'one'])

Unfortunately, this does not work. Can anybody suggest a way to do this? Ideally it would be general enough to handle a more complex set of conditions involving both 'and' and 'or', though I don't actually need that for my purposes.

Answered by joris

Only a very small change is needed in your code: replace and with & (and add parentheses, because & binds more tightly than ==):

In [104]: df.loc[(df['A'] == 'foo') & (df['B'] == 'one')]
Out[104]:
     A    B  C   D
0  foo  one  0   0
6  foo  one  6  12

The reason you have to use & is that it combines the two comparisons element-wise, whereas and expects two expressions that each evaluate to a single True or False.
Similarly, when you want an or comparison, use | instead.
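
To make the difference concrete, here is a minimal sketch using the df from the question. It assumes a reasonably recent pandas. The names mask_and, mask_or, mask_not and complex_mask are only illustrative, and ~ (element-wise not) is an addition that the answer itself does not mention.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# `and` asks each Series for one overall truth value, which pandas refuses
# to provide, so this raises ValueError.
try:
    df.loc[(df['A'] == 'foo') and (df['B'] == 'one')]
    and_failed = False
except ValueError as exc:
    and_failed = True
    print('and failed:', exc)

# & (and), | (or) and ~ (not) work element-wise and return boolean Series.
mask_and = (df['A'] == 'foo') & (df['B'] == 'one')
mask_or = (df['A'] == 'bar') | (df['C'] > 5)
mask_not = ~(df['B'] == 'two')

# Masks can be nested to express more complex conditions.
complex_mask = ((df['A'] == 'foo') & (df['B'] == 'one')) | (df['D'] == 6)

print(df.loc[mask_and].index.tolist())      # [0, 6]
print(df.loc[mask_or].index.tolist())       # [1, 3, 5, 6, 7]
print(df.loc[mask_not].index.tolist())      # [0, 1, 3, 6, 7]
print(df.loc[complex_mask].index.tolist())  # [0, 3, 6]
```

Keeping every comparison in its own parentheses avoids the precedence problem, since &, | and ~ all bind more tightly than ==.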

Answered by Geeocode

You can also do this with a small change to your code:

print(df[df['A'] == 'foo'][df['B'] == 'one'])

Output:

     A    B  C   D
0  foo  one  0   0
6  foo  one  6  12
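
One caveat with the chained form: the second mask is computed on the full df but applied to the already-filtered frame, so pandas has to realign it by index. In the versions I am aware of, it emits a UserWarning saying the boolean Series key will be reindexed. The sketch below compares the chained form with a single combined mask and with DataFrame.query. The names chained, combined and queried are illustrative, and the warning is silenced here only to keep the output clean.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Chained indexing: filter on A first, then index the result with a mask
# built from the full df. pandas realigns the mask to the filtered index.
with warnings.catch_warnings():
    warnings.simplefilter('ignore', UserWarning)
    chained = df[df['A'] == 'foo'][df['B'] == 'one']

# A single combined mask selects the same rows in one step.
combined = df.loc[(df['A'] == 'foo') & (df['B'] == 'one')]

# DataFrame.query accepts the words `and` / `or` inside the expression string.
queried = df.query("A == 'foo' and B == 'one'")

print(chained.equals(combined))   # True
print(queried.equals(combined))   # True
print(combined.index.tolist())    # [0, 6]
```

All three return the same rows here. The single-mask form is usually the safer default, because it does not depend on index realignment.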