Original question: http://stackoverflow.com/questions/31756340/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Selecting rows from a Dataframe based on values in multiple columns in pandas
Asked by Shane
This question is closely related to another one, and I'll even use the example from the very helpful accepted solution to that question. Here's the example from the accepted solution (credit to unutbu):
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
print(df)
#      A      B  C   D
# 0  foo    one  0   0
# 1  bar    one  1   2
# 2  foo    two  2   4
# 3  bar  three  3   6
# 4  foo    two  4   8
# 5  bar    two  5  10
# 6  foo    one  6  12
# 7  foo  three  7  14
print(df.loc[df['A'] == 'foo'])
yields
     A      B  C   D
0  foo    one  0   0
2  foo    two  2   4
4  foo    two  4   8
6  foo    one  6  12
7  foo  three  7  14
But what if I want to pick out all rows that contain both 'foo' and 'one'? Here that would be rows 0 and 6. My attempt was:
print(df.loc[df['A'] == 'foo' and df['B'] == 'one'])
Unfortunately, this does not work. Can anybody suggest a way to implement something like this? Ideally it would be general enough to handle a more complex set of conditions involving both and and or, though I don't actually need that for my purposes.
Answered by joris
Only a very small change is needed in your code: replace the and operator with & (and add parentheses around each comparison, because & binds more tightly than ==):
In [104]: df.loc[(df['A'] == 'foo') & (df['B'] == 'one')]
Out[104]:
     A    B  C   D
0  foo  one  0   0
6  foo  one  6  12
The reason you have to use & is that it performs the comparison element-wise on arrays, whereas and expects two operands that each evaluate to a single True or False.
Similarly, when you want an or comparison, you can use | instead.
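To make the difference concrete, here is a minimal sketch that reuses the df from the question. The variable names (and_error, both, either, mixed) are only illustrative, and the last mask is a made-up example of mixing the two operators rather than something from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Python's `and` asks each Series for one truth value, which pandas refuses.
try:
    df.loc[df['A'] == 'foo' and df['B'] == 'one']
    and_error = None
except ValueError as exc:
    and_error = exc

# & combines the two boolean Series element by element.
both = df.loc[(df['A'] == 'foo') & (df['B'] == 'one')]

# | keeps rows where at least one of the conditions holds.
either = df.loc[(df['A'] == 'bar') | (df['B'] == 'three')]

# The operators can be nested, as long as each comparison is parenthesized.
mixed = df.loc[((df['A'] == 'foo') & (df['B'] == 'one')) | (df['C'] >= 7)]

print(type(and_error).__name__)  # ValueError
print(both.index.tolist())       # [0, 6]
print(either.index.tolist())     # [1, 3, 5, 7]
print(mixed.index.tolist())      # [0, 6, 7]
```

Building each mask inside its own parentheses keeps the precedence explicit, which matters once several conditions are combined.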
Answered by Geeocode
You can do this with a small change to your code:
print(df[df['A'] == 'foo'][df['B'] == 'one'])
Output:
     A    B  C   D
0  foo  one  0   0
6  foo  one  6  12
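One thing to be aware of with the chained form: the second mask is built from the full df, not from the already filtered frame, so pandas has to realign it, and as far as I know it may emit a UserWarning about reindexing the boolean key when it does. The sketch below checks that the chained result matches a single combined mask. The names chained, combined and aligned are only illustrative, and the .loc version with a callable is an alternative I'm suggesting, not part of the answer above.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Chained form from the answer; any reindexing warning is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    chained = df[df['A'] == 'foo'][df['B'] == 'one']

# Single combined mask, for comparison.
combined = df[(df['A'] == 'foo') & (df['B'] == 'one')]

# Chaining with a callable builds the second mask from the filtered frame,
# so its index already matches and no realignment is needed.
aligned = df[df['A'] == 'foo'].loc[lambda d: d['B'] == 'one']

print(chained.equals(combined))  # True
print(aligned.equals(combined))  # True
```

All three select rows 0 and 6 here; the combined mask does it in a single indexing step.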

