Python pandas: Getting the locations of a value in dataframe
Source: http://stackoverflow.com/questions/28979794/
License: This page reproduces a Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow and keep the same license.
Asked by hlin117
Suppose I have the following dataframe:
   'a'  'b'
0    0    0
1    1    0
2    0    1
3    0    1
Is there a way I could get the index/column values for which a specific value exists? For example, something akin to the following:
values = df.search(1)
would have values = [(1, 'a'), (2, 'b'), (3, 'b')].
Accepted answer by ely
If you don't mind working with a NumPy array whose first column gives the row position and whose second column gives the position of the column name within df.columns, then it's very short:
In [11]: np.argwhere(df)
Out[11]:
array([[1, 0],
       [2, 1],
       [3, 1]])
If you want to format this into the list of tuples with actual column names, you can further do:
In [12]: [(x, df.columns[y]) for x,y in np.argwhere(df)]
Out[12]: [(1, 'a'), (2, 'b'), (3, 'b')]
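For reference, here is a self-contained sketch of the same idea that can be run as a script. It rebuilds the example frame from the question and, as an assumption not made in the answer, converts the boolean mask to a plain NumPy array first, so the result does not depend on how np.argwhere treats a DataFrame argument. The names positions and locations are only illustrative.

```python
import numpy as np
import pandas as pd

# The example frame from the question.
df = pd.DataFrame({'a': [0, 1, 0, 0], 'b': [0, 0, 1, 1]})

# Build the boolean mask and hand argwhere a plain NumPy array.
positions = np.argwhere((df == 1).to_numpy())

# Map each (row position, column position) pair to (row position, column name).
locations = [(int(r), df.columns[c]) for r, c in positions]
print(locations)  # [(1, 'a'), (2, 'b'), (3, 'b')]
```

The pairs come out in row-major order because np.argwhere scans the array row by row.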
You can use this same approach with logical expressions inside np.argwhere. For example, say you have this DataFrame of random data:
In [13]: dfrm
Out[13]:
          A         B         C
0  0.382531  0.287066  0.345749
1  0.725201  0.450656  0.336720
2  0.146883  0.266518  0.011339
3  0.111154  0.190367  0.275750
4  0.757144  0.283361  0.736129
5  0.039405  0.643290  0.383777
6  0.632230  0.434664  0.094089
7  0.658512  0.368150  0.433340
8  0.062180  0.523572  0.505400
9  0.287539  0.899436  0.194938
[10 rows x 3 columns]
Then you could do this, for example:
In [14]: [(x, dfrm.columns[y]) for x,y in np.argwhere(dfrm > 0.8)]
Out[14]: [(9, 'B')]
As a search function, it can be defined like this:
def search(df, df_condition):
    return [(x, df.columns[y]) for x,y in np.argwhere(df_condition(df))]
For example:
In [17]: search(dfrm, lambda x: x > 0.8)
Out[17]: [(9, 'B')]
In [18]: search(df, lambda x: x == 1)
Out[18]: [(1, 'a'), (2, 'b'), (3, 'b')]
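Note that np.argwhere reports row positions, which coincide with the index labels only when the frame has the default 0..n-1 index, as in the examples above. The following is a minimal sketch of a variant that returns index labels instead. The function name search_labels and the string index are hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

def search_labels(df, df_condition):
    """Return (index label, column name) pairs where df_condition(df) is True."""
    # Evaluate the condition and convert the resulting mask to a NumPy array.
    mask = df_condition(df).to_numpy()
    # Translate positions into labels through df.index and df.columns.
    return [(df.index[r], df.columns[c]) for r, c in np.argwhere(mask)]

# The question's data, but with a non-default string index (hypothetical).
df = pd.DataFrame({'a': [0, 1, 0, 0], 'b': [0, 0, 1, 1]},
                  index=['w', 'x', 'y', 'z'])
result = search_labels(df, lambda x: x == 1)
print(result)  # [('x', 'a'), ('y', 'b'), ('z', 'b')]
```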
Answer by Alex
df[df == 1].stack().index.tolist()
yields
[(1, 'a'), (2, 'b'), (3, 'b')]
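This one-liner relies on stack() discarding the NaN cells that df[df == 1] leaves behind. A sketch of a variant that does not depend on that behavior is to stack the boolean mask itself and keep only the True entries. The variable names here are illustrative.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame({'a': [0, 1, 0, 0], 'b': [0, 0, 1, 1]})

# Stack the boolean mask into a Series indexed by (row label, column name).
stacked = (df == 1).stack()

# Keep only the True entries and read the locations off the MultiIndex.
locations = stacked[stacked].index.tolist()
print(locations)
```

For the example frame this should give the same three pairs as above, and because it works with the index itself, it returns row labels rather than positions.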
Answer by Liam Foley
Use pd.melt plus some additional munging.
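import pandas as pd
df = pd.DataFrame({'a': [0, 1, 0, 0],
                   'b': [0, 0, 1, 1]})
# Reshape to long form: one row per (index, variable, value) triple.
df1 = pd.melt(df.reset_index(), id_vars=['index'])
df1 = df1[df1['value'] == 1]
# list() materializes the pairs; in Python 3, zip alone returns a lazy iterator.
locations = list(zip(df1['index'], df1['variable']))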
Output:
[(1, 'a'), (2, 'b'), (3, 'b')]

