Python Pandas - find the index of a value anywhere in a DataFrame

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/42386629/

Date: 2020-08-19 21:38:19  Source: igfitidea

Pandas - find index of value anywhere in DataFrame

python, pandas

Asked by Kemeia

I'm new to Python & Pandas.

I want to find the index of a certain value (let's say security_id) in my pandas dataframe, because that is where the columns start. (There is an unknown number of rows with irrelevant data above the columns, as well as a number of empty 'columns' on the left side.)

As far as I can see, the isin method only returns a boolean indicating whether the value exists, not its index.

How do I find the index of this value?

Accepted answer by Ujjwal

Suppose your DataFrame looks like the following:

      0       1            2      3    4
0     a      er          tfr    sdf   34
1    rt     tyh          fgd    thy  rer
2     1       2            3      4    5
3     6       7            8      9   10
4   dsf     wew  security_id   name  age
5   dfs    bgbf          121  jason   34
6  dddp    gpot         5754   mike   37
7  fpoo  werwrw          342  Hyman   31

Do the following:

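for row in range(df.shape[0]):  # df is the DataFrame
    for col in range(df.shape[1]):
        # df.iat does positional scalar access; df.get_value was removed in pandas 1.0
        if df.iat[row, col] == 'security_id':
            print(row, col)
            break  # leaves only the inner loop; later rows are still scanned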
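
Because the inner `break` only leaves the column loop, the loop above keeps scanning the remaining rows. The following is a minimal sketch, not part of the original answer, of one way to stop at the first match and then use it for what the question is after: treating that row as the header. The helper name `find_first` and the variable `table` are illustrative assumptions, and the sample frame is rebuilt from the table shown earlier.

```python
import pandas as pd

# Sample frame rebuilt from the table above (mixed types, so every column is object dtype)
df = pd.DataFrame([
    ['a', 'er', 'tfr', 'sdf', 34],
    ['rt', 'tyh', 'fgd', 'thy', 'rer'],
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    ['dsf', 'wew', 'security_id', 'name', 'age'],
    ['dfs', 'bgbf', 121, 'jason', 34],
    ['dddp', 'gpot', 5754, 'mike', 37],
    ['fpoo', 'werwrw', 342, 'Hyman', 31],
])


def find_first(frame, value):
    """Hypothetical helper: return (row, col) positions of the first match, or None."""
    for row in range(frame.shape[0]):
        for col in range(frame.shape[1]):
            if frame.iat[row, col] == value:
                return row, col  # returning exits both loops at once
    return None


pos = find_first(df, 'security_id')
if pos is not None:
    row, col = pos
    # Keep everything below the header row and to the right of the empty columns
    table = df.iloc[row + 1:, col:].copy()
    table.columns = df.iloc[row, col:].tolist()
    table = table.reset_index(drop=True)
    print(table)
```

Returning from a function is used here instead of a flag variable simply because it exits both loops in one step; any equivalent early-exit pattern would do.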

Answer by Ravishankar Sivasubramaniam

Get the index labels of the rows in which any column matches the search term:

search = 'security_id' 
df.loc[df.isin([search]).any(axis=1)].index.tolist()

Filter the rows in which any column matches the search term:

search = 'search term' 
df.loc[df.isin([search]).any(axis=1)]
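
The two expressions above report rows only. As a hedged sketch, assuming the sample frame from the accepted answer, the same `isin` mask can be reduced along the other axis to recover the column labels as well. The variable names `mask`, `rows` and `cols` are illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the accepted answer's table
df = pd.DataFrame([
    ['a', 'er', 'tfr', 'sdf', 34],
    ['rt', 'tyh', 'fgd', 'thy', 'rer'],
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    ['dsf', 'wew', 'security_id', 'name', 'age'],
    ['dfs', 'bgbf', 121, 'jason', 34],
    ['dddp', 'gpot', 5754, 'mike', 37],
    ['fpoo', 'werwrw', 342, 'Hyman', 31],
])

search = 'security_id'
mask = df.isin([search])                        # boolean frame, same shape as df

rows = df.index[mask.any(axis=1)].tolist()      # labels of rows with at least one match
cols = df.columns[mask.any(axis=0)].tolist()    # labels of columns with at least one match

print(rows, cols)  # [4] [2]
```

Note that with several matches, `rows` and `cols` are reported separately, so this does not by itself say which row pairs with which column.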

Answer by Jay

If the value you are looking for is not duplicated:

# matrix is the DataFrame; minv is the value being searched for
poz=matrix[matrix==minv].dropna(axis=1,how='all').dropna(how='all')
value=poz.iloc[0,0]
index=poz.index.item()
column=poz.columns.item()

You can then read off its index and column labels.
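
The snippet above leaves `matrix` and `minv` undefined. A small self-contained sketch of the same masking idea follows, under the assumption (suggested by the name, but not stated in the answer) that `minv` is the minimum value in the frame. The data values are made up for illustration.

```python
import pandas as pd

# Illustrative data: the minimum (2) occurs exactly once
matrix = pd.DataFrame([[5, 3], [2, 7]], index=['q', 'g'], columns=['f', 'h'])
minv = matrix.min().min()  # assumption: minv is the overall minimum

# Non-matching cells become NaN; dropping all-NaN columns and rows leaves a 1x1 frame
poz = matrix[matrix == minv].dropna(axis=1, how='all').dropna(how='all')

value = poz.iloc[0, 0]
index = poz.index.item()      # .item() raises if there is more than one label
column = poz.columns.item()

print(index, column)  # g f
```

Since `.item()` raises when more than one label remains, this form doubles as a check that the value really is unique.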

If the value is duplicated:

# minv is assumed to hold the value being searched for (1 in this example)
matrix=pd.DataFrame([[1,1],[1,np.nan]],index=['q','g'],columns=['f','h'])
matrix
Out[83]: 
   f    h
q  1  1.0
g  1  NaN
poz=matrix[matrix==minv].dropna(axis=1,how='all').dropna(how='all')
index=poz.stack().dropna().index.tolist()
index
Out[87]: [('q', 'f'), ('q', 'h'), ('g', 'f')]

You will get a list of (index, column) tuples.

Answer by Peterd

A one-liner solution that avoids explicit loops:

  • returning the entire row(s)

    df.iloc[np.flatnonzero((df=='security_id').values)//df.shape[1],:]

  • returning row(s) and column(s)

    df.iloc[ np.flatnonzero((df=='security_id').values)//df.shape[1], np.unique(np.flatnonzero((df=='security_id').values)%df.shape[1]) ]
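
The one-liners above rely on the fact that `np.flatnonzero` returns positions in the row-major flattened array, so integer division by the number of columns gives the row position and the remainder gives the column position. A short sketch of that arithmetic, assuming the sample frame from the accepted answer (the names `flat`, `rows` and `cols` are illustrative):

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the accepted answer's table (8 rows x 5 columns)
df = pd.DataFrame([
    ['a', 'er', 'tfr', 'sdf', 34],
    ['rt', 'tyh', 'fgd', 'thy', 'rer'],
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    ['dsf', 'wew', 'security_id', 'name', 'age'],
    ['dfs', 'bgbf', 121, 'jason', 34],
    ['dddp', 'gpot', 5754, 'mike', 37],
    ['fpoo', 'werwrw', 342, 'Hyman', 31],
])

n_cols = df.shape[1]

# Positions of matches in the flattened (row-major) boolean array
flat = np.flatnonzero((df == 'security_id').to_numpy())

# flat = row * n_cols + col, so divmod recovers both positions at once
rows, cols = np.divmod(flat, n_cols)

print(flat, rows, cols)    # [22] [4] [2]
print(df.iloc[rows, cols])
```

Here the match sits at row 4, column 2 of a 5-column frame, hence flat position 4 * 5 + 2 = 22.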

Answer by Adam Slack

I think this question may have been asked before here. The accepted answer is pretty comprehensive and should help you find the index of a value in a column.

Edit: if the column that the value exists in is not known, then you could use:

for col in df.columns:
    # print the index labels of the rows where this column holds the value
    print(col, df[df[col] == 'security_id'].index.tolist())

Answer by tasos bada

This function finds the positions of a value in a DataFrame:

import pandas as pd

def pandasFindPositionsInDataframe(dfIn, findme):
    """Return [row_position, column_position] for the first match in each row."""
    positions = []
    for irow in range(len(dfIn.index)):
        # column labels in this row whose cell equals findme
        list_colPositions = dfIn.columns[dfIn.iloc[irow, :] == findme].tolist()
        if list_colPositions:
            # only the first matching column of the row is recorded
            colu_iloc = dfIn.columns.get_loc(list_colPositions[0])
            positions.append([irow, colu_iloc])

    return positions
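
The function above records at most one column per row. If every occurrence is wanted, one possible variant, sketched here with `np.argwhere` rather than taken from the original answer, is shown below. The name `find_all_positions` and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def find_all_positions(df_in, findme):
    """Hypothetical variant: [row_position, column_position] for every matching cell."""
    mask = (df_in == findme).to_numpy()
    # argwhere lists the coordinates of True cells in row-major order
    return np.argwhere(mask).tolist()


# Illustrative data with the value repeated within and across rows
sample = pd.DataFrame([
    ['x', 'security_id', 'security_id'],
    ['y', 'z', 'security_id'],
])

print(find_all_positions(sample, 'security_id'))  # [[0, 1], [0, 2], [1, 2]]
```

Like the original function, this returns integer positions (suitable for `iloc`), not index or column labels.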