Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/45042005/

Time: 2020-09-14 03:58:33  Source: igfitidea

Python/Pandas return column and row index of found string

python-3.x pandas

Asked by codeninja

I've searched previous answers relating to this but those answers seem to utilize numpy because the array contains numbers. I am trying to search for a keyword in a sentence in a dataframe ('Timeframe') where the full sentence is 'Timeframe for wave in ____' and would like to return the column and row index. For example:

    df.iloc[34,0] 

returns the string I am looking for, but I want to avoid hard-coding the position so the lookup stays dynamic. Is there a way to return [34,0] when I search the dataframe for the keyword 'Timeframe'?

Answer by jezrael

EDIT:

To get the index, use str.contains with boolean indexing. There are then three possible outcomes (no match, one match, or several matches):

import pandas as pd

df = pd.DataFrame({'A':['Timeframe for wave in ____', 'a', 'c']})
print (df)
                            A
0  Timeframe for wave in ____
1                           a
2                           c



def check(val):
    a = df.index[df['A'].str.contains(val)]
    if a.empty:
        return 'not found'
    elif len(a) > 1:
        return a.tolist()
    else:
        #only one value - return scalar  
        return a.item()
print (check('Timeframe'))
0

print (check('a'))
[0, 1]

print (check('rr'))
not found
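
One detail worth noting about the approach above: str.contains interprets its pattern as a regular expression by default, and it returns NaN for missing values. The following is a minimal sketch, not part of the original answer, assuming you want a literal text search that always returns a list. The helper name find_rows and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: one sentence, one string with a dot, one missing value
df = pd.DataFrame({'A': ['Timeframe for wave in ____', 'a.b', None]})

def find_rows(frame, column, keyword):
    """Hypothetical helper: index labels of rows whose text contains keyword."""
    # regex=False treats keyword as literal text, so '.' matches only a dot
    # na=False counts missing values as "no match" instead of NaN
    mask = frame[column].str.contains(keyword, regex=False, na=False)
    return frame.index[mask.to_numpy()].tolist()

print(find_rows(df, 'A', 'Timeframe'))
print(find_rows(df, 'A', '.'))
print(find_rows(df, 'A', 'rr'))
```

Returning a list in every case (empty when nothing matches) is a design choice made here so that callers do not have to handle three different return types.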

Old solution:

It seems you need numpy.where to check for the value Timeframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,'Timeframe'],
                   'D':[1,3,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'F':list('aaabbb')})

print (df)
   A  B          C  D  E  F
0  a  4          7  1  5  a
1  b  5          8  3  3  a
2  c  4          9  5  6  a
3  d  5          4  7  9  b
4  e  5          2  1  2  b
5  f  4  Timeframe  0  4  b


a = np.where(df.values == 'Timeframe')
print (a)
(array([5], dtype=int64), array([2], dtype=int64))

b = [x[0] for x in a]
print (b)
[5, 2]
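
The list comprehension above keeps only the first match. If the value can appear more than once, a sketch along the following lines may help. It assumes an exact (whole-cell) match, uses np.argwhere instead of np.where, and the sample frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the value in two different cells
df = pd.DataFrame({'A': ['x', 'Timeframe', 'y'],
                   'B': [1, 2, 'Timeframe']})

# Compare every cell with the exact value; mixed columns give an object array
matches = df.to_numpy() == 'Timeframe'

# np.argwhere returns one [row, column] pair per True cell
positions = [(int(r), int(c)) for r, c in np.argwhere(matches)]
print(positions)

# Translate integer positions into index and column labels
labels = [(df.index[r], df.columns[c]) for r, c in positions]
print(labels)
```

The integer pairs in positions can be passed to df.iloc, while the pairs in labels are suited to df.loc.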

Answer by Cedric Zoppolo

If you have multiple columns to search, you can use the following code example:

import numpy as np
import pandas as pd
df = pd.DataFrame([[1,2,3,4],["a","b","Timeframe for wave in____","d"],[5,6,7,8]])
mask = np.column_stack([df[col].str.contains("Timeframe", na=False) for col in df])
find_result = np.where(mask)
result = [find_result[0][0], find_result[1][0]]

The output for df and result would then be:

>>> df
   0  1                          2  3
0  1  2                          3  4
1  a  b  Timeframe for wave in____  d
2  5  6                          7  8
>>> result
[1, 2]
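
Note that result holds only the first match. As a rough sketch, assuming you want every matching cell and a literal (non-regex) search, the same idea can be wrapped in a function. The name find_all and the extra 'Timeframe' cell in the sample data are hypothetical additions for illustration.

```python
import numpy as np
import pandas as pd

# Same layout as above, with a second hypothetical match in the last row
df = pd.DataFrame([[1, 2, 3, 4],
                   ["a", "b", "Timeframe for wave in____", "d"],
                   ["Timeframe", 6, 7, 8]])

def find_all(frame, keyword):
    """Hypothetical helper: (row, column) positions of cells containing keyword."""
    # Cast each column to str so numeric cells can be searched as text
    mask = np.column_stack([
        frame[col].astype(str).str.contains(keyword, regex=False)
        for col in frame
    ])
    # One (row, column) pair per matching cell, in row-major order
    return [(int(r), int(c)) for r, c in np.argwhere(mask)]

print(find_all(df, "Timeframe"))
```

Each pair can be passed straight to df.iloc, which matches the df.iloc[34,0] form in the question. One caveat of casting with astype(str): missing values become the text 'nan', so a keyword such as 'nan' would match them.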