Find the index of a string value in a pandas DataFrame

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/46453275/
Time: 2020-09-14 04:32:24  Source: igfitidea

python pandas

Asked by Ben

How can I identify which column(s) in my DataFrame contain a specific string 'foo'?

Sample DataFrame:

>>> import pandas as pd
>>> df = pd.DataFrame({'A':[10,20,42], 'B':['foo','bar','blah'],'C':[3,4,5], 'D':['some','foo','thing']})

I want to find B and D here.

I can search for numbers:

If I'm looking for a number (e.g. 42) instead of a string, I can generate a boolean mask like this:

>>> ~(df.where(df==42)).isnull().all()

A     True
B    False
C    False
D    False
dtype: bool

but not strings:

>>> ~(df.where(df=='foo')).isnull().all()

TypeError: Could not compare ['foo'] with block values

I don't want to iterate over each column and row if possible (my actual data is much larger than this example). It feels like there should be a simple and efficient way.

How can I do this?

Accepted answer by Divakar

One way is to compare against the underlying array data:

df.columns[(df.values=='foo').any(0)].tolist()

Sample run -

In [209]: df
Out[209]: 
    A     B  C      D
0  10   foo  3   some
1  20   bar  4    foo
2  42  blah  5  thing

In [210]: df.columns[(df.values=='foo').any(0)].tolist()
Out[210]: ['B', 'D']

If you only need the column mask:

In [205]: (df.values=='foo').any(0)
Out[205]: array([False,  True, False,  True], dtype=bool)
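
Since the title asks for the index of the value, here is a minimal sketch that extends the array comparison above to recover row and column labels as well. It assumes the sample DataFrame from the question; the names `mask`, `columns` and `locations` are only illustrative and do not come from the answer.

```python
import numpy as np
import pandas as pd

# Sample DataFrame from the question.
df = pd.DataFrame({'A': [10, 20, 42], 'B': ['foo', 'bar', 'blah'],
                   'C': [3, 4, 5], 'D': ['some', 'foo', 'thing']})

# Element-wise comparison on the underlying (object-dtype) array
# gives a 2-D boolean array with the same shape as df.
mask = df.values == 'foo'

# Columns that contain 'foo' anywhere (same idea as the answer above).
columns = df.columns[mask.any(axis=0)].tolist()

# Positions of every match, translated back to index/column labels.
rows, cols = np.nonzero(mask)
locations = [(df.index[r], df.columns[c]) for r, c in zip(rows, cols)]

print(columns)
print(locations)
```

With the sample data, `locations` pairs each matching row label with its column label, so it can be used directly with `df.loc`.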

Answer by YOBEN_S

Option 1: df.values

~(df.where(df.values=='foo')).isnull().all()

Out[91]: 
A    False
B     True
C    False
D     True
dtype: bool

Option 2: isin

~(df.where(df.isin(['foo']))).isnull().all()
Out[94]: 
A    False
B     True
C    False
D     True
dtype: bool
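
As a possible simplification of Option 2, `isin` already returns a boolean DataFrame, so reducing it with `any()` should give the same per-column mask without the `where`/`isnull` round trip. This is only a sketch on the question's sample data; `col_mask`, `matching` and `multi` are illustrative names.

```python
import pandas as pd

# Sample DataFrame from the question.
df = pd.DataFrame({'A': [10, 20, 42], 'B': ['foo', 'bar', 'blah'],
                   'C': [3, 4, 5], 'D': ['some', 'foo', 'thing']})

# Boolean Series indexed by column name: True where the column holds 'foo'.
col_mask = df.isin(['foo']).any()

# Names of the matching columns.
matching = col_mask[col_mask].index.tolist()

# isin takes a list, so several values can be searched for at once.
multi = df.isin(['foo', 'bar']).any()

print(col_mask)
print(matching)
```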

Answer by rko

Unfortunately, the syntax you gave won't match a str. The data has to be treated as a Series of string type before it can be compared with a string, unless I am missing something.

Try this:

~df101.where(df101.isin(['foo'])).isnull().all()
A    False
B     True
C    False
D     True
dtype: bool
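
One way to read the "compare as string type" idea literally is to cast the whole frame to strings before comparing. The sketch below is an assumption about what that could look like, not the answerer's code, and it uses the question's `df` rather than `df101`. Note the side effect: after the cast, numbers match their string form too.

```python
import pandas as pd

# Sample DataFrame from the question.
df = pd.DataFrame({'A': [10, 20, 42], 'B': ['foo', 'bar', 'blah'],
                   'C': [3, 4, 5], 'D': ['some', 'foo', 'thing']})

# Cast every column to str, then compare element-wise and reduce per column.
as_text = df.astype(str)
has_foo = as_text.eq('foo').any()

# Caveat: the integer 42 in column A now matches the string '42'.
has_42 = as_text.eq('42').any()

print(has_foo)
print(has_42)
```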