Python Pandas: String Contains and Doesn't Contain

Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/34055584/

Date: 2020-09-14 00:18:47  Source: igfitidea

Tags: python, pandas, dataframe

Asked by Sam Perry

I'm trying to match rows of a Pandas DataFrame that contain certain strings and don't contain others. For example:

import pandas
df = pandas.Series(['ab1', 'ab2', 'b2', 'c3'])
df[df.str.contains("b")]

Output:

0    ab1
1    ab2
2     b2
dtype: object

Desired output:

2     b2
dtype: object

Question: is there an elegant way of saying something like this?

df[[df.str.contains("b")==True] and [df.str.contains("a")==False]]
# Doesn't give desired outcome

Answered by maxymoo

You're almost there; you just haven't got the syntax quite right. It should be:

df[(df.str.contains("b") == True) & (df.str.contains("a") == False)]
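The corrected line works because `&` combines boolean Series element by element, while Python's `and` asks for a single truth value, which a Series cannot provide. A minimal sketch of the same filter, assuming the sample data from the question (the names `s`, `has_b`, `has_a`, and `and_raised` are illustrative, not from the original):

```python
import pandas as pd

s = pd.Series(['ab1', 'ab2', 'b2', 'c3'])

has_b = s.str.contains("b")  # boolean mask: rows containing "b"
has_a = s.str.contains("a")  # boolean mask: rows containing "a"

# Plain `and` needs one truth value per operand, so pandas raises ValueError.
try:
    has_b and has_a
    and_raised = False
except ValueError:
    and_raised = True

# `&` is element-wise AND and `~` is element-wise NOT.
mask = has_b & ~has_a
result = s[mask]
print(result)
```

Naming the intermediate masks also removes the need for the `== True` and `== False` comparisons, since the masks are already boolean.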

Another approach, which might be cleaner if you have a lot of conditions to apply, would be to chain your filters together with reduce or a loop:

from functools import reduce
filters = [("a", False), ("b", True)]
reduce(lambda df, f: df[df.str.contains(f[0]) == f[1]], filters, df)
#outputs b2
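The loop variant mentioned above could look roughly like the following. This is a sketch rather than the answerer's code: it accumulates one combined mask instead of re-filtering the Series at each step, and it passes `regex=False` on the assumption that the filter strings are meant as literal substrings. The names `s`, `mask`, `substring`, and `should_contain` are illustrative.

```python
import pandas as pd

s = pd.Series(['ab1', 'ab2', 'b2', 'c3'])

# Each pair is (substring, whether the row should contain it).
filters = [("a", False), ("b", True)]

# Start with every row selected, then narrow the mask one condition at a time.
mask = pd.Series(True, index=s.index)
for substring, should_contain in filters:
    mask &= s.str.contains(substring, regex=False) == should_contain

result = s[mask]
print(result)
```

Building a single mask keeps the original index intact throughout, so the same mask can be reused to filter other objects aligned to `s`.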

Answered by behzad.nouri

Either:

>>> ts.str.contains('b') & ~ts.str.contains('a')
0    False
1    False
2     True
3    False
dtype: bool

or use regex:

>>> ts.str.contains('^[^a]*b[^a]*$')
0    False
1    False
2     True
3    False
dtype: bool
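To see that the two forms select the same rows, the masks can be compared directly. The sketch below assumes a slightly larger sample than the question's (the extra values `'ba'` and `'bob'` are made up for illustration), and the names `ts`, `combined`, and `by_regex` are illustrative as well:

```python
import pandas as pd

# Sample data: the question's values plus two hypothetical extras.
ts = pd.Series(['ab1', 'ab2', 'b2', 'c3', 'ba', 'bob'])

# Two boolean masks combined element-wise.
combined = ts.str.contains('b') & ~ts.str.contains('a')

# One regex: the whole string has no "a" and at least one "b".
by_regex = ts.str.contains('^[^a]*b[^a]*$')

print(combined.equals(by_regex))
print(ts[by_regex])
```

The combined-mask form tends to be easier to extend when more substrings are added, while the regex keeps everything in one call.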

Answered by lstodd

You can use .loc and ~ to index:

df.loc[(df.str.contains("b")) & (~df.str.contains("a"))]

2    b2
dtype: object
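One caveat not covered in the answers: if the data contains missing values, `str.contains` may return missing entries in the mask, which can break `~` or boolean indexing. A minimal sketch, assuming a hypothetical Series with a `None` entry, that uses the `na=False` parameter of `str.contains` to treat missing values as non-matches:

```python
import pandas as pd

# Hypothetical data with a missing value in place of 'ab2'.
s = pd.Series(['ab1', None, 'b2', 'c3'])

# na=False makes missing entries count as "does not contain".
has_b = s.str.contains("b", na=False)
has_a = s.str.contains("a", na=False)

result = s.loc[has_b & ~has_a]
print(result)
```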