Selecting rows in Pandas that do not contain a specific character

Question

Asked by Arnold Klein

I need something similar to

.str.startswith() 
.str.endswith()

but for the middle part of a string.

For example, given the following pd.DataFrame:

      str_name
   0    aaabaa
   1    aabbcb
   2    baabba
   3    aacbba
   4    baccaa
   5    ababaa

I need to drop rows 1, 3 and 4, each of which contains at least one letter 'c'.
The position of the letter ('c') within the string is not known.
The task is to remove every row that contains at least one occurrence of a specific letter.

Answer 1

Answered by juanpa.arrivillaga

You want df['string_column'].str.contains('c'), which returns a boolean mask:

>>> df
  str_name
0   aaabaa
1   aabbcb
2   baabba
3   aacbba
4   baccaa
5   ababaa
>>> df['str_name'].str.contains('c')
0    False
1     True
2    False
3     True
4     True
5    False
Name: str_name, dtype: bool

Now you can "delete" the matching rows by keeping only those where the mask is False:

>>> df = df[~df['str_name'].str.contains('c')]
>>> df
  str_name
0   aaabaa
2   baabba
5   ababaa
>>>
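The filter above can be sketched as a self-contained script. Two details here are assumptions, not part of the original answer: the extra row holding None is added only to show how missing values behave, and the regex=False and na=False arguments are optional choices. By default str.contains interprets the pattern as a regular expression and returns a missing value for missing input, which would make the ~ inversion unreliable.

```python
import pandas as pd

# Sample data from the question, plus one missing value added for illustration.
df = pd.DataFrame(
    {"str_name": ["aaabaa", "aabbcb", "baabba", "aacbba", "baccaa", "ababaa", None]}
)

# regex=False treats 'c' as a literal substring rather than a regular expression.
# na=False makes missing values count as "does not contain", so the mask stays boolean.
has_c = df["str_name"].str.contains("c", regex=False, na=False)

# Keep only the rows without 'c'; the row holding None is kept as well.
filtered = df[~has_c]
print(filtered)
```

If rows with missing values should be dropped too, passing na=True instead is one option.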

Edited to add:

If you only want to check the first k characters, you can slice. Suppose k=3:

>>> df.str_name.str.slice(0,3)
0    aaa
1    aab
2    baa
3    aac
4    bac
5    aba
Name: str_name, dtype: object
>>> df.str_name.str.slice(0,3).str.contains('c')
0    False
1    False
2    False
3     True
4     True
5    False
Name: str_name, dtype: bool

Note that Series.str.slice does not behave like a typical Python slice.
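As a rough sketch, the prefix check can be combined with the same mask inversion to drop rows whose first k characters contain 'c'. The variable name k and the use of regex=False are assumptions made for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"str_name": ["aaabaa", "aabbcb", "baabba", "aacbba", "baccaa", "ababaa"]}
)

k = 3  # hypothetical prefix length

# True where one of the first k characters is a literal 'c'.
prefix_has_c = df["str_name"].str.slice(0, k).str.contains("c", regex=False)

# Row 1 ("aabbcb") survives because its only 'c' lies beyond the first k characters.
kept = df[~prefix_has_c]
print(kept)
```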

Answer 2

Answered by piRSquared

You can use NumPy:

df[np.char.find(df.str_name.values.astype(str), 'c') < 0]

  str_name
0   aaabaa
2   baabba
5   ababaa
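A self-contained sketch of the NumPy approach, assuming the public np.char.find function is used for the search. It returns the index of the first occurrence of the substring in each element, or -1 when the substring is absent, so comparing against 0 yields the rows to keep.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"str_name": ["aaabaa", "aabbcb", "baabba", "aacbba", "baccaa", "ababaa"]}
)

# Convert the column to a NumPy array of fixed-width unicode strings.
values = df["str_name"].to_numpy().astype(str)

# Index of the first 'c' in each string, or -1 if there is none.
positions = np.char.find(values, "c")

# A negative position means the string has no 'c', so keep those rows.
result = df[positions < 0]
print(result)
```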

Answer 3

Answered by Vaishali

You can use str.contains()

str_name = pd.Series(['aaabaa', 'aabbcb', 'baabba', 'aacbba',  'baccaa','ababaa'])
str_name.str.contains('c')

This returns a boolean Series with one True/False value per element.

The following returns the element-wise negation of the above:

~str_name.str.contains('c')
