Original question: http://stackoverflow.com/questions/22291565/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas text matching like SQL's LIKE?
Asked by naught101
Is there a way to do something similar to SQL's LIKE syntax on a pandas text DataFrame column, such that it returns a list of indices, or a list of booleans that can be used for indexing the DataFrame? For example, I would like to be able to match all rows where the column starts with 'prefix_', similar to WHERE <col> LIKE 'prefix_%' in SQL.
Answered by Andy Hayden
You can use the Series method str.startswith (which takes a literal string, not a regex):
In [11]: s = pd.Series(['aa', 'ab', 'ca', np.nan])
In [12]: s.str.startswith('a', na=False)
Out[12]:
0     True
1     True
2    False
3    False
dtype: bool
You can also do the same with str.contains (using a regex):
In [13]: s.str.contains('^a', na=False)
Out[13]:
0     True
1     True
2    False
3    False
dtype: bool
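One thing to keep in mind with the regex form: str.contains treats its pattern as a regular expression by default, so characters such as '.' act as metacharacters. Here is a minimal sketch, using a made-up Series named files, of matching a literal prefix either by escaping it with re.escape or by turning regex handling off with regex=False:

```python
import re

import pandas as pd

# Hypothetical data, made up for illustration.
files = pd.Series(['a.b_1', 'axb_2', 'c.d_3'])

# Unescaped: '.' matches any character, so 'axb_2' matches as well.
loose = files.str.contains('^a.b', na=False)

# Escaped: only values starting with a literal 'a.b' match.
strict = files.str.contains('^' + re.escape('a.b'), na=False)

# regex=False treats the pattern as a plain substring (no anchoring possible).
literal = files.str.contains('a.b', regex=False, na=False)
```

If the prefix comes from user input, escaping it (or using str.startswith, which does no regex interpretation) is probably the safer choice.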
So you can do df[col].str.startswith ...
See also the SQL comparison section of the docs.
Note: (as pointed out by OP) by default NaNs propagate, which causes an indexing error if you try to use the result as a boolean mask. Passing na=False tells pandas to map NaN to False. Without it, the result contains NaN:
In [14]: s.str.startswith('a')  # can't use as boolean mask
Out[14]:
0     True
1     True
2    False
3      NaN
dtype: object
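To tie this back to the question, here is a minimal sketch of using the boolean result to select rows, and of getting the matching index labels. The DataFrame and its column names are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame; the column names are assumptions for this example.
df = pd.DataFrame({
    'name': ['prefix_a', 'other_b', 'prefix_c', np.nan],
    'value': [1, 2, 3, 4],
})

# Roughly: WHERE name LIKE 'prefix_%'
# (in SQL the '_' is itself a single-character wildcard; here it is literal)
mask = df['name'].str.startswith('prefix_', na=False)

matches = df[mask]                    # rows where the mask is True
match_index = matches.index.tolist()  # index labels of those rows
```

Because na=False is passed, the mask is a plain boolean Series and can be used directly for indexing.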
Answered by sushmit
You can use str.contains with case=False for a case-insensitive match:
s.str.contains('a', case=False)
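As a small sketch of how this behaves (the Series below is made up for illustration), combining case=False with na=False gives a mask that ignores case and maps missing values to False, roughly like PostgreSQL's ILIKE '%a%':

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration.
s = pd.Series(['Apple', 'banana', 'CHERRY', np.nan])

# Case-insensitive substring match; missing values map to False.
mask = s.str.contains('a', case=False, na=False)
```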
Answered by H Raihan
- To find all the values in the series that start with the pattern "s":
SQL - WHERE column_name LIKE 's%'
Python - column_name.str.startswith('s')
- To find all the values in the series that end with the pattern "s":
SQL - WHERE column_name LIKE '%s'
Python - column_name.str.endswith('s')
- To find all the values in the series that contain the pattern "s":
SQL - WHERE column_name LIKE '%s%'
Python - column_name.str.contains('s')
For more options, check: https://pandas.pydata.org/pandas-docs/stable/reference/series.html
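For LIKE patterns that mix wildcards, one possible approach is to translate the pattern into a regex and pass it to str.contains. The sketch below is not part of pandas: like_to_regex is a hypothetical helper written for this example, and it does not handle LIKE escape characters.

```python
import re

import pandas as pd


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into an anchored regex (hypothetical helper).

    '%' becomes '.*', '_' becomes '.', and every other character is escaped.
    """
    parts = []
    for ch in pattern:
        if ch == '%':
            parts.append('.*')
        elif ch == '_':
            parts.append('.')
        else:
            parts.append(re.escape(ch))
    return '^' + ''.join(parts) + '$'


# Made-up data for illustration.
s = pd.Series(['sun', 'bus', 'miss', 'moon', None])

# re.DOTALL lets '.' also match newlines inside a value.
starts = s.str.contains(like_to_regex('s%'), flags=re.DOTALL, na=False)
ends = s.str.contains(like_to_regex('%s'), flags=re.DOTALL, na=False)
middle = s.str.contains(like_to_regex('%s%'), flags=re.DOTALL, na=False)
single = s.str.contains(like_to_regex('_u_'), flags=re.DOTALL, na=False)
```

For the three simple cases listed above, str.startswith, str.endswith and str.contains are more direct; a translation like this is mainly useful when the LIKE pattern is only known at runtime.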