pandas: Filtering in pandas - how to apply a custom method (lambda)?

Question

Asked by アレックス

I have a DataFrame in which one of the columns contains a string of comma-delimited words.

>>> df['column1']
# ....
996                  str1, str2, str3
997                  str4, str5, str7
998                  str8, str9, str10
# ...........

I need to treat the content of that column as an array of strings so that I can do this:

 [
  # ..... 
  & (df['column1'].isin('str2')) # should return the row #996
  # ....
 ]

I tried this but it hasn't panned out, of course:

 [
  # ..... 
  & (df['column1'].split(',').isin('str2'))
  # ....
 ]

How can I do that? Or rather, how can I use a method (lambda) to modify the content of the column before filtering?

UPDATE1:

This is a part of my code:

for x in pd.read_csv.....
      df_item = x

      if filter1:
        df_item = df_item[(df_item['column1'] == filter1)]

      if filter2:
        df_item = df_item[(df_item['column2'].isin(subjects))]

      # .....

How can I apply df['column2'].apply(lambda x: 'str2' in x.split(',')) to this part:

  if filter2:
    df_item = df_item[(df_item['column2'].isin(subjects))]

Answer 1

Answered by Anand S Kumar

isin checks whether the value from the series is in the iterable (in your case 'str2'), not whether str2 is contained in your series' value.
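
To make the distinction concrete, here is a minimal sketch on made-up data (the Series `s` is an assumption for illustration, not the asker's real column). A list is passed because isin expects a list-like of candidate values.

```python
import pandas as pd

# Hypothetical sample data modeled on the question's column.
s = pd.Series(["str1, str2, str3", "str4, str5, str7", "str2"])

# isin compares each whole cell against the listed values;
# it does not look inside the cell's text.
mask = s.isin(["str2"])

print(mask.tolist())  # [False, False, True]
```

Only the cell that is exactly 'str2' matches; the first row contains str2 as one of several words and is not selected.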

If your series contains strings, one way to get what you want is to use .str.contains() to check whether each string contains str2. Example -

df['column1'].str.contains('str2')
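
A small sketch of how this behaves, using made-up data (the Series `s` and the regular expression are assumptions for illustration). It shows that a plain substring test also matches longer words such as str20, and one possible pattern that only matches str2 as a complete comma-separated item.

```python
import pandas as pd

# Hypothetical sample data; the second row contains "str20", not "str2".
s = pd.Series(["str1, str2, str3", "str20, str21", "str4"])

# Plain substring match: "str20" also contains "str2".
substring_mask = s.str.contains("str2", regex=False)

# One possible regex that requires str2 to be a whole item:
# preceded by start-of-string or a comma, followed by a comma or end-of-string.
# Non-capturing groups are used so pandas does not warn about match groups.
whole_item_mask = s.str.contains(r"(?:^|,)\s*str2\s*(?:,|$)")

print(substring_mask.tolist())   # [True, True, False]
print(whole_item_mask.tolist())  # [True, False, False]
```

If the values can never be substrings of one another, the simpler substring form is enough.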

If you must split the contents on ',' (that is, if str2 can be a substring of any of the other strings), you can use Series.apply. Example -

df['column1'].apply(lambda x: 'str2' in x.split(','))

To apply this, simply use this to filter the DataFrame. Example -

if <somefilter>:
    df_item = df_item[df_item['column2'].apply(lambda x: 'str2' in x.split(','))]
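
One thing to watch for: the sample output in the question shows a space after each comma, so a bare split(',') would produce ' str2' rather than 'str2'. The sketch below strips each item before comparing and also checks against a list of wanted values, as the question does with subjects. The DataFrame contents, the subjects values, and the helper name has_any are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical data modeled on the question's output.
df_item = pd.DataFrame({
    "column1": ["a", "b", "c"],
    "column2": ["str1, str2, str3", "str4, str5, str7", "str8, str9, str10"],
})
subjects = ["str2", "str9"]  # hypothetical filter values

# Without stripping, the items keep their leading space and the test fails.
naive = "str2" in "str1, str2, str3".split(",")  # False


def has_any(cell, wanted):
    """Return True if any comma-separated item in cell is in wanted."""
    items = {item.strip() for item in cell.split(",")}
    return not items.isdisjoint(wanted)


mask = df_item["column2"].apply(lambda cell: has_any(cell, subjects))
filtered = df_item[mask]

print(filtered["column1"].tolist())  # ['a', 'c']
```

The resulting boolean mask can be used inside the existing `if filter2:` branch in the same way as the isin mask it replaces.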
