Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/39768547/

Time: 2020-08-19 22:41:01  Source: igfitidea

Replace whole string if it contains substring in pandas

python pandas

Asked by nicofilliol

I want to replace all strings that contain a specific substring. So for example if I have this dataframe:

import pandas as pd
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'], 
                   'sport': ['tennis', 'football', 'basketball']})

I could replace football with the string 'ball sport' like this:

df.replace({'sport': {'football': 'ball sport'}})

What I want, though, is to replace everything that contains 'ball' (in this case 'football' and 'basketball') with 'ball sport'. Something like this:

df.replace({'sport': {'[strings that contain ball]': 'ball sport'}})

Answered by EdChum

You can use str.contains to mask the rows that contain 'ball' and then overwrite them with the new value:

In [71]:
df.loc[df['sport'].str.contains('ball'), 'sport'] = 'ball sport'
df

Out[71]:
    name       sport
0    Bob      tennis
1   Jane  ball sport
2  Alice  ball sport

To make it case-insensitive, pass case=False:

df.loc[df['sport'].str.contains('ball', case=False), 'sport'] = 'ball sport'
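
One case the snippets above do not cover is missing data: if the column holds NaN, str.contains can return NaN for those rows, and such a mask is typically rejected by .loc. A minimal sketch, assuming a made-up frame with one missing value (the names df_na and mask are illustrative, not from the original answer), passes na=False so that missing rows count as non-matches:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's frame plus one row with a missing sport.
df_na = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice', 'Carol'],
                      'sport': ['tennis', 'Football', 'basketball', np.nan]})

# na=False treats missing values as "does not contain"; case=False ignores case.
mask = df_na['sport'].str.contains('ball', case=False, na=False)

# Overwrite only the matching rows; the missing value is left untouched.
df_na.loc[mask, 'sport'] = 'ball sport'
print(df_na)
```

Whether to leave missing values alone or fill them first depends on the data; this sketch simply leaves them as they are.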

Answered by DeepSpace

You can use apply with a lambda. The x parameter of the lambda function will be each value in the 'sport' column:

df.sport = df.sport.apply(lambda x: 'ball sport' if 'ball' in x else x)
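
The lambda above assumes every value is a string; a NaN (which is a float) would make 'ball' in x raise a TypeError. A hedged sketch with a named helper that passes non-string values through unchanged (to_ball_sport, sports and cleaned are hypothetical names, and the sample data is made up):

```python
import numpy as np
import pandas as pd

def to_ball_sport(value):
    """Return 'ball sport' for strings containing 'ball'; pass anything else through."""
    if isinstance(value, str) and 'ball' in value:
        return 'ball sport'
    return value

# Hypothetical data with one missing entry.
sports = pd.Series(['tennis', 'football', np.nan, 'basketball'])

# apply calls the helper once per value and builds a new Series from the results.
cleaned = sports.apply(to_ball_sport)
print(cleaned)
```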

Answered by piRSquared

You can use str.replace with a regular expression that matches the whole string:

df.sport.str.replace(r'(^.*ball.*$)', 'ball sport', regex=True)

0        tennis
1    ball sport
2    ball sport
Name: sport, dtype: object

Reassign the result to the column with:

df['sport'] = df.sport.str.replace(r'(^.*ball.*$)', 'ball sport', regex=True)
df
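
A regular expression of this kind can also be handed to DataFrame.replace with regex=True, which is close to the nested-dict form asked for in the question. A small sketch, assuming the question's DataFrame (the variable name result is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})

# With regex=True the inner dict key is treated as a pattern. It matches the
# whole value whenever 'ball' occurs in it, so the whole value is replaced.
result = df.replace({'sport': {r'^.*ball.*$': 'ball sport'}}, regex=True)
print(result)
```

replace returns a new DataFrame here, so df itself stays unchanged unless the result is assigned back.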

Answered by Axis

A different way to write the str.contains assignment:

df.loc[df.sport.str.contains('ball'), 'sport'] = 'ball sport'
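
The boolean mask from str.contains can also be passed to Series.mask, which returns a new Series rather than assigning into the frame in place. A sketch, assuming the question's DataFrame (contains_ball is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})

# True for every row whose sport contains the substring 'ball'.
contains_ball = df['sport'].str.contains('ball')

# mask() replaces values where the condition is True and keeps the rest.
df['sport'] = df['sport'].mask(contains_ball, 'ball sport')
print(df)
```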