
Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/33413249/

Date: 2020-09-14 00:07:13  Source: igfitidea

How to remove string value from column in pandas dataframe

Tags: python, regex, pandas, lambda, dataframe

Asked by sequence_hard

I am trying to write some code that splits a string in a dataframe column at each comma (so it becomes a list) and removes a certain string from that list if it is present. After removing the unwanted string, I want to join the list elements again with commas. My dataframe looks like this:

df:

   Column1  Column2
0      a       a,b,c
1      y       b,n,m
2      d       n,n,m
3      d       b,b,x
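
For anyone who wants to reproduce the problem, the sample frame above can be rebuilt as follows. This is only a sketch: the variable name df and the column names are taken from the question, and everything else is plain pandas.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Column1": ["a", "y", "d", "d"],
    "Column2": ["a,b,c", "b,n,m", "n,n,m", "b,b,x"],
})
print(df)
```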

So basically my goal is to remove all b values from Column2 so that I get:

df:

   Column1  Column2
0      a       a,c
1      y       n,m
2      d       n,n,m
3      d       x

The code I have written is the following:

df=df['Column2'].apply(lambda x: x.split(','))

def exclude_b(df):
    for index, liste in df['Column2'].iteritems():
        if 'b' in liste:
            liste.remove('b')
            return liste
        else:
            return liste

The first line splits every value in the column at the commas, turning each one into a list. With the function, I then tried to iterate through all the lists and remove the b if present; if it is not present, the list should be returned as it is. If I print 'liste' at the end, it only returns the first row of Column2, but not the others. What am I doing wrong? And would there be a way to implement my if condition in a lambda function?

Answered by Nader Hisham

You can simply apply the regex b,? which matches every b together with the , that follows it, if one exists, and replace each match with an empty string:

# regex=True is needed on pandas >= 2.0, where str.replace defaults to literal matching
df['Column2'] = df.Column2.str.replace('b,?', '', regex=True)

Out[238]:
Column1 Column2
0   a   a,c
1   y   n,m
2   d   n,n,m
3   d   x
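
As a complement to the regex above, here is a minimal sketch of the split / filter / join approach the question describes. Note that the pattern b,? matches the letter b anywhere, so it would also alter longer tokens (ab,c becomes ac) and leaves a trailing comma when b is the last element (a,b becomes a,). Comparing whole tokens avoids both cases. The helper name drop_token and its unwanted parameter are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column1": ["a", "y", "d", "d"],
    "Column2": ["a,b,c", "b,n,m", "n,n,m", "b,b,x"],
})

def drop_token(cell, unwanted="b"):
    """Split on commas, drop tokens equal to `unwanted`, join again."""
    return ",".join(tok for tok in cell.split(",") if tok != unwanted)

# apply() calls the function once per cell, so no explicit loop is needed.
# In the question's loop, the `return` inside the `for` body ends the
# function on the first iteration, which is why only row 0 came back.
df["Column2"] = df["Column2"].apply(drop_token)

# The same logic written as a lambda, as asked in the question:
# df["Column2"] = df["Column2"].apply(
#     lambda s: ",".join(t for t in s.split(",") if t != "b"))
print(df)
```

Because the comparison is made on whole tokens, a value such as ab,b,bc would keep ab and bc and lose only the standalone b.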