Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original StackOverflow URL: http://stackoverflow.com/questions/40705480/

Date: 2020-08-19 23:52:18  Source: igfitidea

Python pandas: remove everything after a delimiter in a string

Tags: python, python-3.x, pandas

Asked by f0rd42

I have data frames that contain values such as:

"vendor a::ProductA"
"vendor b::ProductA"
"vendor a::Productb"

I need to remove everything after (and including) the '::' so that I end up with:

"vendor a"
"vendor b"
"vendor a"

I tried str.trim (which does not seem to exist) and str.split without success. What would be the easiest way to accomplish this?

Answered by blacksite

You can use pandas.Series.str.split just like you would use split normally. Just split on the string '::', and index the list that's created by the split method:

>>> import pandas as pd
>>> df = pd.DataFrame({'text': ["vendor a::ProductA", "vendor b::ProductA", "vendor a::Productb"]})
>>> df
                 text
0  vendor a::ProductA
1  vendor b::ProductA
2  vendor a::Productb
>>> df['text_new'] = df['text'].str.split('::').str[0]
>>> df
                 text  text_new
0  vendor a::ProductA  vendor a
1  vendor b::ProductA  vendor b
2  vendor a::Productb  vendor a

Here's a non-pandas solution:

>>> df['text_new1'] = [x.split('::')[0] for x in df['text']]
>>> df
                 text  text_new text_new1
0  vendor a::ProductA  vendor a  vendor a
1  vendor b::ProductA  vendor b  vendor b
2  vendor a::Productb  vendor a  vendor a

Edit: Here's a step-by-step explanation of what's happening in the pandas code above:

# Select the pandas.Series object you want
>>> df['text']
0    vendor a::ProductA
1    vendor b::ProductA
2    vendor a::Productb
Name: text, dtype: object

# using pandas.Series.str allows us to implement "normal" string methods 
# (like split) on a Series
>>> df['text'].str
<pandas.core.strings.StringMethods object at 0x110af4e48>

# Now we can use the split method to split on our '::' string. You'll see that
# a Series of lists is returned (just like what you'd see outside of pandas)
>>> df['text'].str.split('::')
0    [vendor a, ProductA]
1    [vendor b, ProductA]
2    [vendor a, Productb]
Name: text, dtype: object

# using the pandas.Series.str method, again, we will be able to index through
# the lists returned in the previous step
>>> df['text'].str.split('::').str
<pandas.core.strings.StringMethods object at 0x110b254a8>

# now we can grab the first item in each list above for our desired output
>>> df['text'].str.split('::').str[0]
0    vendor a
1    vendor b
2    vendor a
Name: text, dtype: object

I would suggest checking out the pandas.Series.str docs, or, better yet, Working with Text Data in pandas.
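
If a value might contain more than one '::', or no delimiter at all, the same idea can be sketched with the n parameter of str.split, which caps the number of splits. The sample rows below are made up for illustration and are not from the question:

```python
import pandas as pd

# Hypothetical sample data: one row with two delimiters, one with none, one missing.
df = pd.DataFrame({"text": ["vendor a::ProductA::v2", "vendor b", None]})

# n=1 stops after the first '::', so each row yields at most two pieces.
# .str[0] then keeps the part before the first delimiter.
df["text_new"] = df["text"].str.split("::", n=1).str[0]

print(df)
```

Rows without a delimiter come back unchanged, and missing values stay missing rather than raising an error.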

Answered by Mohamed AL ANI

You can use str.replace(":", " ") to replace each ":" with a space, which removes the "::". To split, you need to specify the separator you want to split on: str.split(" ")

The trim function is called strip in Python: str.strip()
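
As a small illustration (the sample string is made up), strip only removes leading and trailing whitespace; it does not cut at a delimiter. It is therefore mostly useful for tidying up what is left after a split:

```python
# Hypothetical input with stray spaces around the vendor name.
raw = "  vendor a  ::ProductA"

# Keep the part before the delimiter first...
head = raw.split("::")[0]

# ...then trim the surrounding whitespace.
cleaned = head.strip()

print(repr(head))
print(repr(cleaned))
```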

Also, you can use str[:8] to keep just "vendor x" in your strings, as long as every vendor name has the same length.

Good luck

Answered by Suraj Gowda

If the value is in a specific column (named column) of a data frame (named dataframe), you can also use

dataframe.column.str.replace("(::).*", "", regex=True)

It gives you the result below:

               column new_column
0  vendor a::ProductA   vendor a
1  vendor b::ProductA   vendor b
2  vendor a::Productb   vendor a

With this approach you do not need to specify any position, because it removes '::' and everything after it.
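
A self-contained sketch of this regex approach could look like the following. The names df, column, and new_column are placeholders. regex=True is passed explicitly because recent pandas versions treat the pattern as a literal string by default:

```python
import pandas as pd

# Placeholder frame mirroring the values from the question.
df = pd.DataFrame(
    {"column": ["vendor a::ProductA", "vendor b::ProductA", "vendor a::Productb"]}
)

# '::.*' matches the delimiter and everything after it on the same line;
# replacing the match with '' leaves only the leading part.
df["new_column"] = df["column"].str.replace(r"::.*", "", regex=True)

print(df)
```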

I guess this might come in handy. Good luck!

Answered by Ali tariq

Here is a function for that:

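def do_it(text):
    # Scan for the first '::' and return everything before it.
    # Stopping at len(text) - 1 keeps text[i + 1] within bounds.
    i = 0
    while i < len(text) - 1:
        if text[i] == ':' and text[i + 1] == ':':
            return text[:i]
        i += 1
    # No '::' found: return the string unchanged.
    return text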

Pass the original string in and get back the trimmed string.
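
For comparison, the same result can be sketched with Python's built-in str.partition, which splits at the first occurrence of a separator. The helper name before_delimiter is made up for this example:

```python
def before_delimiter(text, delimiter="::"):
    """Return the part of text before the first delimiter (hypothetical helper)."""
    # partition returns (head, separator, tail); if the delimiter is absent,
    # head is the whole string and the other two parts are empty.
    head, _, _ = text.partition(delimiter)
    return head


samples = ["vendor a::ProductA", "vendor b::ProductA", "no delimiter here"]
trimmed = [before_delimiter(s) for s in samples]
print(trimmed)
```

Unlike a manual index loop, this needs no bounds checks, and strings without the delimiter pass through unchanged.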
