Original source: http://stackoverflow.com/questions/40705480/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python pandas: remove everything after a delimiter in a string
Asked by f0rd42
I have data frames that contain values such as:
"vendor a::ProductA"
"vendor b::ProductA"
"vendor a::Productb"
I need to remove everything after (and including) the :: so that I end up with:
"vendor a"
"vendor b"
"vendor a"
I tried str.trim (which seems not to exist) and str.split without success. What would be the easiest way to accomplish this?
Answered by blacksite
You can use pandas.Series.str.split just like you would use split normally. Just split on the string '::', and index the list that's created from the split method:
>>> import pandas as pd
>>> df = pd.DataFrame({'text': ["vendor a::ProductA", "vendor b::ProductA", "vendor a::Productb"]})
>>> df
                 text
0  vendor a::ProductA
1  vendor b::ProductA
2  vendor a::Productb
>>> df['text_new'] = df['text'].str.split('::').str[0]
>>> df
                 text  text_new
0  vendor a::ProductA  vendor a
1  vendor b::ProductA  vendor b
2  vendor a::Productb  vendor a
Here's a non-pandas solution:
>>> df['text_new1'] = [x.split('::')[0] for x in df['text']]
>>> df
                 text  text_new  text_new1
0  vendor a::ProductA  vendor a   vendor a
1  vendor b::ProductA  vendor b   vendor b
2  vendor a::Productb  vendor a   vendor a
Edit: Here's a step-by-step explanation of what's happening in the pandas solution above:
# Select the pandas.Series object you want
>>> df['text']
0    vendor a::ProductA
1    vendor b::ProductA
2    vendor a::Productb
Name: text, dtype: object
# Using pandas.Series.str allows us to apply "normal" string methods
# (like split) to a Series
>>> df['text'].str
<pandas.core.strings.StringMethods object at 0x110af4e48>
# Now we can use the split method to split on our '::' string. You'll see that
# a Series of lists is returned (just like what you'd see outside of pandas)
>>> df['text'].str.split('::')
0    [vendor a, ProductA]
1    [vendor b, ProductA]
2    [vendor a, Productb]
Name: text, dtype: object
# Using the pandas.Series.str accessor again, we can index into
# the lists returned in the previous step
>>> df['text'].str.split('::').str
<pandas.core.strings.StringMethods object at 0x110b254a8>
# Now we can grab the first item in each list above for our desired output
>>> df['text'].str.split('::').str[0]
0    vendor a
1    vendor b
2    vendor a
Name: text, dtype: object
I would suggest checking out the pandas.Series.str docs, or, better yet, Working with Text Data in pandas.
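As a small follow-up sketch (not part of the original answer), the same split-and-index approach can be checked against rows that contain more than one delimiter, no delimiter at all, or a missing value. The sample values here are invented for illustration, and n=1 is an optional argument that stops splitting after the first '::'.

```python
import pandas as pd

# Hypothetical sample data: a normal row, a row with two delimiters,
# a row without any delimiter, and a missing value.
s = pd.Series(["vendor a::ProductA", "vendor b::ProductA::extra", "no delimiter", None])

# n=1 limits the split to the first '::'; .str[0] keeps the part before it.
first = s.str.split("::", n=1).str[0]

print(first.tolist())
```

Rows without the delimiter come back unchanged, and the missing value stays missing rather than raising an error.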
Answered by Mohamed AL ANI
You can use str.replace(":", " ") to replace each ":" with a space, which removes the "::". To split, you need to specify the string you want to split on: str.split(" ")
The trim function is called strip in Python: str.strip()
Also, you can use str[:8] to get just "vendor x" from your strings.
Good luck
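To make the difference between these calls concrete, here is a minimal plain-Python sketch on a single string. The variable names are only illustrative. Note that slicing by position works only when every prefix has the same length, and replacing the colons leaves the product name in the string.

```python
text = "vendor a::ProductA"

# Replacing each ':' with a space leaves two spaces and keeps the product name.
replaced = text.replace(":", " ")

# strip() removes leading and trailing whitespace only.
stripped = "  vendor a  ".strip()

# Slicing keeps a fixed number of characters (eight here).
prefix = text[:8]

# Splitting on the full '::' delimiter keeps everything before it.
before = text.split("::")[0]

print(repr(replaced), repr(stripped), repr(prefix), repr(before))
```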
Answered by Suraj Gowda
If it is in a specific column (named column) of a data frame (named dataframe), you can also use
dataframe["new_column"] = dataframe.column.str.replace("::.*", "", regex=True)
It gives you the result below:
               column  new_column
0  vendor a::ProductA    vendor a
1  vendor b::ProductA    vendor b
2  vendor a::Productb    vendor a
With this approach you do not need to specify any position, because it removes the '::' and everything after it.
I hope this helps. Good luck!
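A runnable sketch of the regex approach, assuming a data frame with a column literally named column as in the answer. Passing regex=True explicitly is an assumption about the installed pandas version: newer releases treat the pattern as a literal string by default, so without it the call may leave the values unchanged.

```python
import pandas as pd

# Sample frame matching the answer's naming ("column" is the column name).
df = pd.DataFrame({"column": ["vendor a::ProductA", "vendor b::ProductA", "vendor a::Productb"]})

# '::.*' matches the delimiter and everything after it; replace it with nothing.
df["new_column"] = df["column"].str.replace(r"::.*", "", regex=True)

print(df)
```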
Answered by Ali tariq
Here is your function:
def do_it(text):
    # Scan for the first "::" and return everything before it
    index = 0
    while index < len(text) - 1:
        if text[index] == ':' and text[index + 1] == ':':
            return text[:index]
        index += 1
    # No "::" found, so return the string unchanged
    return text
Pass the original string in, and you get the trimmed string back.
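For comparison, the same result can be sketched with the built-in str.partition, which splits a string at the first occurrence of a separator. The helper name strip_after and its delimiter parameter are hypothetical, chosen here for illustration only.

```python
def strip_after(text, delimiter="::"):
    # partition returns (head, separator, tail); if the delimiter is absent,
    # head is the whole string and the other two parts are empty.
    head, _, _ = text.partition(delimiter)
    return head


for sample in ["vendor a::ProductA", "no delimiter", "a:b::c"]:
    print(repr(strip_after(sample)))
```

A single ':' inside the text is left alone, since only the full '::' delimiter triggers the cut.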