How to remove specific substrings from a set of strings in Python?

Question

Asked by controlfreak

I have a set of strings, set1, and all the strings in set1 have two specific substrings that I don't need and want to remove.
Sample Input: set1={'Apple.good','Orange.good','Pear.bad','Pear.good','Banana.bad','Potato.bad'}
So basically I want the .good and .bad substrings removed from all the strings.
What I tried:

for x in set1:
    x.replace('.good','')
    x.replace('.bad','')

But this doesn't seem to work at all. The output is unchanged and identical to the input. I tried using for x in list(set1) instead of the original loop, but that doesn't change anything.

Answer 1

Answered by Reut Sharabani

Strings are immutable. string.replace (Python 2.x) or str.replace (Python 3.x) creates a new string. This is stated in the documentation:

Return a copy of string s with all occurrences of substring old replaced by new. ...

This means you have to reassign the set or repopulate it (reassigning is easier with a set comprehension):

new_set = {x.replace('.good', '').replace('.bad', '') for x in set1}
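
If other variables refer to the same set object, reassigning the name will not update them. The following is a minimal sketch of the "repopulate" alternative, assuming the same set1 as in the question; the names alias and cleaned are only illustrative.

```python
set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}
alias = set1  # a second reference to the same set object

# Build the cleaned values first, then repopulate the existing set in place.
cleaned = {x.replace('.good', '').replace('.bad', '') for x in set1}
set1.clear()
set1.update(cleaned)

print(sorted(set1))   # ['Apple', 'Banana', 'Orange', 'Pear', 'Potato']
print(alias is set1)  # True: both names still point at the updated set
```

Note that 'Pear.bad' and 'Pear.good' both become 'Pear', so the result has five elements instead of six.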

Answer 2

Answered by Alex Hall

>>> x = 'Pear.good'
>>> y = x.replace('.good','')
>>> y
'Pear'
>>> x
'Pear.good'

.replace doesn't change the string; it returns a copy of the string with the replacement applied. You can't change the string directly because strings are immutable.

You need to take the return values from x.replace and put them in a new set.
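
As a rough sketch of that advice, the loop from the question could be rewritten as follows. The names new_set and stripped are assumptions made for illustration.

```python
set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}

new_set = set()
for x in set1:
    # Keep the string returned by each replace call instead of discarding it.
    stripped = x.replace('.good', '').replace('.bad', '')
    new_set.add(stripped)

print(sorted(new_set))  # ['Apple', 'Banana', 'Orange', 'Pear', 'Potato']
```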

Answer 3

Answered by gueeest

All you need is a bit of black magic!

>>> a = ["cherry.bad","pear.good", "apple.good"]
>>> a = list(map(lambda x: x.replace('.good','').replace('.bad',''),a))
>>> a
['cherry', 'pear', 'apple']

Answer 4

Answered by Vivek

You could do this:

import re

set1 = {'Apple.good','Orange.good','Pear.bad','Pear.good','Banana.bad','Potato.bad'}

for x in set1:
    # Strings are immutable, so keep the value that re.sub returns.
    # The $ anchor removes the substring only at the end of the string.
    x = re.sub(r'\.good$', '', x)
    x = re.sub(r'\.bad$', '', x)
    print(x)
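
The anchored approach matters when the substring can also appear in the middle of a string. Here is a hedged sketch that combines both suffixes into one pattern and collects the results in a new set; the string 'my.goodies.bad' is a made-up edge case, and the names suffix, names and new_set are illustrative.

```python
import re

# One anchored pattern: a literal dot, then "good" or "bad", at the end of the string.
suffix = re.compile(r'\.(?:good|bad)$')

names = {'Apple.good', 'Pear.bad', 'my.goodies.bad'}  # last item is a made-up edge case
new_set = {suffix.sub('', x) for x in names}
print(sorted(new_set))  # ['Apple', 'Pear', 'my.goodies']

# For comparison, chained str.replace also removes the '.good' inside 'my.goodies'.
plain = 'my.goodies.bad'.replace('.good', '').replace('.bad', '')
print(plain)  # myies
```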

Answer 5

Answered by user140259

I ran a test (with different data from your example), and the set comprehension does not return the items in order or in full, because a set is unordered and discards duplicates:

>>> ind = ['p5','p1','p8','p4','p2','p8']
>>> newind = {x.replace('p','') for x in ind}
>>> newind
{'1', '2', '8', '5', '4'}

I confirmed that a list comprehension works:

>>> ind = ['p5','p1','p8','p4','p2','p8']
>>> newind = [x.replace('p','') for x in ind]
>>> newind
['5', '1', '8', '4', '2', '8']

or

>>> newind = []
>>> ind = ['p5','p1','p8','p4','p2','p8']
>>> for x in ind:
...     newind.append(x.replace('p',''))
>>> newind
['5', '1', '8', '4', '2', '8']
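
If the duplicates should go but the original order should stay, one possible middle ground is dict.fromkeys. This is a sketch that assumes Python 3.7 or later, where dictionaries are guaranteed to keep insertion order.

```python
ind = ['p5', 'p1', 'p8', 'p4', 'p2', 'p8']

# dict keys are unique and keep insertion order (guaranteed since Python 3.7).
newind = list(dict.fromkeys(x.replace('p', '') for x in ind))
print(newind)  # ['5', '1', '8', '4', '2']
```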

Answer 6

Answered by cs95

When there are multiple substrings to remove, one simple and effective option is to use re.sub with a compiled pattern built by joining all the substrings to remove with the regex OR (|) operator.

import re

to_remove = ['.good', '.bad']
strings = ['Apple.good','Orange.good','Pear.bad']

p = re.compile('|'.join(map(re.escape, to_remove))) # escape to handle metachars
[p.sub('', s) for s in strings]
# ['Apple', 'Orange', 'Pear']
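
One caveat with this pattern: regex alternation tries the alternatives from left to right and takes the first one that matches, not the longest. If one substring is a prefix of another, sorting by length first avoids leftovers. The sketch below uses hypothetical overlapping substrings ('.go' and '.good') to show the difference; naive and longest_first are illustrative names.

```python
import re

to_remove = ['.go', '.good']  # hypothetical overlapping substrings
strings = ['Apple.good', 'Orange.go']

# Joined in the given order, '.go' matches first and leaves 'od' behind.
naive = re.compile('|'.join(map(re.escape, to_remove)))
print([naive.sub('', s) for s in strings])  # ['Appleod', 'Orange']

# Putting longer substrings first lets the regex prefer the longest match.
longest_first = sorted(to_remove, key=len, reverse=True)
p = re.compile('|'.join(map(re.escape, longest_first)))
print([p.sub('', s) for s in strings])  # ['Apple', 'Orange']
```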

Answer 7

Answered by rsc05

If you have a list

I was working with a list of strings. If you want to remove all lines that contain a certain substring, you can do this:

import re

def RemoveInList(sub, LinSplitUnOr):
    # Keep only the lines in which the pattern sub does not occur.
    return [line for line in LinSplitUnOr if not re.search(sub, line)]

where sub is a pattern that you do not want to appear in the list of lines LinSplitUnOr.

For example:

A=['Apple.good','Orange.good','Pear.bad','Pear.good','Banana.bad','Potato.bad']
sub = 'good'
A=RemoveInList(sub,A)

Then A will be ['Pear.bad', 'Banana.bad', 'Potato.bad'].
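
Because re.search treats sub as a regular expression, a pattern such as '.good' would match any character followed by "good". When the substring is meant literally, the in operator may be the simpler choice. The sketch below shows that variant; the function name remove_containing is hypothetical.

```python
def remove_containing(sub, lines):
    """Return the lines that do not contain the literal substring sub."""
    return [line for line in lines if sub not in line]

A = ['Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad']
print(remove_containing('.good', A))  # ['Pear.bad', 'Banana.bad', 'Potato.bad']
```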
