Python: Remove empty strings from a list of strings
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/3845423/
Remove empty strings from a list of strings
Asked by zerodx
I want to remove all empty strings from a list of strings in Python.
My idea looks like this:
while '' in str_list:
    str_list.remove('')
Is there a more Pythonic way to do this?
Accepted answer by livibetter
I would use filter:
str_list = filter(None, str_list)
str_list = filter(bool, str_list)
str_list = filter(len, str_list)
str_list = filter(lambda item: item, str_list)
In Python 3, filter returns an iterator, so the result should be wrapped in a call to list():
str_list = list(filter(None, str_list))
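A minimal sketch of the Python 3 behaviour described above (the sample list is only illustrative): the object returned by filter is lazy and can be consumed only once, which is why the list() call matters if you need to reuse the result.

```python
str_list = ["first", "", "second", ""]

lazy = filter(None, str_list)   # Python 3: a lazy filter object, not a list
first_pass = list(lazy)         # consuming it yields the non-empty strings
second_pass = list(lazy)        # the iterator is exhausted by now

print(first_pass)   # ['first', 'second']
print(second_pass)  # []
```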
Answer by Ib33X
Using a list comprehension is the most Pythonic way:
>>> strings = ["first", "", "second"]
>>> [x for x in strings if x]
['first', 'second']
If the list must be modified in-place, because there are other references which must see the updated data, then use a slice assignment:
strings[:] = [x for x in strings if x]
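To illustrate why the slice assignment matters, here is a small sketch (the names alias, rebound and other are made up for the example) contrasting it with a plain assignment when a second reference to the list exists:

```python
strings = ["first", "", "second"]
alias = strings                          # a second reference to the same list object

strings[:] = [x for x in strings if x]   # slice assignment mutates the list in place
print(alias)                             # ['first', 'second']

rebound = ["first", "", "second"]
other = rebound
rebound = [x for x in rebound if x]      # plain assignment only rebinds the name
print(other)                             # ['first', '', 'second']
```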
Answer by Andrew Jaffe
Depending on the size of your list, it may be most efficient to use list.remove() rather than create a new list:
l = ["1", "", "3", ""]
while True:
    try:
        l.remove("")
    except ValueError:
        break
This has the advantage of not creating a new list, but the disadvantage of having to search from the beginning each time. Unlike the while '' in l loop proposed above, though, it searches only once per occurrence of ''. (There is certainly a way to keep the best of both methods, but it is more complicated.)
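One possible way to "keep the best of both methods" is a single-pass, in-place compaction. This is only a sketch, not something the answer itself proposes, and remove_empty_in_place is a hypothetical helper name:

```python
def remove_empty_in_place(items):
    """Drop '' from items in one pass without building a second list."""
    write = 0                      # next slot to fill with a kept element
    for item in items:             # the read position never falls behind write
        if item != "":
            items[write] = item
            write += 1
    del items[write:]              # cut off the leftover tail


l = ["1", "", "3", ""]
remove_empty_in_place(l)
print(l)  # ['1', '3']
```

The list is scanned once and no second list is allocated, at the price of a few more lines than the remove() loop.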
Answer by Ivo van der Wijk
filter actually has a special option for this:
filter(None, sequence)
It will filter out all elements that evaluate to False. There is no need to pass an actual callable such as bool or len here.
It is as fast as map(bool, ...).
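As a quick illustration (the mixed list is invented for the example), filter(None, ...) drops every falsy element, not just empty strings, and gives the same result as passing bool explicitly. Note that a whitespace-only string is truthy and is kept:

```python
mixed = ["a", "", None, 0, "b", [], " "]

with_none = list(filter(None, mixed))   # None means "keep truthy elements"
with_bool = list(filter(bool, mixed))   # same result with an explicit callable

print(with_none)  # ['a', 'b', ' ']
print(with_bool)  # ['a', 'b', ' ']
```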
Answer by Aamir Mushtaq
Use filter:
newlist = filter(lambda x: len(x) > 0, oldlist)
The drawback of using filter, as pointed out, is that it is slower than the alternatives; also, lambda is usually costly.
Or you can go for the simplest and most iterative approach of all:
# Assuming listtext is the original list containing (possibly) empty items
newlist = []
for item in listtext:
    if item:
        newlist.append(str(item))
# You can drop str() depending on the content of your original list
This is the most intuitive of the methods and runs in decent time.
Answer by thiruvenkadam
Instead of if x, I would use if x != '' in order to eliminate only empty strings. Like this:
str_list = [x for x in str_list if x != '']
This preserves None values in your list. Also, if your list contains integers and 0 is among them, it is preserved as well.
For example,
>>> str_list = [None, '', 0, "Hi", '', "Hello"]
>>> [x for x in str_list if x != '']
[None, 0, 'Hi', 'Hello']
Answer by Aziz Alto
>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']
>>> ' '.join(lstr).split()
['hello', 'world']
>>> filter(None, lstr)
['hello', ' ', 'world', ' ']
Timing comparison:
>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
4.226747989654541
>>> timeit('filter(None, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.0278358459472656
Notice that filter(None, lstr) does not remove whitespace-only strings such as ' '; it only prunes '', while ' '.join(lstr).split() removes both.
Using filter() with whitespace-only strings removed as well takes a lot more time:
>>> timeit('filter(None, [l.replace(" ", "") for l in lstr])', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
18.101892948150635
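The REPL output above shows filter returning a list, which suggests these timings were taken on Python 2. On Python 3 the bare filter(...) call only builds a lazy iterator, so a comparable measurement probably needs list(...) around it. A sketch of how one might check this (the variable names and the smaller number of runs are arbitrary choices, and no timings are claimed here):

```python
from timeit import timeit

setup = "lstr = ['hello', '', ' ', 'world', ' ']"

# Python 3: this measures only the creation of the filter object.
lazy_time = timeit("filter(None, lstr)", setup, number=100000)

# Wrapping in list() forces the filtering to actually happen.
eager_time = timeit("list(filter(None, lstr))", setup, number=100000)

print(lazy_time, eager_time)
```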
Answer by ssi-anik
The reply from @Ib33X is great. If you also want to remove strings that are empty after stripping, you need to use the strip method too. Otherwise, whitespace-only strings are kept; for example, " " passes the test in that answer. This can be achieved with:
strings = ["first", "", "second ", " "]
[x.strip() for x in strings if x.strip()]
The answer for this will be ["first", "second"].
If you want to use the filter method instead, you can write
list(filter(lambda item: item.strip(), strings)). This keeps the same elements but does not strip them, so the result is ["first", "second "].
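A quick check of the two expressions above, together with a third variant (my own addition, not part of the answer) that strips each string only once by combining map with a comprehension:

```python
strings = ["first", "", "second ", " "]

stripped = [x.strip() for x in strings if x.strip()]      # strips kept elements
kept = list(filter(lambda item: item.strip(), strings))   # tests only, no stripping
once = [s for s in map(str.strip, strings) if s]          # strip once, then test

print(stripped)  # ['first', 'second']
print(kept)      # ['first', 'second ']
print(once)      # ['first', 'second']
```

The filter version uses strip() only as a test, so surviving strings keep their original whitespace.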
Answer by Paolo Melchiorre
As reported by Aziz Alto, filter(None, lstr) does not remove whitespace-only strings such as ' ', but if you are sure lstr contains only strings you can use filter(str.strip, lstr):
>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']
>>> ' '.join(lstr).split()
['hello', 'world']
>>> filter(str.strip, lstr)
['hello', 'world']
Timing comparison on my PC:
>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.356455087661743
>>> timeit('filter(str.strip, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
5.276503801345825
The fastest solution to remove both '' and whitespace-only strings such as ' ' remains ' '.join(lstr).split().
As reported in a comment, the situation is different if your strings contain spaces.
>>> lstr = ['hello', '', ' ', 'world', ' ', 'see you']
>>> lstr
['hello', '', ' ', 'world', ' ', 'see you']
>>> ' '.join(lstr).split()
['hello', 'world', 'see', 'you']
>>> filter(str.strip, lstr)
['hello', 'world', 'see you']
You can see that filter(str.strip, lstr) preserves strings that contain spaces, whereas ' '.join(lstr).split() splits them.
Answer by ankostis
Summing up the best answers:
1. Eliminate empties WITHOUT stripping:
That is, all-space strings are retained:
slist = list(filter(None, slist))
PROs:
- simplest;
- fastest (see benchmarks below).
2. To eliminate empties after stripping ...
2.a ... when strings do NOT contain spaces between words:
slist = ' '.join(slist).split()
PROs:
- small code
- fast (BUT not fastest with big datasets due to memory, contrary to @paolo-melchiorre's results)
2.b ... when strings contain spaces between words:
slist = list(filter(str.strip, slist))
PROs:
- fastest;
- understandability of the code.
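The options summarised above could be wrapped in one small helper. This is only a sketch: remove_empty and its strip parameter are hypothetical names, not part of any answer. As with option 2.b, strip is used only as a test, so surviving strings are returned unchanged.

```python
def remove_empty(slist, strip=False):
    """Return a new list without empty strings.

    With strip=True, whitespace-only strings are dropped as well;
    the strings that survive are not themselves stripped.
    """
    predicate = str.strip if strip else None
    return list(filter(predicate, slist))


sample = ["hello", "", " ", "world", "see you"]
print(remove_empty(sample))              # ['hello', ' ', 'world', 'see you']
print(remove_empty(sample, strip=True))  # ['hello', 'world', 'see you']
```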
Benchmarks on a 2018 machine:
## Build test-data
#
import random, string
nwords = 10000
maxlen = 30
null_ratio = 0.1
rnd = random.Random(0)  # deterministic results
words = [' ' * rnd.randint(0, maxlen)
         if rnd.random() > (1 - null_ratio)
         else
         ''.join(rnd.choices(string.ascii_letters, k=rnd.randint(0, maxlen)))
         for _i in range(nwords)
        ]
## Test functions
#
def nostrip_filter(slist):
    return list(filter(None, slist))
def nostrip_comprehension(slist):
    return [s for s in slist if s]
def strip_filter(slist):
    return list(filter(str.strip, slist))
def strip_filter_map(slist):
    return list(filter(None, map(str.strip, slist)))
def strip_filter_comprehension(slist):  # waste memory
    return list(filter(None, [s.strip() for s in slist]))
def strip_filter_generator(slist):
    return list(filter(None, (s.strip() for s in slist)))
def strip_join_split(slist):  # words without(!) spaces
    return ' '.join(slist).split()
## Benchmarks
#
%timeit nostrip_filter(words)
142 μs ± 16.8 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit nostrip_comprehension(words)
263 μs ± 19.1 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit strip_filter(words)
653 μs ± 37.5 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit strip_filter_map(words)
642 μs ± 36 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit strip_filter_comprehension(words)
693 μs ± 42.2 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit strip_filter_generator(words)
750 μs ± 28.6 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit strip_join_split(words)
796 μs ± 103 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

