Python: Filtering a list based on a list of booleans
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/18665873/
Asked by Gabriel
I have a list of values which I need to filter given the values in a list of booleans:
list_a = [1, 2, 4, 6]
filter = [True, False, True, False]
I generate a new filtered list with the following line:
filtered_list = [i for indx,i in enumerate(list_a) if filter[indx] == True]
which results in:
print(filtered_list)
[1, 4]
The line works but looks (to me) like a bit of overkill, and I was wondering if there is a simpler way to achieve the same result.
Advice
Summary of two good pieces of advice given in the answers below:
1- Don't name a list filter like I did, because it is a built-in function.
2- Don't compare things to True like I did with if filter[idx]==True, since it's unnecessary. Just using if filter[idx] is enough.
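Applying both pieces of advice to the original line might look like the sketch below. The name mask is an assumption, chosen here only to avoid shadowing the built-in filter; any other non-built-in name would do.

```python
list_a = [1, 2, 4, 6]
mask = [True, False, True, False]  # hypothetical name; avoids shadowing the built-in filter()

# Rely on truthiness instead of comparing each entry to True
filtered_list = [item for idx, item in enumerate(list_a) if mask[idx]]
print(filtered_list)  # [1, 4]
```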
Accepted answer by Ashwini Chaudhary
You're looking for itertools.compress:
>>> from itertools import compress
>>> list_a = [1, 2, 4, 6]
>>> fil = [True, False, True, False]
>>> list(compress(list_a, fil))
[1, 4]
Timing comparisons (py3.x):
>>> list_a = [1, 2, 4, 6]
>>> fil = [True, False, True, False]
>>> %timeit list(compress(list_a, fil))
100000 loops, best of 3: 2.58 us per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v] #winner
100000 loops, best of 3: 1.98 us per loop
>>> list_a = [1, 2, 4, 6]*100
>>> fil = [True, False, True, False]*100
>>> %timeit list(compress(list_a, fil)) #winner
10000 loops, best of 3: 24.3 us per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v]
10000 loops, best of 3: 82 us per loop
>>> list_a = [1, 2, 4, 6]*10000
>>> fil = [True, False, True, False]*10000
>>> %timeit list(compress(list_a, fil)) #winner
1000 loops, best of 3: 1.66 ms per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v]
100 loops, best of 3: 7.65 ms per loop
Don't use filter as a variable name; it is a built-in function.
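The %timeit lines above are IPython magic and will not run in a plain Python script. A rough way to repeat the comparison with the standard timeit module is sketched below; the wrapper function names and number=1000 are assumptions made for illustration, and the timings will differ by machine and Python version.

```python
import timeit
from itertools import compress

list_a = [1, 2, 4, 6] * 100
fil = [True, False, True, False] * 100

def with_compress():
    return list(compress(list_a, fil))

def with_zip():
    return [i for (i, v) in zip(list_a, fil) if v]

# Both approaches must agree before their timings are worth comparing
assert with_compress() == with_zip()

# number=1000 is an arbitrary choice for this sketch
t_compress = timeit.timeit(with_compress, number=1000)
t_zip = timeit.timeit(with_zip, number=1000)
print("compress: %.4f s, zip: %.4f s" % (t_compress, t_zip))
```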
Answer by Bas Swinckels
Like so:
filtered_list = [i for (i, v) in zip(list_a, filter) if v]
Using zip is the pythonic way to iterate over multiple sequences in parallel, without needing any indexing. This assumes both sequences have the same length (zip stops after the shortest runs out). Using itertools for such a simple case is a bit overkill ...
One thing you do in your example that you should really stop doing is comparing things to True; this is usually not necessary. Instead of if filter[idx]==True: ..., you can simply write if filter[idx]: ...
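Because zip stops at the shorter input, a mask of the wrong length is silently accepted. One way to make that visible is an explicit length check, sketched here; filter_by_mask is a hypothetical helper name and is not part of the answer.

```python
def filter_by_mask(values, mask):
    """Keep the items of values whose matching mask entry is truthy.

    Hypothetical helper for illustration only.
    """
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return [v for v, keep in zip(values, mask) if keep]

print(filter_by_mask([1, 2, 4, 6], [True, False, True, False]))  # [1, 4]

# Plain zip silently drops the unmatched tail: 4 and 6 are never examined
print([v for v, keep in zip([1, 2, 4, 6], [True, False]) if keep])  # [1]
```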
Answer by Alex Szatmary
To do this using numpy, i.e., if you have an array, a, instead of list_a:
import numpy as np
a = np.array([1, 2, 4, 6])
my_filter = np.array([True, False, True, False], dtype=bool)
a[my_filter]
# array([1, 4])
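The explicit dtype=bool matters when the mask is not already made of booleans. The sketch below, with variable names chosen for illustration, shows how numpy reads the same 0/1 values as a boolean mask in one case and as integer positions in the other.

```python
import numpy as np

a = np.array([1, 2, 4, 6])

bool_mask = np.array([1, 0, 1, 0], dtype=bool)  # boolean mask: keeps positions 0 and 2
int_index = np.array([1, 0, 1, 0])              # integer array: read as positions to pick

masked = a[bool_mask]
picked = a[int_index]
print(masked)  # [1 4]
print(picked)  # [2 1 2 1]
```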
Answer by Hammer
With numpy:
In [128]: list_a = np.array([1, 2, 4, 6])
In [129]: filter = np.array([True, False, True, False])
In [130]: list_a[filter]
Out[130]: array([1, 4])
Or see Alex Szatmary's answer if list_a can be a numpy array but not filter.
Numpy usually gives you a big speed boost as well:
In [133]: list_a = [1, 2, 4, 6]*10000
In [134]: fil = [True, False, True, False]*10000
In [135]: list_a_np = np.array(list_a)
In [136]: fil_np = np.array(fil)
In [139]: %timeit list(itertools.compress(list_a, fil))
1000 loops, best of 3: 625 us per loop
In [140]: %timeit list_a_np[fil_np]
10000 loops, best of 3: 173 us per loop
Answer by Daniel Braun
filtered_list = [list_a[i] for i in range(len(list_a)) if filter[i]]
Answer by Franklin'j Gil'z
With python 3 you can use list_a[filter] to get the values where the filter is True, provided list_a and filter are numpy arrays (plain lists do not support this kind of indexing). To get the values where it is False, use list_a[~filter].
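A short sketch of the difference between plain lists and numpy arrays for this kind of indexing follows. The names mask, arr and mask_np are assumptions for illustration; the ~ operator inverts a boolean numpy array element by element.

```python
import numpy as np

list_a = [1, 2, 4, 6]
mask = [True, False, True, False]  # hypothetical name; avoids shadowing filter()

# Plain lists do not accept a list of booleans as an index
try:
    list_a[mask]
except TypeError as exc:
    print("plain list:", exc)

arr = np.array(list_a)
mask_np = np.array(mask, dtype=bool)

kept = arr[mask_np].tolist()      # values where the mask is True
dropped = arr[~mask_np].tolist()  # values where the mask is False
print(kept)     # [1, 4]
print(dropped)  # [2, 6]
```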