Original question: http://stackoverflow.com/questions/22108488/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Are list-comprehensions and functional functions faster than "for loops"?
Asked by Ericson Willians
In terms of performance in Python, is a list comprehension, or functions like map(), filter() and reduce(), faster than a for loop? Why, technically speaking, do they run at C speed, while the for loop runs at the speed of the Python virtual machine?
Suppose that in a game I'm developing I need to draw complex and huge maps using for loops. This question would definitely be relevant, for if a list comprehension, for example, is indeed faster, it would be a much better option in order to avoid lag (despite the visual complexity of the code).
Accepted answer by Anthony Kong
The following are rough guidelines and educated guesses based on experience. You should timeit or profile your concrete use case to get hard numbers, and those numbers may occasionally disagree with the below.
A list comprehension is usually a tiny bit faster than the precisely equivalent for loop (one that actually builds a list), most likely because it doesn't have to look up the list and its append method on every iteration. However, a list comprehension still does a bytecode-level loop:
>>> dis.dis(<the code object for `[x for x in range(10)]`>)
  1           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (x)
             12 LOAD_FAST                1 (x)
             15 LIST_APPEND              2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE
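To see the difference concretely, here is a minimal timeit sketch (an illustration added here, not from the original answer) comparing the comprehension against the precisely equivalent append loop; the exact numbers will vary by machine and Python version:

import timeit

def with_loop():
    result = []
    for x in range(10):
        result.append(x)   # list and .append looked up every iteration
    return result

def with_comprehension():
    return [x for x in range(10)]

# each call runs the function 1,000,000 times and reports total seconds
print(timeit.timeit(with_loop))
print(timeit.timeit(with_comprehension))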
Using a list comprehension in place of a loop that doesn't build a list, nonsensically accumulating a list of meaningless values and then throwing the list away, is often slower because of the overhead of creating and extending the list. List comprehensions aren't magic that is inherently faster than a good old loop.
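For example (an illustration added here, not from the original answer), a comprehension used purely for side effects builds and discards a useless list:

# anti-pattern: builds and throws away [None, None, None] just to print
[print(x) for x in range(3)]

# the plain loop does the same work without allocating anything
for x in range(3):
    print(x)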
As for functional list processing functions: while these are written in C and probably outperform equivalent functions written in Python, they are not necessarily the fastest option. Some speed-up is expected if the function is written in C too. But in most cases using a lambda (or another Python function), the overhead of repeatedly setting up Python stack frames etc. eats up any savings. Simply doing the same work in-line, without function calls (e.g. a list comprehension instead of map or filter), is often slightly faster.
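A small sketch of that trade-off (added for illustration; assumes CPython, where str is implemented in C):

nums = list(range(1000))

# map with a C-level callable: the whole loop runs in C and can win
strs = list(map(str, nums))

# map with a lambda: one Python-level call per element eats the savings
doubled = list(map(lambda x: x * 2, nums))

# inlining the work in a comprehension avoids the per-element call
doubled = [x * 2 for x in nums]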
Suppose that in a game I'm developing I need to draw complex and huge maps using for loops. This question would definitely be relevant, for if a list comprehension, for example, is indeed faster, it would be a much better option in order to avoid lag (despite the visual complexity of the code).
Chances are, if code like this isn't already fast enough when written in good non-"optimized" Python, no amount of Python-level micro-optimization is going to make it fast enough, and you should start thinking about dropping to C. While extensive micro-optimizations can often speed up Python code considerably, there is a low (in absolute terms) limit to this. Moreover, even before you hit that ceiling, it becomes simply more cost-efficient (15% speed-up vs. 300% speed-up for the same effort) to bite the bullet and write some C.
Answer by Anthony Kong
If you check the info on python.org, you can see this summary:
Version                   Time (seconds)
Basic loop                3.47
Eliminate dots            2.45
Local variable & no dots  1.79
Using map function        0.54
But you really should read the above article in detail to understand the cause of the performance difference.
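For reference, the rows in the table correspond roughly to variants like the following (a rough reconstruction of the article's string-uppercasing example; the exact code in the article may differ):

def basic_loop(oldlist):
    newlist = []
    for word in oldlist:
        # attribute lookups (newlist.append, word.upper) happen every iteration
        newlist.append(word.upper())
    return newlist

def local_variables_no_dots(oldlist):
    newlist = []
    append = newlist.append   # hoist the lookups out of the loop
    upper = str.upper
    for word in oldlist:
        append(upper(word))
    return newlist

def using_map(oldlist):
    # the loop itself runs entirely in C
    return list(map(str.upper, oldlist))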
I also strongly suggest you time your code using timeit. At the end of the day, there can be a situation where, for example, you may need to break out of a for loop when a condition is met. That could potentially be faster than finding out the result by calling map.
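For instance (a sketch added here for illustration), a for loop, or a generator fed to any(), can stop at the first match, whereas mapping over the whole list always touches every element:

nums = [3, 7, -2, 9, 5]

def first_negative_index(values):
    for i, n in enumerate(values):
        if n < 0:
            return i   # bail out as soon as the condition is met
    return -1

print(first_negative_index(nums))   # 2
print(any(n < 0 for n in nums))     # any() short-circuits the same way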
Answer by andreipmbcn
You ask specifically about map(), filter() and reduce(), but I assume you want to know about functional programming in general. Having tested this myself on the problem of computing distances between all points within a set of points, functional programming (using the starmap function from the built-in itertools module) turned out to be slightly slower than for loops (taking 1.25 times as long, in fact). Here is the sample code I used:
import itertools, time, math, random

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

point_set = (Point(0, 0), Point(0, 1), Point(0, 2), Point(0, 3))
n_points = 100
pick_val = lambda: 10 * random.random() - 5
large_set = [Point(pick_val(), pick_val()) for _ in range(n_points)]

# the distance function
f_dist = lambda x0, x1, y0, y1: math.sqrt((x0 - x1) ** 2 + (y0 - y1) ** 2)
# go through each point, get its distance from all remaining points
f_pos = lambda p1, p2: (p1.x, p2.x, p1.y, p2.y)
extract_dists = lambda x: itertools.starmap(
    f_dist, itertools.starmap(f_pos, itertools.combinations(x, 2)))

print('Distances:', list(extract_dists(point_set)))

t0_f = time.time()
list(extract_dists(large_set))
dt_f = time.time() - t0_f
Is the functional version faster than the procedural version?
def extract_dists_procedural(pts):
    n_pts = len(pts)
    l = []
    for k_p1 in range(n_pts - 1):
        # start at k_p1 + 1 so each pair is visited exactly once,
        # matching itertools.combinations(x, 2) in the functional version
        for k_p2 in range(k_p1 + 1, n_pts):
            # apply math.sqrt too, so both versions do the same work
            l.append(math.sqrt((pts[k_p1].x - pts[k_p2].x) ** 2 +
                               (pts[k_p1].y - pts[k_p2].y) ** 2))
    return l

t0_p = time.time()
list(extract_dists_procedural(large_set))
# using list() on the assumption that
# it eats up as much time as in the functional version
dt_p = time.time() - t0_p

f_vs_p = dt_p / dt_f
if f_vs_p >= 1.0:
    print('Time benefit of functional programming:', f_vs_p,
          'times as fast for', n_points, 'points')
else:
    print('Time penalty of functional programming:', 1 / f_vs_p,
          'times as slow for', n_points, 'points')
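One caveat worth adding (not part of the original answer): time.time() is a wall clock with limited resolution, so for micro-benchmarks like this the monotonic high-resolution clock from the standard library is usually a better choice:

from time import perf_counter   # available since Python 3.3

t0 = perf_counter()
list(extract_dists(large_set))
dt = perf_counter() - t0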
Answer by alphiii
I wrote a simple script to test the speed, and this is what I found out. Actually, the for loop was fastest in my case. That really surprised me; check it out below (I was calculating the sum of squares).
from functools import reduce
import datetime

def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print(datetime.datetime.now() - start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum + next ** 2, numbers, 0)

def square_sum2(numbers):
    a = 0
    for i in numbers:
        i = i ** 2
        a += i
    return a

def square_sum3(numbers):
    sqrt = lambda x: x ** 2   # (misnamed: this squares, it doesn't take a root)
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return sum([int(i) ** 2 for i in numbers])

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
0:00:00.302000 #Reduce
0:00:00.144000 #For loop
0:00:00.318000 #Map
0:00:00.390000 #List comprehension
Answer by jjmerelo
Adding a twist to @alphiii's answer: actually, the for loop would be second best, and about 6 times slower than map.
from functools import reduce
import datetime

def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print(datetime.datetime.now() - start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum + next ** 2, numbers, 0)

def square_sum2(numbers):
    a = 0
    for i in numbers:
        a += i ** 2
    return a

def square_sum3(numbers):
    a = 0
    # NB: in Python 3, map() is lazy and this iterator is never consumed,
    # so the lambda never runs; even if it did, it only computes a + x**2
    # without changing a. This version returns 0 having done no real work,
    # which is why it appears so fast in the timings below.
    map(lambda x: a + x ** 2, numbers)
    return a

def square_sum4(numbers):
    a = 0
    # returns the list of squares (each offset by a == 0), not their sum
    return [a + i ** 2 for i in numbers]

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
The main changes have been to eliminate the slow sum calls, as well as the probably unnecessary int() in the last case. Putting the for loop and map in the same terms makes it quite fast, actually. Remember that lambdas are functional concepts and theoretically shouldn't have side effects, but, well, they can have side effects, like adding to a. (Note, though, as the comments in the code above point out, that map is lazy in Python 3: the mapped lambda never actually runs here, so the map variant's apparent speed comes from doing no work at all.)

Results in this case, with Python 3.6.1, Ubuntu 14.04, Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz:
0:00:00.257703 #Reduce
0:00:00.184898 #For loop
0:00:00.031718 #Map
0:00:00.212699 #List comprehension
Answer by Alisca Chen
I have managed to modify some of @alpiii's code and discovered that the list comprehension is a little faster than the for loop. The difference might have been caused by int() in the earlier code, which made the comparison between the list comprehension and the for loop unfair.
from functools import reduce
import datetime

def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print(datetime.datetime.now() - start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum + next * next, numbers, 0)

def square_sum2(numbers):
    a = []
    for i in numbers:
        a.append(i * i)   # i * i, not i * 2, so this really sums squares
    a = sum(a)
    return a

def square_sum3(numbers):
    sqrt = lambda x: x * x
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return sum([i * i for i in numbers])

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
0:00:00.101122 #Reduce
0:00:00.089216 #For loop
0:00:00.101532 #Map
0:00:00.068916 #List comprehension
Answer by tjysdsg
I modified @Alisa's code and used cProfile to show why the list comprehension is faster:
from functools import reduce
import datetime

def reduce_(numbers):
    return reduce(lambda sum, next: sum + next * next, numbers, 0)

def for_loop(numbers):
    a = []
    for i in numbers:
        a.append(i * i)   # i * i, not i * 2, to keep all variants comparable
    a = sum(a)
    return a

def map_(numbers):
    sqrt = lambda x: x * x
    return sum(map(sqrt, numbers))

def list_comp(numbers):
    return sum([i * i for i in numbers])

funcs = [
    reduce_,
    for_loop,
    map_,
    list_comp,
]

if __name__ == "__main__":
    import cProfile
    for f in funcs:
        print('=' * 25)
        print("Profiling:", f.__name__)
        print('=' * 25)
        pr = cProfile.Profile()
        for i in range(10 ** 6):
            pr.runcall(f, [1, 2, 5, 3, 1, 2, 5, 3])
        pr.create_stats()
        pr.print_stats()
Here are the results:
=========================
Profiling: reduce_
=========================
11000000 function calls in 1.501 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 0.162 0.000 1.473 0.000 profiling.py:4(reduce_)
8000000 0.461 0.000 0.461 0.000 profiling.py:5(<lambda>)
1000000 0.850 0.000 1.311 0.000 {built-in method _functools.reduce}
1000000 0.028 0.000 0.028 0.000 {method 'disable' of '_lsprof.Profiler' objects}
=========================
Profiling: for_loop
=========================
11000000 function calls in 1.372 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 0.879 0.000 1.344 0.000 profiling.py:7(for_loop)
1000000 0.145 0.000 0.145 0.000 {built-in method builtins.sum}
8000000 0.320 0.000 0.320 0.000 {method 'append' of 'list' objects}
1000000 0.027 0.000 0.027 0.000 {method 'disable' of '_lsprof.Profiler' objects}
=========================
Profiling: map_
=========================
11000000 function calls in 1.470 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 0.264 0.000 1.442 0.000 profiling.py:14(map_)
8000000 0.387 0.000 0.387 0.000 profiling.py:15(<lambda>)
1000000 0.791 0.000 1.178 0.000 {built-in method builtins.sum}
1000000 0.028 0.000 0.028 0.000 {method 'disable' of '_lsprof.Profiler' objects}
=========================
Profiling: list_comp
=========================
4000000 function calls in 0.737 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 0.318 0.000 0.709 0.000 profiling.py:18(list_comp)
1000000 0.261 0.000 0.261 0.000 profiling.py:19(<listcomp>)
1000000 0.131 0.000 0.131 0.000 {built-in method builtins.sum}
1000000 0.027 0.000 0.027 0.000 {method 'disable' of '_lsprof.Profiler' objects}
IMHO:
- reduce and map are in general pretty slow. Not only that, using sum on the iterator that map returns is slow, compared to summing a list
- for_loop uses append, which is of course slow to some extent
- list comprehension not only spent the least time building the list, it also makes sum much quicker, in contrast to map

