Python 从 cProfile 获得有意义的结果

Question

提问by Bryce Thomas

I have a Python script in a file which takes just over 30 seconds to run. I am trying to profile it as I would like to cut down this time dramatically.

我在一个文件中有一个 Python 脚本，它只需要 30 多秒就可以运行。我正在尝试对其进行概要分析，因为我想大大减少这个时间。

I am trying to profile the script using cProfile, but essentially all it seems to be telling me is that yes, the main script took a long time to run, but doesn't give the kind of breakdown I was expecting. At the terminal, I type something like:

我正在尝试使用来分析脚本cProfile，但基本上它似乎告诉我的是，是的，主脚本需要很长时间才能运行，但没有给出我期望的那种故障。在终端，我输入如下内容：

cat my_script_input.txt | python -m cProfile -s time my_script.py

The results I get are:

我得到的结果是：

<my_script_output>
             683121 function calls (682169 primitive calls) in 32.133 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   31.980   31.980   32.133   32.133 my_script.py:18(<module>)
   121089    0.050    0.000    0.050    0.000 {method 'split' of 'str' objects}
   121090    0.038    0.000    0.049    0.000 fileinput.py:243(next)
        2    0.027    0.014    0.036    0.018 {method 'sort' of 'list' objects}
   121089    0.009    0.000    0.009    0.000 {method 'strip' of 'str' objects}
   201534    0.009    0.000    0.009    0.000 {method 'append' of 'list' objects}
   100858    0.009    0.000    0.009    0.000 my_script.py:51(<lambda>)
      952    0.008    0.000    0.008    0.000 {method 'readlines' of 'file' objects}
 1904/952    0.003    0.000    0.011    0.000 fileinput.py:292(readline)
    14412    0.001    0.000    0.001    0.000 {method 'add' of 'set' objects}
      182    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
        1    0.000    0.000    0.000    0.000 fileinput.py:80(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 fileinput.py:184(FileInput)
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This doesn't seem to be telling me anything useful. The vast majority of the time is simply listed as:

这似乎没有告诉我任何有用的信息。绝大多数时间都被简单地列为：

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   31.980   31.980   32.133   32.133 my_script.py:18(<module>)

In my_script.py, Line 18 is nothing more than the closing """of the file's header block comment, so it's not that there is a whole load of work concentrated in Line 18. The script as a whole is mostly made up of line-based processing with mostly some string splitting, sorting and set work, so I was expecting to find the majority of time going to one or more of these activities. As it stands, seeing all the time grouped in cProfile's results as occurring on a comment line doesn't make any sense or at least does not shed any light on what is actually consuming all the time.

中my_script.py，第18行无非"""就是文件头块注释的关闭，所以并不是说整个负载的工作都集中在第18行。整个脚本大部分是由基于行的处理组成的，大部分是一些字符串拆分、排序和设置工作，所以我希望能找到大部分时间用于这些活动中的一项或多项。就目前而言，将 cProfile 结果中的所有时间分组为出现在注释行上没有任何意义，或者至少没有说明实际一直在消耗什么。

EDIT:I've constructed a minimum working example similar to my above case to demonstrate the same behavior:

编辑：我已经构建了一个与上述案例类似的最小工作示例来演示相同的行为：

mwe.py

文件

import fileinput

for line in fileinput.input():
    for i in range(10):
        y = int(line.strip()) + int(line.strip())

And call it with:

并调用它：

perl -e 'for(1..1000000){print "$_\n"}' | python -m cProfile -s time mwe.py

To get the result:

要得到结果：

         22002536 function calls (22001694 primitive calls) in 9.433 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    8.004    8.004    9.433    9.433 mwe.py:1(<module>)
 20000000    1.021    0.000    1.021    0.000 {method 'strip' of 'str' objects}
  1000001    0.270    0.000    0.301    0.000 fileinput.py:243(next)
  1000000    0.107    0.000    0.107    0.000 {range}
      842    0.024    0.000    0.024    0.000 {method 'readlines' of 'file' objects}
 1684/842    0.007    0.000    0.032    0.000 fileinput.py:292(readline)
        1    0.000    0.000    0.000    0.000 fileinput.py:80(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        1    0.000    0.000    0.000    0.000 fileinput.py:184(FileInput)
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Am I using cProfile incorrectly somehow?

我是否以某种方式错误地使用了 cProfile？

Answer 1

采纳答案by abarnert

As I mentioned in a comment, when you can't get cProfileto work externally, you can often use it internally instead. It's not that hard.

正如我在评论中提到的，当您无法cProfile在外部工作时，您通常可以在内部使用它。这并不难。

For example, when I run with -m cProfilein my Python 2.7, I get effectively the same results you did. But when I manually instrument your example program:

例如，当我-m cProfile在 Python 2.7 中运行时，我实际上得到了与您相同的结果。但是当我手动检测您的示例程序时：

import fileinput
import cProfile

pr = cProfile.Profile()
pr.enable()
for line in fileinput.input():
    for i in range(10):
        y = int(line.strip()) + int(line.strip())
pr.disable()
pr.print_stats(sort='time')

… here's what I get:

......这是我得到的：

         22002533 function calls (22001691 primitive calls) in 3.352 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 20000000    2.326    0.000    2.326    0.000 {method 'strip' of 'str' objects}
  1000001    0.646    0.000    0.700    0.000 fileinput.py:243(next)
  1000000    0.325    0.000    0.325    0.000 {range}
      842    0.042    0.000    0.042    0.000 {method 'readlines' of 'file' objects}
 1684/842    0.013    0.000    0.055    0.000 fileinput.py:292(readline)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

That's a lot more useful: It tells you what you probably already expected, that more than half your time is spent calling str.strip().

这更有用：它告诉你你可能已经预料到的事情，你有一半以上的时间花在调用str.strip().

Also, note that if you can't edit the file containing code you wish to profile (mwe.py), you can always do this:

另外，请注意，如果您无法编辑包含要分析的代码的文件 ( mwe.py)，您可以随时执行以下操作：

import cProfile
pr = cProfile.Profile()
pr.enable()
import mwe
pr.disable()
pr.print_stats(sort='time')

Even that doesn't always work. If your program calls exit(), for example, you'll have to use a try:/finally:wrapper and/or an atexit. And it it calls os._exit(), or segfaults, you're probably completely hosed. But that isn't very common.

即使那样也不总是奏效。exit()例如，如果您的程序调用，则必须使用try:/finally:包装器和/或atexit. 它调用os._exit()，或段错误，您可能已经完全受不了了。但这并不常见。

However, something I discovered later: If you move all code out of the global scope, -m cProfileseems to work, at least for this case. For example:

但是，我后来发现了一些东西：如果您将所有代码移出全局范围，-m cProfile似乎可以工作，至少在这种情况下是这样。例如：

import fileinput

def f():
    for line in fileinput.input():
        for i in range(10):
            y = int(line.strip()) + int(line.strip())
f()

Now the output from -m cProfileincludes, among other things:

现在的输出-m cProfile包括：

2000000 4.819 0.000 4.819 0.000 :0(strip) 100001 0.288 0.000 0.295 0.000 fileinput.py:243(next)

I have no idea why this also made it twice as slow… or maybe that's just a cache effect; it's been a few minutes since I last ran it, and I've done lots of web browsing in between. But that's not important, what's important is that most of the time is getting charged to reasonable places.

我不知道为什么这也让它慢了两倍……或者这可能只是缓存效果；距离我上次运行它已经过去了几分钟，我在这之间做了很多网页浏览。但这并不重要，重要的是大部分时间都在合理的地方收费。

But if I change this to move the outer loop to the global level, and only its body into a function, most of the time disappears again.

但是如果我改变这个，把外循环移到全局层面，只有它的body变成一个函数，大部分时间又消失了。

Another alternative, which I wouldn't suggest except as a last resort…

另一种选择，我不建议除非作为最后的手段......

I notice that if I use profile instead of cProfile, it works both internally and externally, charging time to the right calls. However, those calls are also about 5x slower. And there seems to be an additional 10 seconds of constant overhead (which gets charged to import profileif used internally, whatever's on line 1 if used externally). So, to find out that splitis using 70% of my time, instead of waiting 4 seconds and doing 2.326 / 3.352, I have to wait 27 seconds, and do 10.93 / (26.34 - 10.01). Not much fun…

我注意到，如果我使用 profile 而不是 cProfile，它可以在内部和外部工作，为正确的呼叫收取时间。但是，这些调用也慢了大约 5 倍。并且似乎还有额外的 10 秒恒定开销（import profile如果在内部使用，则将收取费用，如果在外部使用，则第 1 行中的任何内容）。因此，要找出它split占用了我 70% 的时间，而不是等待 4 秒并执行 2.326 / 3.352，我必须等待 27 秒，然后执行 10.93 / (26.34 - 10.01)。没什么好玩的……

One last thing: I get the same results with a CPython 3.4 dev build—correct results when used internally, everything charged to the first line of code when used externally. But PyPy 2.2/2.7.3 and PyPy3 2.1b1/3.2.3 both seem to give me correct results with -m cProfile. This may just mean that PyPy's cProfileis faked on top of profilebecause the pure-Python code is fast enough.

最后一件事：我使用 CPython 3.4 dev build 得到了相同的结果——在内部使用时结果正确，在外部使用时所有内容都记入第一行代码。但是 PyPy 2.2/2.7.3 和 PyPy3 2.1b1/3.2.3 似乎都给了我正确的结果-m cProfile。这可能只是意味着 PyPycProfile是伪造的，profile因为纯 Python 代码足够快。

Anyway, if someone can figure out/explain why -m cProfileisn't working, that would be great… but otherwise, this is usually a perfectly good workaround.

无论如何，如果有人能弄清楚/解释为什么-m cProfile不起作用，那就太好了……但除此之外，这通常是一个非常好的解决方法。

Python 从 cProfile 获得有意义的结果

提问by Bryce Thomas

采纳答案by abarnert

相关推荐

最近更新

标签

Python 从 cProfile 获得有意义的结果

提问by Bryce Thomas

采纳答案by abarnert

相关推荐

Python中的多行注释

Python 使用 PIL 在图像上添加文本

python：扩展的ASCII码

Python 获取 <generator 对象 <genexpr>

相关推荐

最近更新

标签