Why shouldn't I use PyPy over CPython if PyPy is 6.3 times faster?
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/18946662/
Asked by chhantyal
I've been hearing a lot about the PyPy project. They claim on their site that it is 6.3 times faster than the CPython interpreter.
Whenever we talk about dynamic languages like Python, speed is one of the top issues. To solve this, they say PyPy is 6.3 times faster.
The second issue is parallelism, the infamous Global Interpreter Lock (GIL). For this, PyPy says it can give GIL-less Python.
If PyPy can solve these great challenges, what are its weaknesses that are preventing wider adoption? That is to say, what's preventing someone like me, a typical Python developer, from switching to PyPy right now?
Accepted answer by Veedrac
NOTE: PyPy is more mature and better supported now than it was in 2013, when this question was asked. Avoid drawing conclusions from out-of-date information.
- PyPy, as others have been quick to mention, has tenuous support for C extensions. It has support, but typically at slower-than-Python speeds and it's iffy at best. Hence a lot of modules simply require CPython. PyPy now supports numpy (an earlier version of this answer said it did not). Some extensions are still not supported (Pandas, SciPy, etc.); take a look at the list of supported packages before making the change.
- Python 3 support has just reached stable (an earlier version of this answer called it experimental)! As of 20th June 2014, PyPy3 2.3.1 - Fulcrum is out!
- PyPy sometimes isn't actually faster for "scripts", which a lot of people use Python for. These are the short-running programs that do something simple and small. Because PyPy is a JIT compiler, its main advantages come from long run times and simple types (such as numbers). Frankly, PyPy's pre-JIT speeds are pretty bad compared to CPython.
- Inertia. Moving to PyPy often requires retooling, which for some people and organizations is simply too much work.
Those are the main reasons that affect me, I'd say.
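The point about short-running scripts can be checked on your own machine. The following is a minimal sketch, not a benchmark from the answer: `sum_of_squares` and `time_batches` are hypothetical names, and the workload sizes are arbitrary. It times several identical batches of calls, so that any warm-up cost on a JIT-based interpreter would show up as a slower first batch.

```python
import time


def sum_of_squares(n):
    """A small numeric loop, the kind of code a JIT tends to handle well."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_batches(batches=5, calls_per_batch=200, n=1000):
    """Return the wall-clock time (in seconds) of each batch of identical calls."""
    timings = []
    for _ in range(batches):
        start = time.perf_counter()
        for _ in range(calls_per_batch):
            sum_of_squares(n)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Run the same file under both interpreters and compare the per-batch times.
    for index, seconds in enumerate(time_batches()):
        print(f"batch {index}: {seconds * 1000:.2f} ms")
```

If the first batch is noticeably slower than the later ones under PyPy but not under CPython, a script that exits after the equivalent of one batch is unlikely to see much of the advertised speedup.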
Answer by Tritium21
Because PyPy is not 100% compatible, takes 8 GB of RAM to compile, is a moving target, and is highly experimental, whereas CPython is stable, has been the default target for module builders for two decades (including C extensions that don't work on PyPy), and is already widely deployed.
PyPy will likely never be the reference implementation, but it is a good tool to have.
Answer by BrenBarn
The second question is easier to answer: you basically can use PyPy as a drop-in replacement if all your code is pure Python. However, many widely used libraries (including some of the standard library) are written in C and compiled as Python extensions. Some of these can be made to work with PyPy, some can't. PyPy provides the same "forward-facing" tool as Python (that is, it is Python), but its innards are different, so tools that interface with those innards won't work.
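One common way to cope with C extensions that may be missing on a given interpreter is to fall back to a pure-Python module. The sketch below is an illustration only: `interpreter_name` and `load_json_backend` are hypothetical helpers, and `ujson` is used merely as an example of an optional third-party C extension that may or may not be installed.

```python
import importlib
import json
import platform


def interpreter_name():
    """Return the implementation name, for example 'CPython' or 'PyPy'."""
    return platform.python_implementation()


def load_json_backend(preferred="ujson"):
    """Try an optional C-accelerated module, else use the standard library.

    The module named by `preferred` is an assumption for illustration;
    any module exposing dumps/loads would work the same way.
    """
    try:
        return importlib.import_module(preferred)
    except ImportError:
        return json


if __name__ == "__main__":
    backend = load_json_backend()
    print(interpreter_name(), "is using", backend.__name__)
    print(backend.dumps({"answer": 42}))
```

Code written this way keeps running when the extension is unavailable, at the cost of maintaining and testing two code paths.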
As for the first question, I imagine it is sort of a Catch-22 with the first: PyPy has been evolving rapidly in an effort to improve speed and enhance interoperability with other code. This has made it more experimental than official.
I think it's possible that if PyPy gets into a stable state, it may start getting more widely used. I also think it would be great for Python to move away from its C underpinnings. But it won't happen for a while. PyPy hasn't yet reached the critical mass where it is almost useful enough on its own to do everything you'd want, which would motivate people to fill in the gaps.
Answer by Eric Urban
I did a small benchmark on this topic. While many of the other posters have made good points about compatibility, my experience has been that PyPy isn't that much faster for just moving around bits. For many uses of Python, it really only exists to translate bits between two or more services. For example, not many web applications are performing CPU intensive analysis of datasets. Instead, they take some bytes from a client, store them in some sort of database, and later return them to other clients. Sometimes the format of the data is changed.
The BDFL and the CPython developers are a remarkably intelligent group of people and have managed to help CPython perform excellently in such a scenario. Here's a shameless blog plug: http://www.hydrogen18.com/blog/unpickling-buffers.html. I'm using Stackless, which is derived from CPython and retains the full C module interface. I didn't find any advantage to using PyPy in that case.
Answer by spookylukey
That site does not claim PyPy is 6.3 times faster than CPython. To quote:
The geometric average of all benchmarks is 0.16 or 6.3 times faster than CPython
This is a very different statement to the blanket statement you made, and when you understand the difference, you'll understand at least one set of reasons why you can't just say "use PyPy". It might sound like I'm nit-picking, but understanding why these two statements are totally different is vital.
To break that down:
- The statement they make only applies to the benchmarks they've used. It says absolutely nothing about your program (unless your program is exactly the same as one of their benchmarks).
- The statement is about an average of a group of benchmarks. There is no claim that running PyPy will give a 6.3 times improvement even for the programs they have tested.
- There is no claim that PyPy will even run all the programs that CPython runs at all, let alone faster.
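To see why an average says little about any single program, here is a small worked sketch. The three ratios below are made up for illustration (they are not PyPy's published results); each is PyPy time divided by CPython time, so values below 1 mean PyPy is faster.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive numbers, computed via the mean of logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical PyPy/CPython time ratios for three imaginary benchmarks.
ratios = [0.01, 0.16, 2.56]

mean_ratio = geometric_mean(ratios)

if __name__ == "__main__":
    print(f"geometric mean ratio: {mean_ratio:.2f}")
    print(f"average speedup: {1 / mean_ratio:.2f}x")
    for ratio in ratios:
        verdict = "faster" if ratio < 1 else "slower"
        factor = 1 / ratio if ratio < 1 else ratio
        print(f"ratio {ratio}: PyPy is {factor:.2f}x {verdict}")
```

With these made-up numbers the geometric mean is 0.16, the same headline figure quoted above, even though one of the three imaginary benchmarks runs 2.56 times slower on PyPy. Note also that 1 / 0.16 is 6.25, so the quoted 6.3 presumably comes from an unrounded mean.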
Answer by pts
CPython has reference counting and garbage collection, PyPy has garbage collection only.
So objects tend to be deleted earlier and __del__ is called in a more predictable way in CPython. Some software relies on this behavior and is therefore not ready to migrate to PyPy.
Some other software works with both, but uses less memory with CPython, because unused objects are freed earlier. (I don't have any measurements to indicate how significant this is and what other implementation details affect the memory use.)
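The difference can be illustrated with a sketch. `Resource`, `fragile`, and `portable` are hypothetical names invented for this example. The first function relies on `__del__` running as soon as the last reference disappears, which is how CPython's reference counting behaves; the second releases the resource explicitly with a context manager and so does not depend on when the garbage collector runs.

```python
class Resource:
    """Hypothetical resource that records in a log when it is released."""

    def __init__(self, log):
        self.log = log
        self.closed = False

    def close(self):
        # Idempotent, so a later __del__ call does not log a second time.
        if not self.closed:
            self.closed = True
            self.log.append("closed")

    def __del__(self):
        self.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


def fragile(log):
    """Relies on the finalizer running immediately after the object is dropped."""
    Resource(log)
    # With reference counting the log already holds "closed" here;
    # with a tracing garbage collector it may still be empty.
    return list(log)


def portable(log):
    """Releases the resource explicitly, independent of the garbage collector."""
    with Resource(log):
        pass
    return list(log)


if __name__ == "__main__":
    print("fragile: ", fragile([]))
    print("portable:", portable([]))
```

Code in the style of `portable` behaves the same on both interpreters, which is why explicit `close()` calls or `with` blocks are the usual advice when porting.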
Answer by Stephan Eggermont
For a lot of projects, there is actually 0% difference between the different Pythons in terms of speed. That is the case for projects that are dominated by engineering time and where all Pythons have the same amount of library support.
Answer by Stephan Eggermont
Q: If PyPy can solve these great challenges (speed, memory consumption, parallelism) in comparison to CPython, what are its weaknesses that are preventing wider adoption?
A: First, there is little evidence that the PyPy team can solve the speed problem in general. Long-term evidence shows that PyPy runs certain Python code slower than CPython, and this drawback seems to be rooted very deeply in PyPy.
Secondly, the current version of PyPy consumes much more memory than CPython in a rather large set of cases. So PyPy hasn't solved the memory consumption problem yet.
Whether PyPy solves the mentioned great challenges and will in general be faster, less memory hungry, and more friendly to parallelism than CPython is an open question that cannot be settled in the short term. Some people are betting that PyPy will never be able to offer a general solution enabling it to dominate CPython 2.7 and 3.3 in all cases.
If PyPy succeeds in being better than CPython in general, which is questionable, the main weakness affecting its wider adoption will be its compatibility with CPython. There are also issues such as the fact that CPython runs on a wider range of CPUs and OSes, but these are much less important than PyPy's performance and CPython-compatibility goals.
Q: Why can't I use PyPy as a drop-in replacement for CPython now?
A: PyPy isn't 100% compatible with CPython because it isn't simulating CPython under the hood. Some programs may still depend on CPython's unique features that are absent in PyPy, such as C bindings, C implementations of Python objects and methods, or the incremental nature of CPython's garbage collector.
Answer by Yishen Chen
To make this simple: PyPy provides the speed that CPython lacks but sacrifices compatibility. Most people, however, choose Python for its flexibility and its "batteries-included" nature (high compatibility), not for its speed (though speed is still welcome).
Answer by lifolofi
I've found examples where PyPy is slower than CPython, but only on Windows.
C:\Users\User>python -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 294 msec per loop
C:\Users\User>pypy -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 1.33 sec per loop
So, if you are considering PyPy, forget Windows. On Linux, you can achieve awesome accelerations. Example (list all primes between 1 and 1,000,000):
from sympy import sieve
primes = list(sieve.primerange(1, 10**6))
This runs 10(!) times faster on PyPy than on CPython. But not on Windows, where it is only 3x as fast.