Original question: http://stackoverflow.com/questions/29903320/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Why is my computation so much faster in C# than Python
Asked by ssd
Below is a simple process coded in C# and Python respectively (for those of you curious about the process, it's the solution to Problem No. 5 of Project Euler).
My question is: the C# code below takes only 9 seconds to run, while the Python code takes 283 seconds to complete (to be exact, 283 seconds on Python 3.4.3 64-bit and 329 seconds on Python 2.7.9 32-bit).
So far, I've coded similar processes in both C# and Python, and the execution time differences were comparable. This time, however, there is an extreme difference between the elapsed times.
I think some part of this difference arises from the flexible variable types of the Python language (I suspect Python converts some of the variables into double), but a difference this large is still hard to explain.
What am I doing wrong?
My system: Windows 7, 64-bit
C# - VS Express 2012 (9 seconds)
Python 3.4.3 64 bits (283 seconds)
Python 2.7.9 32 bits (329 seconds)
C# code:
using System;

namespace bug_vcs {
    class Program {
        public static void Main(string[] args) {
            DateTime t0 = DateTime.Now;
            int maxNumber = 20;
            bool found = false;
            long start = maxNumber;
            while (!found) {
                found = true;
                int i = 2;
                while ((i < maxNumber + 1) && found) {
                    if (start % i != 0) {
                        found = false;
                    }
                    i++;
                }
                start++;
            }
            Console.WriteLine("{0:d}", start - 1);
            Console.WriteLine("time elapsed = {0:f} sec.", (DateTime.Now - t0).Seconds);
            Console.ReadLine();
        }
    }
}
And the Python code:
from datetime import datetime

t0 = datetime.now()
max_number = 20
found = False
start = max_number
while not found:
    found = True
    i = 2
    while ((i < max_number + 1) and found):
        if (start % i) != 0:
            found = False
        i += 1
    start += 1

print("number {0:d}\n".format(start - 1))
print("time elapsed = {0:f} sec.\n".format((datetime.now() - t0).seconds))
Accepted answer by Blixt
The answer is simply that Python deals with objects for everything and that it doesn't have a JIT by default. So rather than being very efficient by modifying a few bytes on the stack and optimizing the hot parts of the code (i.e., the iteration), Python chugs along with rich objects representing numbers and no on-the-fly optimizations.
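To make the "objects for everything" point concrete, here is a small sketch, assuming CPython (the standard interpreter). The byte count it prints depends on the interpreter version and platform, so no value is shown. It also illustrates that the remainder of two integers stays an integer and is not converted to double, which addresses the suspicion raised in the question.

```python
import sys

n = 232792560  # the answer to the problem, as an ordinary Python int

# Every int is a full heap object carrying a type pointer and a reference
# count, so it occupies more than the 8 bytes a C# long needs.
size_in_bytes = sys.getsizeof(n)
print("bytes used by one int object:", size_in_bytes)

# The remainder of two ints is again an int, never a float.
remainder = n % 7
print(type(remainder).__name__, remainder)
```

Each pass through the inner loop creates and discards objects like these, which is work a JIT-compiled C# loop does not have to do.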
If you tried this in a variant of Python that has a JIT (for example, PyPy), I guarantee that you'll see a massive difference.
A general tip is to avoid standard Python for very computationally expensive operations (especially if this is for a backend serving requests from multiple clients). Java, C#, JavaScript, etc. with a JIT are incomparably more efficient.
By the way, if you want to write your example in a more Pythonic manner, you could do it like this:
from datetime import datetime

start_time = datetime.now()
max_number = 20
x = max_number
while True:
    i = 2
    while i <= max_number:
        if x % i: break
        i += 1
    else:
        # the inner loop finished without break: x is divisible by 2...20
        break
    x += 1

print('number: %d' % x)
print('time elapsed: %d seconds' % (datetime.now() - start_time).seconds)
The above executed in 90 seconds for me. The reason it's faster comes down to seemingly stupid things like x being shorter than start, that I'm not assigning variables as often, and that I'm relying on Python's own control structures rather than variable checking to jump in/out of loops.
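The control structure relied on here is the `else` clause of a loop, which runs only when the loop finishes without hitting `break`. A minimal sketch of the same idea with a `for` loop follows; the function name `smallest_multiple` is made up for illustration, and a small limit is used so that it finishes instantly.

```python
def smallest_multiple(max_number):
    """Smallest positive integer divisible by every number in 2..max_number."""
    x = max_number
    while True:
        for i in range(2, max_number + 1):
            if x % i:
                break      # x fails for this divisor: move on to the next x
        else:
            return x       # the loop never hit break: x passed every divisor
        x += 1


print(smallest_multiple(10))  # 2520
```

No `found` flag is assigned or tested on each iteration, which is the kind of saving the answer describes.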
Answer by Mark
Try a Python JIT implementation such as PyPy or Numba, or use Cython, if you want C-like speed at the cost of a bit of code readability.
E.g. in PyPy:
# PyPy
number 232792560
time elapsed = 4.000000 sec.
E.g. in Cython:
# Cython
number 232792560
time elapsed = 1.000000 sec.
Cython source:
from datetime import datetime

cpdef void run():
    t0 = datetime.now()
    cdef int max_number = 20
    found = False
    cdef int start = max_number
    cdef int i
    while not found:
        found = True
        i = 2
        while ((i < max_number + 1) and found):
            if (start % i) != 0:
                found = False
            i += 1
        start += 1
    print("number {0:d}\n".format(start - 1))
    print("time elapsed = {0:f} sec.\n".format((datetime.now() - t0).seconds))
Answer by Solonotix
TL;DR: This is a long-winded post in which I try to defend Python (my language of choice) against C#. In this example, C# performs better but still takes more lines of code to do the same amount of work; when both are coded correctly, C# ends up ~5x faster than a similar approach in Python. The end result is that you should use the language that suits you.
When I ran the C# example, it took about 3 seconds to complete on my machine and gave me a result of 232,792,560. It can be optimized using the known fact that a number can only be divisible by every number from 1 to 20 if it is a multiple of 20, so you don't need to increment by 1, but by 20. That single optimization made the code execute ~10x faster, in a mere 353 milliseconds.
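The effect of stepping by the largest divisor can be sketched in Python as follows. The helper `search` and its `step` parameter are made up for illustration, and a limit of 10 is used so that both variants finish instantly.

```python
def search(max_number, step):
    """Brute-force search; returns (answer, number of candidates tested)."""
    candidate = max_number
    tested = 1
    # any() is true while at least one divisor leaves a non-zero remainder
    while any(candidate % i for i in range(2, max_number + 1)):
        candidate += step
        tested += 1
    return candidate, tested


slow = search(10, step=1)    # tests 10, 11, 12, ...
fast = search(10, step=10)   # tests 10, 20, 30, ...
print(slow, fast)
```

Both calls find 2520, but the stepped search tests about a tenth as many candidates, which matches the roughly 10x speedup reported for the C# version.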
When I ran the Python example, I gave up on waiting and tried to write my own version using itertools, which didn't have much better success and was taking about as long as your example. Then I hit upon an acceptable itertools version by taking into account that only multiples of my largest number can be divisible by all numbers from smallest to largest. The refined Python (3.6) code follows, with a decorator timing function that prints the number of seconds it took to execute:
import time
from itertools import count, filterfalse

def timer(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        res = func(*args, **kwargs)
        print(time.time() - start)
        return res
    return wrapper

@timer
def test(stop):
    return next(filterfalse(lambda x: any(x%i for i in range(2, stop)), count(stop, stop)))

print("Test Function")
print(test(20))
# 11.526668787002563
# 232792560
This also reminded me of a question I recently had to answer on CodeFights: computing the least common multiple using the greatest common divisor (gcd) function in Python. That code is as follows:
import time
from math import gcd  # math.gcd exists since Python 3.5; fractions.gcd was removed in 3.9
from functools import reduce

def timer(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        res = func(*args, **kwargs)
        print(time.time() - start)
        return res
    return wrapper

@timer
def leastCommonDenominator(denominators):
    # lcm(a, b) = a * b // gcd(a, b), folded over the whole sequence
    return reduce(lambda a, b: a * b // gcd(a, b), denominators)

print("LCM Function")
print(leastCommonDenominator(range(1, 21)))
# 0.001001596450805664
# 232792560
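As a cross-check, here is a sketch assuming Python 3.9 or later, where the standard library gained `math.lcm`. The prime factorization in the comment is a worked verification added here, not part of the original answer.

```python
import math

# math.lcm accepts any number of integer arguments (Python 3.9+).
result = math.lcm(*range(1, 21))

# Worked check: the LCM of 1..20 is the product of the highest prime powers
# not exceeding 20, i.e. 2^4 * 3^2 * 5 * 7 * 11 * 13 * 17 * 19.
by_hand = 2**4 * 3**2 * 5 * 7 * 11 * 13 * 17 * 19

print(result, by_hand)  # 232792560 232792560
```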
As in most programming tasks, the simplest approach isn't always the fastest. Unfortunately, that really stuck out when attempted in Python this time. That said, the beauty of Python is how simple it is to get a performant execution: where it took 10 lines of C#, I was able to return the correct answer in (potentially) a one-line lambda expression, 300 times faster than my simple optimization in C#. I'm no specialist in C#, but here is the code I used to implement the same approach, along with its result (about 5x faster than Python):
using System;
using System.Diagnostics;

namespace ConsoleApp1
{
    class Program
    {
        public static void Main(string[] args)
        {
            Stopwatch t0 = new Stopwatch();
            int maxNumber = 20;
            long start;

            t0.Start();
            start = Orig(maxNumber);
            t0.Stop();
            Console.WriteLine("Original | {0:d}, {1:d}", maxNumber, start);
            // Original | 20, 232792560
            Console.WriteLine("Original | time elapsed = {0}.", t0.Elapsed);
            // Original | time elapsed = 00:00:02.0585575

            t0.Restart();
            start = Test(maxNumber);
            t0.Stop();
            Console.WriteLine("Test | {0:d}, {1:d}", maxNumber, start);
            // Test | 20, 232792560
            Console.WriteLine("Test | time elapsed = {0}.", t0.Elapsed);
            // Test | time elapsed = 00:00:00.0002763

            Console.ReadLine();
        }

        public static long Orig(int maxNumber)
        {
            bool found = false;
            long start = 0;
            while (!found)
            {
                start += maxNumber;
                found = true;
                for (int i=2; i < 21; i++)
                {
                    if (start % i != 0)
                        found = false;
                }
            }
            return start;
        }

        public static long Test(int maxNumber)
        {
            long result = 1;
            for (long i = 2; i <= maxNumber; i++)
            {
                result = (result * i) / GCD(result, i);
            }
            return result;
        }

        public static long GCD(long a, long b)
        {
            while (b != 0)
            {
                long c = b;
                b = a % b;
                a = c;
            }
            return a;
        }
    }
}
For most higher-level tasks, however, I usually see Python doing exceptionally well in comparison to a .NET implementation, though I cannot substantiate the claim at this time, aside from saying that the Python Requests library has given me as much as a twofold to threefold improvement in performance compared to a C# WebRequest written the same way. This was also true when writing Selenium processes: I could read text elements in Python in 100 milliseconds or less, but each element retrieval took C# more than 1 second to return. That said, I actually prefer the C# implementation because of its object-oriented approach, whereas Python's Selenium implementation goes functional, which gets very hard to read at times.
Answer by Ralph B.
Python (like all scripting languages, including MATLAB) is not intended to be used directly for large-scale numerical calculation. To get performance comparable to compiled programs, avoid loops at all costs and convert the formula to matrix form (this takes a little mathematical understanding and skill), so that as much work as possible is pushed to the underlying C libraries provided by numpy, scipy, etc.
Again, DO NOT write loops for numerical calculation in Python whenever a matrix equivalent is possible!
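One way this advice could be applied to the problem at hand is sketched below, assuming NumPy is installed. The function name `smallest_multiple_vectorized` and its `chunk` parameter are made up for illustration; this is one possible matrix formulation, not something the answer itself provides.

```python
import numpy as np


def smallest_multiple_vectorized(max_number, chunk=100_000):
    """Test `chunk` candidates at a time instead of looping in Python."""
    divisors = np.arange(2, max_number + 1, dtype=np.int64)
    start = max_number
    while True:
        # One chunk of candidates: start, start + max_number, ...
        candidates = start + max_number * np.arange(chunk, dtype=np.int64)
        # Broadcasting builds a (chunk, len(divisors)) table of remainders.
        remainders = candidates[:, None] % divisors[None, :]
        passes = ~remainders.any(axis=1)   # True where every remainder is 0
        if passes.any():
            # argmax returns the index of the first True entry
            return int(candidates[passes.argmax()])
        start += chunk * max_number


print(smallest_multiple_vectorized(10))  # 2520
```

The only Python-level loop left runs once per chunk; the per-candidate remainder checks happen inside NumPy's compiled code.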
Answer by Dariusz Knocinski
First of all, you need to change the algorithm used to solve this problem:
#!/usr/bin/env python
import sys
from timeit import default_timer as timer

pyver = sys.version_info
print(">")
print("> Smallest multiple of 2 ... K")
print(">")
print("> Python version, interpreter version: {0}.{1}.{2}-{3}-{4}".format(
    pyver.major, pyver.minor, pyver.micro, pyver.releaselevel, pyver.serial))
print(">")

K = 20
print(" K = {0:d}".format(K))
print("")

t0 = timer()
N = K
NP1 = N + 1
N2 = (N >> 1) + 1
# list() so the entries can be modified under Python 3, where range() is immutable
vec = list(range(0, NP1))
smalestMultiple = 1

for i in range(2, N2):
    divider = vec[i]
    if divider == 1:
        continue
    for j in range(i << 1, NP1, i):
        if (vec[j] % divider) == 0:
            # integer division keeps the entries as ints under Python 3
            vec[j] //= divider

for i in range(2, NP1):
    if vec[i] != 1:
        smalestMultiple = smalestMultiple * vec[i]
t1 = timer()

print(" smalest multiple = {0:d}".format(smalestMultiple))
print(" time elapsed = {0:f} sec.".format(t1 - t0))
Output on Linux/Fedora 28/Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz:
> Smallest multiple of 2 ... K
>
> Python version, interpreter version: 2.7.15-final-0
>
> K = 20
>
> smalest multiple = 232792560
> time elapsed = 0.000032 sec.

