Performance of python list(...).insert(...)

Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/1110332/

Performance of list(...).insert(...)

python, architecture, memory, list, memcpy

Asked by ilya n.

I thought about the following question about computer's architecture. Suppose I do in Python


from bisect import bisect
index = bisect(x, a)      # O(log n)  (also, shouldn't it be a standard list function?)
x.insert(index, a)        # O(1) + memcpy()

which takes log n, plus, if I understand it correctly, a memory copy operation for x[index:]. Now, I read recently that the bottleneck is usually in the communication between the processor and the memory, so the memory copy could be done by the RAM quite fast. Is that how it works?

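As a side note on the parenthetical above: the standard bisect module already bundles the search and the insert into a single call, bisect.insort, although that does not change the cost of the insertion itself. A minimal sketch:

from bisect import insort

x = [1, 3, 5, 7]
insort(x, 4)       # bisect_right to find the position, then list.insert
print(x)           # [1, 3, 4, 5, 7] -- the insert step still shifts elements, O(n)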

Answered by Stephan202

Python is a language. Multiple implementations exist, and they may have different implementations for lists. So, without looking at the code of an actual implementation, you cannot know for sure how lists are implemented and how they behave under certain circumstances.


My bet would be that the references to the objects in a list are stored in contiguous memory (certainly not as a linked list...). If that is indeed so, then insertion using x.insert will cause all elements behind the inserted element to be moved. This may be done efficiently by the hardware, but the complexity would still be O(n).

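A rough way to observe this shifting cost is to time inserts at the front of lists of increasing size; a minimal sketch using the standard timeit module (the sizes and repeat count are arbitrary):

import timeit

for n in (10_000, 100_000, 1_000_000):
    # Time 1000 inserts at position 0 of a list of about n elements; each
    # insert has to shift every existing reference, so the total time should
    # grow roughly linearly with n.
    t = timeit.timeit("lst.insert(0, None)",
                      setup=f"lst = list(range({n}))",
                      number=1000)
    print(f"n={n:>9}: {t:.4f} s for 1000 front inserts")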

For small lists the bisect operation may take more time than x.insert, even though the former is O(log n) while the latter is O(n). For long lists, however, I'd hazard a guess that x.insert is the bottleneck. In such cases you must consider using a different data structure.

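One way to check where that crossover lies for your own data is to time the two steps separately; a minimal sketch (the list sizes, probe value and repeat count are arbitrary):

import timeit

for n in (100, 1_000_000):
    setup = f"from bisect import bisect; lst = list(range({n})); a = {n} // 2"
    t_search = timeit.timeit("bisect(lst, a)", setup=setup, number=10_000)
    # Insert and then pop at the same index so the list keeps its original size
    # between repetitions; both operations shift the tail of the list.
    t_insert = timeit.timeit("lst.insert(a, a); lst.pop(a)", setup=setup, number=10_000)
    print(f"n={n:>9}: bisect {t_search:.4f} s, insert+pop {t_insert:.4f} s")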

Answered by Seun Osewa

Use the blist module if you need a list with better insert performance.

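A minimal sketch, assuming the third-party blist package is installed (e.g. pip install blist) and that its blist type is the drop-in list replacement with cheaper insertion that its documentation describes:

from bisect import bisect
from blist import blist

x = blist(range(1_000_000))    # B+tree-backed sequence instead of a flat array
a = 123_456
x.insert(bisect(x, a), a)      # insertion no longer shifts every later reference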

Answered by Ants Aasma

CPython lists are contiguous arrays. Whether the O(log n) bisect or the O(n) insert dominates your performance profile depends on the size of your list and on the constant factors hidden inside the O(). In particular, the comparison function invoked by bisect can be expensive, depending on the type of the objects in the list.

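To get a feel for those constant factors, note that bisect performs roughly log2(n) comparisons between the probe and list elements; if those comparisons are expensive, the O(log n) search carries a large constant factor. A minimal sketch that counts them (the Counted class is made up for illustration):

from bisect import bisect

class Counted(int):
    """int subclass that counts how many times its __lt__ is invoked."""
    comparisons = 0
    def __lt__(self, other):
        Counted.comparisons += 1
        return int.__lt__(self, other)

lst = list(range(1_000_000))
bisect(lst, Counted(123_456))   # the probe object drives the comparisons
print(Counted.comparisons)      # about log2(1_000_000) ~= 20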

If you need to hold potentially large mutable sorted sequences, then the linear array underlying Python's list type isn't a good choice. Depending on your requirements, heaps, trees or skip-lists might be appropriate.

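For example, if all you need is repeated access to the smallest element, the standard-library heapq module gives O(log n) insertion without keeping a fully sorted array; a minimal sketch:

import heapq

heap = []
for value in (5, 1, 9, 3):
    heapq.heappush(heap, value)   # O(log n) per insertion

smallest = heapq.heappop(heap)    # 1; also O(log n), but no cheap access by arbitrary index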