Original question: http://stackoverflow.com/questions/20816600/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Best and/or fastest way to create lists in python
Asked by ecampver
In python, as far as I know, there are at least 3 to 4 ways to create and initialize lists of a given size:
Simple loop with append:
my_list = []
for i in range(50):
    my_list.append(0)
Simple loop with +=:
my_list = []
for i in range(50):
    my_list += [0]
List comprehension:
my_list = [0 for i in range(50)]
List and integer multiplication:
my_list = [0] * 50
In these examples I don't think there would be any performance difference, given that the lists have only 50 elements. But what if I need a list of a million elements? Would using xrange make any improvement? Which is the preferred/fastest way to create and initialize lists in Python?
Accepted answer by ecampver
Let's run some time tests* with timeit.timeit:
>>> from timeit import timeit
>>>
>>> # Test 1
>>> test = """
... my_list = []
... for i in xrange(50):
...     my_list.append(0)
... """
>>> timeit(test)
22.384258893239178
>>>
>>> # Test 2
>>> test = """
... my_list = []
... for i in xrange(50):
...     my_list += [0]
... """
>>> timeit(test)
34.494779364416445
>>>
>>> # Test 3
>>> test = "my_list = [0 for i in xrange(50)]"
>>> timeit(test)
9.490926919482774
>>>
>>> # Test 4
>>> test = "my_list = [0] * 50"
>>> timeit(test)
1.5340533503559755
>>>
As you can see above, the last method is the fastest by far.
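For readers on Python 3, where xrange no longer exists, the same comparison can be sketched roughly as follows. This is only an illustration, not the original benchmark: the function names and the reduced `number` of runs are my own choices, and the absolute timings will differ from the figures above.

```python
from timeit import timeit

N = 50  # list size used in the question

def append_loop():
    my_list = []
    for _ in range(N):
        my_list.append(0)
    return my_list

def add_loop():
    my_list = []
    for _ in range(N):
        my_list += [0]
    return my_list

def comprehension():
    return [0 for _ in range(N)]

def multiplication():
    return [0] * N

candidates = [append_loop, add_loop, comprehension, multiplication]

# number is kept small (the default is 1,000,000) so the script finishes quickly
timings = {f.__name__: timeit(f, number=10_000) for f in candidates}

# Print the methods from fastest to slowest on this machine
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:15s} {seconds:.4f} s")
```

All four functions return the same list, so only the construction cost differs.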
However, it should only be used with immutable items (such as integers). This is because it creates a list whose elements are all references to the same object.
Below is a demonstration:
>>> lst = [[]] * 3
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are the same
>>> id(lst[0])
28734408
>>> id(lst[1])
28734408
>>> id(lst[2])
28734408
>>>
This behavior is very often undesirable and can lead to bugs in the code.
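To make the problem concrete, here is a small sketch of what the shared reference means in practice: a change made through one element shows up in all of them.

```python
lst = [[]] * 3      # three references to ONE inner list

lst[0].append(1)    # mutate the list through the first element only

print(lst)                          # [[1], [1], [1]]
print(lst[0] is lst[1] is lst[2])   # True: all three are the same object
```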
If you have mutable items (such as lists), then you should use the still very fast list comprehension:
>>> lst = [[] for _ in xrange(3)]
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are different
>>> id(lst[0])
28796688
>>> id(lst[1])
28796648
>>> id(lst[2])
28736168
>>>
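*Note: In all of the tests, I replaced range with xrange. In Python 2, xrange produces its values lazily instead of building a full list first, so it should be faster than range. In Python 3, xrange no longer exists and range itself is lazy.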
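As a rough illustration of that laziness on Python 3 (the variable names here are my own), a range object stays small no matter how many values it represents, while the equivalent list grows with n:

```python
import sys

n = 1_000_000
lazy = range(n)          # Python 3: no list is built here
eager = list(range(n))   # materializes all n integers

print(sys.getsizeof(lazy))    # small, and does not depend on n
print(sys.getsizeof(eager))   # grows with n
print(lazy[10], len(lazy))    # range still supports indexing and len()
```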
Answer by elyase
If you want to see how the timings depend on the list length n:
Pure python


I tested list lengths up to n=10000 and the behavior remains the same, so the integer multiplication method is the fastest by a clear margin.
Numpy
For lists with more than ~300 elements you should consider numpy.
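A minimal sketch of the NumPy route, assuming NumPy is installed. Keep in mind that the result is an array rather than a list: it has a fixed element type and no append method, so it is only a replacement when the rest of the code can work with arrays.

```python
import numpy as np

n = 1000
zeros_float = np.zeros(n)             # dtype defaults to float64
zeros_int = np.zeros(n, dtype=int)    # integer zeros, closer to [0] * n
as_list = zeros_int.tolist()          # convert back if a real list is needed

print(zeros_float.dtype, zeros_int.dtype, len(as_list))
```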


Benchmark code:
# Python 2 code: xrange is not available in Python 3,
# and time.clock() was removed in Python 3.8.
import time

def timeit(f):
    """Decorator: return the time taken by 100 calls of f."""
    def timed(*args, **kwargs):
        start = time.clock()
        for _ in range(100):
            f(*args, **kwargs)
        end = time.clock()
        return end - start
    return timed

@timeit
def append_loop(n):
    """Simple loop with append"""
    my_list = []
    for i in xrange(n):
        my_list.append(0)

@timeit
def add_loop(n):
    """Simple loop with +="""
    my_list = []
    for i in xrange(n):
        my_list += [0]

@timeit
def list_comprehension(n):
    """List comprehension"""
    my_list = [0 for i in xrange(n)]

@timeit
def integer_multiplication(n):
    """List and integer multiplication"""
    my_list = [0] * n

import numpy as np

@timeit
def numpy_array(n):
    my_list = np.zeros(n)

import pandas as pd

# One row of timings per list length n, then plot both columns against n
df = pd.DataFrame([(integer_multiplication(n), numpy_array(n)) for n in range(1000)],
                  columns=['Integer multiplication', 'Numpy array'])
df.plot()
Gist here.
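On Python 3 the timing decorator can be adapted along these lines. This is a sketch rather than the author's code: it swaps time.clock() for time.perf_counter() and xrange for range, keeps only two of the functions, and prints the timings for a handful of sizes chosen by me instead of plotting them.

```python
import time

def timeit(f):
    """Return a wrapper that runs f 100 times and reports elapsed seconds."""
    def timed(*args, **kwargs):
        start = time.perf_counter()
        for _ in range(100):
            f(*args, **kwargs)
        return time.perf_counter() - start
    return timed

@timeit
def list_comprehension(n):
    my_list = [0 for _ in range(n)]

@timeit
def integer_multiplication(n):
    my_list = [0] * n

# Elapsed seconds for 100 runs at each list length
for n in (10, 100, 1000, 10000):
    print(n, list_comprehension(n), integer_multiplication(n))
```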
Answer by Synedraacus
There is one more method which, while it sounds weird, is handy in the right circumstances. If you need to produce the same list many times (initializing a matrix for roguelike pathfinding and related stuff, in my case), you can store a copy of the list in a tuple, then turn it into a list when you need it. It is noticeably quicker than generating the list via comprehensions and, unlike list multiplication, works with nested data structures.
# In class definition
def __init__(self):
    self.l = [[1000 for x in range(1000)] for y in range(1000)]
    self.t = tuple(self.l)

def some_method(self):
    self.l = list(self.t)
    self._do_fancy_computation()
    # self.l is changed by this method

# Later in code:
for a in range(10):
    obj.some_method()
Voila, on every iteration you have a fresh copy of the same list in no time!
Disclaimer:
I do not have the slightest idea why this is so quick or whether it works anywhere outside CPython 3.4.
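One caveat worth checking before relying on this: tuple(...) and list(...) copy only the outer container, which is probably a large part of why the trick is so fast. The inner lists are shared between the stored tuple and every list made from it, so changing an inner element changes the template too. The sketch below uses small sizes and names of my own to show this, along with two ways to get independent inner lists.

```python
import copy

template = tuple([0] * 3 for _ in range(3))   # tuple holding three inner lists

shallow = list(template)      # new outer list, but the SAME inner lists
shallow[0][0] = 99
print(template[0][0])         # 99: the stored template was changed as well

template[0][0] = 0            # restore the template for the next step
per_row = [row[:] for row in template]   # copy each inner list (two levels)
deep = copy.deepcopy(template)           # copy every level; returns a tuple
per_row[0][0] = 7
deep[1][1] = 7
print(template)               # ([0, 0, 0], [0, 0, 0], [0, 0, 0])
```

If the computation only replaces whole rows and never changes individual cells, the shallow copy is enough; otherwise one of the two deeper copies is needed, at a higher cost.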
Answer by Idrisi_Kasim
If you want to create a list of incrementing values, i.e. adding 1 each time, pass a range to list(). The start value of range is included and the end value is excluded, as shown below:
list(range(10,20))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
If you want each element to be 2 more than the previous one, use the following. The third argument to range is the step:
list(range(10,20,2))
[10, 12, 14, 16, 18]
Now you can pass any start, end and step values and create many lists quickly and easily.
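A couple of further variations on the same idea, as a small sketch. The negative step and the list comprehension are my additions and were not part of the examples above.

```python
evens = list(range(10, 20, 2))        # [10, 12, 14, 16, 18]
countdown = list(range(20, 10, -2))   # [20, 18, 16, 14, 12]; a negative step counts down
squares = [i * i for i in range(5)]   # [0, 1, 4, 9, 16]; transform each value of a range

print(evens, countdown, squares)
```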
Thank you!
Happy Learning...:)

