Python What is the preferred way to preallocate NumPy arrays?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/3491802/
Asked by kim busyn
I am new to NumPy/SciPy. From the documentation, it seems more efficient to preallocate a single array rather than call append/insert/concatenate.
For example, to add a column of 1's to an array, I think that this:
ar0 = np.linspace(10, 20, 16).reshape(4, 4)
ar0[:,-1] = np.ones_like(ar0[:,0])
is preferred to this:
ar0 = np.linspace(10, 20, 12).reshape(4, 3)
ar0 = np.insert(ar0, ar0.shape[1], np.ones_like(ar0[:,0]), axis=1)
My first question is whether this is correct (that the first is better). My second question: at the moment, I am just preallocating my arrays like this (which I noticed in several of the Cookbook examples on the SciPy site):
np.zeros((8,5))
What is the 'NumPy-preferred' way to do this?
Accepted answer by unutbu
Preallocation allocates all the memory you need in one call, while growing the array (through calls to append, insert, concatenate, or resize) generally requires copying it to a new, larger block of memory. So you are correct: preallocation is preferred over (and should be faster than) resizing.
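As a rough sketch of the difference, assume a loop that produces one row at a time. The function names fill_preallocated and grow_with_append are made up for illustration and are not from the answer. The preallocated version writes into memory it already owns, while the np.append version returns a new, larger array on every iteration:

```python
import numpy as np

def fill_preallocated(n_rows, n_cols):
    """Allocate the full array once, then write each row in place."""
    out = np.empty((n_rows, n_cols))
    for i in range(n_rows):
        out[i] = i  # the scalar is broadcast across the row
    return out

def grow_with_append(n_rows, n_cols):
    """Start with zero rows and append one row per iteration."""
    out = np.empty((0, n_cols))
    for i in range(n_rows):
        row = np.full((1, n_cols), float(i))
        out = np.append(out, row, axis=0)  # builds a new array and copies the old data
    return out

a = fill_preallocated(5, 3)
b = grow_with_append(5, 3)
print(np.array_equal(a, b))
```

Both functions produce the same values; only the number of allocations and copies differs.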
There are a number of "preferred" ways to preallocate NumPy arrays, depending on what you want to create. There are np.zeros, np.ones, np.empty, np.zeros_like, np.ones_like, and np.empty_like, and many others that create useful arrays, such as np.linspace and np.arange.
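A small illustration of the constructors named above, using a made-up 4x3 array called template as the model for the *_like variants (the variable names are only for this sketch):

```python
import numpy as np

template = np.linspace(10, 20, 12).reshape(4, 3)

zeros = np.zeros((4, 3))      # every element is 0.0
ones = np.ones((4, 3))        # every element is 1.0
scratch = np.empty((4, 3))    # uninitialized: contents are arbitrary until assigned

# The *_like variants take their shape and dtype from an existing array.
zeros_like = np.zeros_like(template)
ones_like = np.ones_like(template)
scratch_like = np.empty_like(template)

# np.empty is only safe if every element is written before it is read.
scratch[:] = template
```

np.empty skips initialization, so it suits buffers that will be completely overwritten; np.zeros or np.ones are the safer choice when some elements might be read before being set.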
So
ar0 = np.linspace(10, 20, 16).reshape(4, 4)
is just fine if this comes closest to the ar0 you desire.
However, to make the last column all 1's, I think the preferred way would be to just say
ar0[:,-1]=1
Since the shape of ar0[:,-1] is (4,), the 1 is broadcast to match this shape.
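The broadcast assignment can be checked directly. The second half of this sketch is an assumption about how the same idea might apply to the question's 4x3 case: it preallocates a 4x4 array, copies the data in, and compares the result with np.insert. The names data, out, and same are illustrative only.

```python
import numpy as np

ar0 = np.linspace(10, 20, 16).reshape(4, 4)
print(ar0[:, -1].shape)   # (4,)

ar0[:, -1] = 1            # the scalar 1 is broadcast to all four rows
print(ar0[:, -1])         # [1. 1. 1. 1.]

# Preallocate the final shape, then fill it, instead of inserting a column.
data = np.linspace(10, 20, 12).reshape(4, 3)
out = np.empty((4, 4))
out[:, :3] = data         # copy the existing columns
out[:, -1] = 1            # broadcast 1 into the extra column

same = np.array_equal(out, np.insert(data, data.shape[1], 1, axis=1))
print(same)
```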
Answer by Justas
In cases where performance is important, np.empty and np.zeros appear to be the fastest ways to initialize NumPy arrays.
Below are test results for each method and a few others. Values are in seconds (total time for 1000 calls each).
>>> import numpy as np
>>> from timeit import timeit
>>> timeit("np.empty(1000000)",number=1000, globals=globals())
0.033749611208094166
>>> timeit("np.zeros(1000000)",number=1000, globals=globals())
0.03421245135849915
>>> timeit("np.arange(0,1000000,1)",number=1000, globals=globals())
1.2212416112155324
>>> timeit("np.ones(1000000)",number=1000, globals=globals())
2.2877375495381145
>>> timeit("np.linspace(0,1000000,1000000)",number=1000, globals=globals())
3.0824269766860652
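To reproduce the comparison outside an interactive session, a script along these lines could be used. It is only a sketch: the names N, RUNS, statements, and results are invented here, and RUNS is deliberately smaller than the 1000 repetitions above to keep the run short. Absolute numbers will vary with hardware, operating system, and NumPy version.

```python
import numpy as np
from timeit import timeit

N = 1_000_000   # array length, matching the session above
RUNS = 20       # assumption: fewer repetitions than the original 1000

statements = [
    "np.empty(N)",
    "np.zeros(N)",
    "np.arange(0, N, 1)",
    "np.ones(N)",
    "np.linspace(0, N, N)",
]

# Time each statement with the same namespace so the comparison is like for like.
results = {}
for stmt in statements:
    results[stmt] = timeit(stmt, number=RUNS, globals={"np": np, "N": N})

# Report fastest first.
for stmt, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{stmt:<24} {seconds:.4f} s total over {RUNS} runs")
```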

