Python: From ND to 1D arrays

Disclaimer: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original: http://stackoverflow.com/questions/13730468/

Date: 2020-08-18 09:29:18  Source: igfitidea

From ND to 1D arrays

python, numpy

Asked by Amelio Vazquez-Reina

Say I have an array a:

a = np.array([[1,2,3], [4,5,6]])

array([[1, 2, 3],
       [4, 5, 6]])

I would like to convert it to a 1D array (i.e. a column vector):

b = np.reshape(a, (1, np.prod(a.shape)))

but this returns

array([[1, 2, 3, 4, 5, 6]])

which is not the same as:

array([1, 2, 3, 4, 5, 6])

I can take the first element of this array to manually convert it to a 1D array:

b = np.reshape(a, (1, np.prod(a.shape)))[0]

but this requires me to know how many dimensions the original array has (and to chain additional [0]'s when working with higher dimensions).

Is there a dimensions-independent way of getting a column/row vector from an arbitrary ndarray?

Accepted answer by unutbu

Use np.ravel (for a 1D view) or np.ndarray.flatten (for a 1D copy) or np.ndarray.flat (for a 1D iterator):

In [12]: a = np.array([[1,2,3], [4,5,6]])

In [13]: b = a.ravel()

In [14]: b
Out[14]: array([1, 2, 3, 4, 5, 6])

Note that ravel() returns a view of a when possible. So modifying b also modifies a. ravel() returns a view when the 1D elements are contiguous in memory, but would return a copy if, for example, a were made from slicing another array using a non-unit step size (e.g. a = x[::2]).
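
The view-versus-copy behaviour described above can be checked with np.shares_memory. This is a minimal sketch; the arrays x and s are illustrative assumptions, not part of the original answer.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Contiguous array: ravel() can return a view, so writes go through to a.
b = a.ravel()
b[0] = 100
view_shares = np.shares_memory(a, b)
first_after_write = a[0, 0]

# Slicing with a non-unit step gives a non-contiguous array, so ravel() has to copy.
x = np.arange(12).reshape(3, 4)
s = x[:, ::2]
r = s.ravel()
copy_shares = np.shares_memory(s, r)

# flatten() always returns a copy, even for a contiguous array.
c = a.flatten()
flatten_shares = np.shares_memory(a, c)

print(view_shares, first_after_write, copy_shares, flatten_shares)
```

If the write to b must not touch a, flatten() (or an explicit .copy()) is the safer choice.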

If you want a copy rather than a view, use

In [15]: c = a.flatten()

If you just want an iterator, use np.ndarray.flat:

In [20]: d = a.flat

In [21]: d
Out[21]: <numpy.flatiter object at 0x8ec2068>

In [22]: list(d)
Out[22]: [1, 2, 3, 4, 5, 6]
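
As a small sketch of what the flat iterator offers beyond list(d): it can be looped over directly, and it also accepts 1D-style indexing and assignment that write into the original array. The variable names here are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Iterating over a.flat visits elements in row-major (C) order,
# whatever the number of dimensions.
total = 0
for value in a.flat:
    total += value

# a.flat supports flat indexing and assignment into the original array.
third = a.flat[2]
a.flat[4] = 50

print(total, third, a)
```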

Answer by NPE

In [14]: b = np.reshape(a, (np.prod(a.shape),))

In [15]: b
Out[15]: array([1, 2, 3, 4, 5, 6])

or, simply:

In [16]: a.flatten()
Out[16]: array([1, 2, 3, 4, 5, 6])
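
A related sketch: reshape accepts -1 for one axis and infers its length, so the shape product does not have to be computed by hand and the call is the same for any number of dimensions. The 3D array a3 below is an assumption chosen for illustration.

```python
import numpy as np

# A 3D array used purely for illustration.
a3 = np.arange(24).reshape(2, 3, 4)

flat = a3.reshape(-1)      # truly 1D: shape (24,)
row = a3.reshape(1, -1)    # 2D row vector: shape (1, 24)
col = a3.reshape(-1, 1)    # 2D column vector: shape (24, 1)

print(flat.shape, row.shape, col.shape)
```

Note that row and column vectors are still 2D arrays; only the first form is a genuine 1D array.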

Answer by xcellsior

Although this isn't using the np array format (too lazy to modify my code), this should do what you want... If you truly want a column vector, you will want to transpose the vector result. It all depends on how you are planning to use this.

def getVector(data_array,col):
    vector = []
    imax = len(data_array)
    for i in range(imax):
        vector.append(data_array[i][col])
    return ( vector )
a = ([1,2,3], [4,5,6])
b = getVector(a,1)
print(b)

Output: [2, 5]

So if you need to transpose, you can do something like this:

def transposeArray(data_array):
    # need to test if this is a 1D array
    # can't do a len(data_array[0]) if it's 1D
    if not isinstance(data_array[0], list):
        # 1D input: return a column vector, one single-element row per value
        return [[value] for value in data_array]
    dimx = len(data_array[0])  # number of columns in the input
    dimy = len(data_array)     # number of rows in the input
    # init output transposed array: dimx rows of dimy columns
    data_array_t = [[0 for col in range(dimy)] for row in range(dimx)]
    # fill output transposed array
    for i in range(dimx):
        for j in range(dimy):
            data_array_t[i][j] = data_array[j][i]
    return data_array_t
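
For comparison, when the data is already a NumPy array, column extraction and transposition can be sketched with indexing and .T. This is an assumed NumPy alternative to the pure-Python helpers in this answer, not part of the original answer.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Column 1 as a 1D array (the same values getVector(a, 1) returns as a list).
col = a[:, 1]

# The same column as an explicit 2D column vector.
col_vec = col.reshape(-1, 1)

# Transpose of the whole array.
a_t = a.T

print(col, col_vec.shape, a_t.shape)
```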

Answer by bikram

For a list of arrays with different sizes, use the following:

import numpy as np

# ND array list with different size
a = [[1],[2,3,4,5],[6,7,8]]

# stack them
b = np.hstack(a)

print(b)

Output:

[1 2 3 4 5 6 7 8]
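
If the pieces differ in dimensionality as well as length, one possible variation is to ravel each piece before joining. This is a sketch under that assumption; the parts list is made up for illustration.

```python
import numpy as np

# Pieces with different lengths and different numbers of dimensions.
parts = [np.array([1]), np.array([[2, 3], [4, 5]]), [6, 7, 8]]

# Flatten each piece to 1D first, then join them end to end.
flat = np.concatenate([np.ravel(p) for p in parts])

print(flat)
```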

Answer by DINA TAKLIT

One of the simplest ways is to use flatten(), as in this example:

import numpy as np

batch_y = train_output.iloc[sample, :]
batch_y = np.array(batch_y).flatten()

My array looked like this:

    0
0   6
1   6
2   5
3   4
4   3
.
.
.

After using flatten():

array([6, 6, 5, ..., 5, 3, 6])

It is also the fix for errors of this type:

Cannot feed value of shape (100, 1) for Tensor 'input/Y:0', which has shape '(?,)' 
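
The shape mismatch in that message can be reproduced with NumPy alone. In this sketch batch_y is stand-in data (an assumption, since the original train_output is not shown); flatten() turns the (100, 1) column into the (100,) shape the message asks for.

```python
import numpy as np

# Stand-in for a single-column batch: shape (100, 1).
batch_y = np.arange(100).reshape(100, 1)

# flatten() drops the trailing axis and yields shape (100,).
flat_y = batch_y.flatten()

print(batch_y.shape, flat_y.shape)
```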

Answer by haku

I wanted to see benchmark results for the functions mentioned in the answers, including unutbu's.

I also want to point out that the numpy doc recommends using arr.reshape(-1) when a view is preferable (even though ravel is a tad faster in the following results).

TL;DR: np.ravel is the most performant (by a very small amount).

Benchmark

Functions: ravel, reshape(-1), flatten, flat

numpy version: '1.18.0'

Execution times (total seconds for 1000 calls) on different ndarray sizes

+-------------+----------+-----------+-----------+-------------+
|  function   |   10x10  |  100x100  | 1000x1000 | 10000x10000 |
+-------------+----------+-----------+-----------+-------------+
| ravel       | 0.002073 |  0.002123 |  0.002153 |    0.002077 |
| reshape(-1) | 0.002612 |  0.002635 |  0.002674 |    0.002701 |
| flatten     | 0.000810 |  0.007467 |  0.587538 |  107.321913 |
| flat        | 0.000337 |  0.000255 |  0.000227 |    0.000216 |
+-------------+----------+-----------+-----------+-------------+

Conclusion

ravel and reshape(-1)'s execution times were consistent and independent of ndarray size. ravel is a tad faster, but reshape provides flexibility in reshaping size (maybe that's why the numpy doc recommends using it instead, or there could be some cases where reshape returns a view and ravel doesn't).
If you are dealing with a large ndarray, using flatten can cause a performance issue. I recommend not using it unless you need a copy of the data to do something else.
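
One case where reshape(-1) and ravel appear to differ is a strided 1D slice. The sketch below only reports what NumPy does on the installed version; the arrays x and s are illustrative, and the behaviour is an observation to verify locally rather than a documented guarantee.

```python
import numpy as np

x = np.arange(10)
s = x[::2]                 # 1D, but strided (not contiguous in memory)

reshaped = s.reshape(-1)   # can reuse the strides of s
raveled = s.ravel()        # default C order on a non-contiguous input

# shares_memory tells us whether each result is a view of s or a fresh copy.
reshape_is_view = np.shares_memory(s, reshaped)
ravel_is_view = np.shares_memory(s, raveled)

print(reshape_is_view, ravel_is_view)
```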

Code used

import timeit
setup = '''
import numpy as np
nd = np.random.randint(10, size=(10, 10))
'''

# The statements do not rebind nd, so every call operates on the original 2D array.
timeit.timeit('np.reshape(nd, -1)', setup=setup, number=1000)
timeit.timeit('np.ravel(nd)', setup=setup, number=1000)
timeit.timeit('nd.flatten()', setup=setup, number=1000)
timeit.timeit('nd.flat', setup=setup, number=1000)
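
To repeat the measurement over several array sizes, the snippet can be wrapped in a loop. This is a sketch, not the script that produced the table above: the benchmark helper, the STATEMENTS mapping and the chosen sizes are assumptions, and the timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np

# Statements under test; none of them rebinds nd, so each call sees the 2D input.
STATEMENTS = {
    "ravel": "np.ravel(nd)",
    "reshape(-1)": "np.reshape(nd, -1)",
    "flatten": "nd.flatten()",
    "flat": "nd.flat",
}


def benchmark(sizes, number=1000):
    """Return {n: {label: total seconds for `number` calls}} for n x n arrays."""
    results = {}
    for n in sizes:
        nd = np.random.randint(10, size=(n, n))
        results[n] = {
            label: timeit.timeit(stmt, globals={"np": np, "nd": nd}, number=number)
            for label, stmt in STATEMENTS.items()
        }
    return results


# Small sizes and a low call count keep this quick; raise them for real measurements.
results = benchmark([10, 100], number=100)
for n, timings in results.items():
    print(n, timings)
```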