Python: flatten a list of NumPy arrays?

Question

Asked by Jerry Zhang

It appears that I have data in the format of a list of NumPy arrays (type() = np.ndarray):

[array([[ 0.00353654]]), array([[ 0.00353654]]), array([[ 0.00353654]]), 
array([[ 0.00353654]]), array([[ 0.00353654]]), array([[ 0.00353654]]), 
array([[ 0.00353654]]), array([[ 0.00353654]]), array([[ 0.00353654]]), 
array([[ 0.00353654]]), array([[ 0.00353654]]), array([[ 0.00353654]]),
array([[ 0.00353654]])]

I am trying to put this into a polyfit function:

m1 = np.polyfit(x, y, deg=2)

However, it returns the error: TypeError: expected 1D vector for x

I assume I need to flatten my data into something like:

[0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654 ...]

I tried a list comprehension, which usually works on lists of lists, but as expected it did not work here:

[val for sublist in risks for val in sublist]

What would be the best way to do this?

Answer 1

Accepted answer by Divakar

You could use numpy.concatenate, which, as the name suggests, concatenates all the elements of such an input list into a single NumPy array. Calling .ravel() on the result then flattens it to 1-D, like so -

import numpy as np
out = np.concatenate(input_list).ravel()

If you want the final output to be a list, you can extend the solution with .tolist(), like so -

out = np.concatenate(input_list).ravel().tolist()

Sample run -

In [24]: input_list
Out[24]: 
[array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]])]

In [25]: np.concatenate(input_list).ravel()
Out[25]: 
array([ 0.00353654,  0.00353654,  0.00353654,  0.00353654,  0.00353654,
        0.00353654,  0.00353654,  0.00353654,  0.00353654,  0.00353654,
        0.00353654,  0.00353654,  0.00353654])

Convert to list -

In [26]: np.concatenate(input_list).ravel().tolist()
Out[26]: 
[0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654,
 0.00353654]
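
To tie this back to the original polyfit call, here is a minimal end-to-end sketch. The data is made up for illustration (x_list and y_list are hypothetical names, and the targets follow an exact quadratic so the fit is easy to check); only np.concatenate, ravel and np.polyfit come from the answer and the question.

```python
import numpy as np

# Hypothetical data: 13 arrays of shape (1, 1), like the list in the question.
x_list = [np.array([[float(i)]]) for i in range(13)]
# Hypothetical targets that follow y = 2x^2 + 3x + 1 exactly.
y_list = [2.0 * a**2 + 3.0 * a + 1.0 for a in x_list]

# Flatten both lists into 1-D vectors, as np.polyfit expects.
x = np.concatenate(x_list).ravel()   # shape (13,)
y = np.concatenate(y_list).ravel()   # shape (13,)

# Coefficients come back highest power first.
coeffs = np.polyfit(x, y, deg=2)
print(coeffs)
```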

Answer 2

Answer by zsatter14

I came across this same issue and found a solution that combines 2-D NumPy arrays of shape (1, n) whose length n varies:

np.column_stack(input_list).ravel()

See numpy.column_stack for more info.

Example with variable-length arrays built from your example data:

In [135]: input_list
Out[135]: 
[array([[ 0.00353654,  0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654]]),
 array([[ 0.00353654,  0.00353654,  0.00353654]])]

In [136]: [i.size for i in input_list]    # variable size arrays
Out[136]: [2, 1, 1, 3]

In [137]: np.column_stack(input_list).ravel()
Out[137]: 
array([ 0.00353654,  0.00353654,  0.00353654,  0.00353654,  0.00353654,
        0.00353654,  0.00353654])

Note: Only tested on Python 2.7.12
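
One caveat, as a hedged sketch with made-up values: column_stack leaves 2-D inputs as they are and joins them along axis 1, which is why rows of shape (1, n) with different n work. Truly 1-D inputs are turned into columns first, so different lengths raise a ValueError there, and np.concatenate is the simpler choice for that case. The variable names below are illustrative only.

```python
import numpy as np

# 2-D rows of shape (1, n): column_stack joins them along axis 1.
rows = [np.array([[1.0, 2.0]]), np.array([[3.0]]), np.array([[4.0, 5.0, 6.0]])]
flat_rows = np.column_stack(rows).ravel()

# 1-D arrays of different lengths: each one becomes a column,
# so the lengths must match and this call raises ValueError.
vectors = [np.array([1.0, 2.0]), np.array([3.0]), np.array([4.0, 5.0, 6.0])]
try:
    np.column_stack(vectors)
    column_stack_failed = False
except ValueError:
    column_stack_failed = True

# np.concatenate joins 1-D arrays end to end regardless of length.
flat_vectors = np.concatenate(vectors)
```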

Answer 3

Answer by ayorgo

This can also be done with

np.array(list_of_arrays).flatten().tolist()

resulting in

[0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654]

Update

As @aydow points out in the comments, using numpy.ndarray.ravel can be faster if one doesn't care whether the result is a copy or a view:

np.array(list_of_arrays).ravel()

However, according to the docs:

When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.

In other words

np.array(list_of_arrays).reshape(-1)

My initial suggestion was to use numpy.ndarray.flatten, which returns a copy every time, and that affects performance.
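
The copy-versus-view difference can be checked directly with np.shares_memory. This is a small sketch using the same shapes as the question; the variable names are mine, and the view behaviour of ravel shown here relies on the stacked array being contiguous.

```python
import numpy as np

list_of_arrays = [np.array([[0.00353654]])] * 13   # same shapes as in the question
stacked = np.array(list_of_arrays)                 # shape (13, 1, 1)

flat_copy = stacked.flatten()        # always a new array
flat_ravel = stacked.ravel()         # a view here, since `stacked` is contiguous
flat_reshape = stacked.reshape(-1)   # also a view here

print(np.shares_memory(stacked, flat_copy))      # False
print(np.shares_memory(stacked, flat_ravel))     # True
print(np.shares_memory(stacked, flat_reshape))   # True
```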

Let's now see how the time complexity of the above-listed solutions compares, using the perfplot package for a setup similar to the OP's:

import numpy as np
import perfplot

perfplot.show(
    setup=lambda n: np.random.rand(n, 2),
    kernels=[lambda a: a.ravel(),
             lambda a: a.flatten(),
             lambda a: a.reshape(-1)],
    labels=['ravel', 'flatten', 'reshape'],
    n_range=[2**k for k in range(16)],
    xlabel='N')

Here flatten demonstrates piecewise linear complexity, which can be reasonably explained by it making a copy of the initial array, compared to the constant complexity of ravel and reshape, which return a view.

It's also worth noting that, quite predictably, converting the outputs with .tolist() evens out the performance of all three to equally linear.
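
The linear cost is what you would expect, since .tolist() has to build one Python object per element. A related detail, shown in the short sketch below (names are illustrative): .tolist() yields built-in floats, whereas list(arr) keeps NumPy scalars.

```python
import numpy as np

arr = np.concatenate([np.array([[0.00353654]])] * 3).ravel()

as_python = arr.tolist()   # elements are built-in float objects
as_numpy = list(arr)       # elements stay np.float64 scalars

print(type(as_python[0]), type(as_numpy[0]))
```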

Answer 4

Answer by kmario23

Another simple approach would be to use numpy.hstack() followed by removing the singleton dimension using squeeze(), as in:

In [61]: np.hstack(list_of_arrs).squeeze()
Out[61]: 
array([0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
       0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
       0.00353654, 0.00353654, 0.00353654])
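
One edge case to keep in mind with squeeze(), sketched here with illustrative variable names: if the list holds a single 1x1 array, squeeze() removes every axis and leaves a zero-dimensional array rather than a 1-D vector. ravel() keeps one axis in that case.

```python
import numpy as np

many = [np.array([[0.00353654]])] * 13
single = [np.array([[0.00353654]])]

print(np.hstack(many).squeeze().shape)     # (13,)
print(np.hstack(single).squeeze().shape)   # ()  zero-dimensional
print(np.hstack(single).ravel().shape)     # (1,)
```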

Answer 5

Answer by Tim Skov Jacobsen

Another way, using itertools to flatten the list of arrays:

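import itertools

import numpy as np

# Recreate the list from the question: 13 arrays of shape (1, 1)
a = [np.array([[0.00353654]])] * 13

# Each array is 2-D, so ravel it first; chain.from_iterable alone removes only
# one nesting level and would yield 1-element arrays instead of scalars
flattened = list(itertools.chain.from_iterable(arr.ravel() for arr in a))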

This solution should be very fast, see https://stackoverflow.com/a/408281/5993892 for more explanation.
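
Because each array in the question is 2-D, it is worth checking how many nesting levels chain.from_iterable actually removes. A minimal sketch (variable names are illustrative): chaining the arrays directly yields 1-D rows, while raveling each array first yields scalars.

```python
import itertools

import numpy as np

a = [np.array([[0.00353654]])] * 13

# Iterating a (1, 1) array yields its rows, so one level is removed:
one_level = list(itertools.chain.from_iterable(a))
print(one_level[0].shape)        # (1,)

# Raveling each array first gives a flat sequence of scalars:
two_levels = list(itertools.chain.from_iterable(arr.ravel() for arr in a))
print(np.ndim(two_levels[0]))    # 0
```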

If the resulting data structure should be a NumPy array instead, use numpy.fromiter() to exhaust the iterator into an array:

# Ravel each 2-D array so the iterator yields scalars, then build a 1-D float array from it
flattened_array = np.fromiter(itertools.chain.from_iterable(arr.ravel() for arr in a), dtype=float)

Docs for itertools.chain.from_iterable(): https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable

Docs for numpy.fromiter(): https://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html
