Python: How do I stack vectors of different lengths in NumPy?

Question

Asked by mac389

How do I stack n vectors of shape (x,) column-wise, where x could be any number?

For example,

import numpy as np
a = np.ones((3,))
b = np.ones((2,))

c = np.vstack((a, b))  # <-- gives an error
c = np.vstack((a[:, np.newaxis], b[:, np.newaxis]))  # <-- no error, but the result has shape (5, 1), not two columns

hstack works fine but concatenates along the wrong dimension.

Answer 1

Accepted answer by Fred Foo

Short answer: you can't. NumPy does not support jagged arrays natively.

Long answer:

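>>> import numpy as np
>>> a = np.ones((3,))
>>> b = np.ones((2,))
>>> # Ragged input requires dtype=object in NumPy >= 1.24 (a ValueError otherwise)
>>> c = np.array([a, b], dtype=object)
>>> c
array([array([1., 1., 1.]), array([1., 1.])], dtype=object)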

gives an array that may or may not behave as you expect. E.g. basic methods like sum or reshape don't work on it the way they do on a regular 2-D array, and you should treat it much as you'd treat the ordinary Python list [a, b] (iterate over it to perform operations instead of using vectorized idioms).
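
As a minimal sketch of the "iterate instead of vectorizing" advice, assuming a recent NumPy release where ragged input needs an explicit dtype=object (the names per_vector_sums and lengths are illustrative):

```python
import numpy as np

a = np.ones((3,))
b = np.ones((2,))

# A 1-D object array whose two elements are themselves arrays.
c = np.array([a, b], dtype=object)

# Operate on each inner vector in turn, exactly as with a Python list.
per_vector_sums = [v.sum() for v in c]
lengths = [len(v) for v in c]

print(c.shape)           # the outer array has one entry per vector
print(per_vector_sums)
print(lengths)
```

The outer array only knows that it holds two Python objects, so per-element work has to be done in a loop or comprehension.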

Several possible workarounds exist; the easiest is to coerce a and b to a common length, perhaps using masked arrays or NaN to signal that some indices are invalid in some rows. E.g. here's b as a masked array:

>>> np.ma.array(np.resize(b, a.shape[0]), mask=[False, False, True])
masked_array(data=[1.0, 1.0, --],
             mask=[False, False,  True],
       fill_value=1e+20)

This can be stacked with a as follows:

>>> np.ma.vstack([a, np.ma.array(np.resize(b, a.shape[0]), mask=[False, False, True])])
masked_array(
  data=[[1.0, 1.0, 1.0],
        [1.0, 1.0, --]],
  mask=[[False, False, False],
        [False, False,  True]],
  fill_value=1e+20)
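
The same idea can be generalized to any number of vectors. The following is only a sketch: stack_masked is a hypothetical helper name, not a NumPy function, and it assumes all inputs are 1-D and numeric.

```python
import numpy as np

def stack_masked(vectors):
    """Stack 1-D vectors of unequal length as rows of a masked array."""
    width = max(len(v) for v in vectors)
    data = np.zeros((len(vectors), width))
    mask = np.ones((len(vectors), width), dtype=bool)  # True marks an invalid slot
    for i, v in enumerate(vectors):
        data[i, :len(v)] = v       # copy the real values
        mask[i, :len(v)] = False   # and mark them as valid
    return np.ma.masked_array(data, mask=mask)

stacked = stack_masked([np.ones(3), np.ones(2)])
row_sums = stacked.sum(axis=1)     # masked entries are skipped
col_means = stacked.mean(axis=0)   # each column averages only its valid entries
columns = stacked.T                # transpose for a column-wise layout
```

Masked reductions ignore the padding, so the padded slot in the shorter vector does not distort sums or means.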

(For some purposes, scipy.sparse may also be interesting.)

Answer 2

Answer by Vincenzooo

In general, there is an ambiguity in putting together arrays of different lengths because the alignment of the data might matter. Pandas has several advanced solutions to deal with that, e.g. merging Series into DataFrames.
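
As a rough sketch of the pandas route, assuming pandas is installed (the column names "a" and "b" are illustrative), concatenating Series along axis=1 aligns them on their index and fills the missing positions with NaN:

```python
import numpy as np
import pandas as pd

a = np.ones((3,))
b = np.ones((2,))

# Each Series gets the default index 0..len-1; concat aligns on that index.
df = pd.concat([pd.Series(a, name="a"), pd.Series(b, name="b")], axis=1)

as_array = df.to_numpy()  # back to a plain (3, 2) NumPy array, NaN-padded
print(df)
```

Alignment here is by index label, so Series with a different index (dates, for instance) would line up by label rather than by position.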

If you just want to populate columns starting from the first element, what I usually do is build a matrix and fill in its columns. Of course, you need to fill the empty spaces in the matrix with a null value (in this case np.nan).

import numpy as np

a = np.ones((3,))
b = np.ones((2,))
arraylist = [a, b]

# Allocate a NaN-filled array: rows = longest vector, columns = number of vectors
outarr = np.full((max(len(v) for v in arraylist), len(arraylist)), np.nan)
for i, c in enumerate(arraylist):  # populate columns
    outarr[:len(c), i] = c

In [108]: outarr
Out[108]: 
array([[  1.,   1.],
       [  1.,   1.],
       [  1.,  nan]])
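
Once the columns are NaN-padded, ordinary reductions propagate the NaN, so the NaN-aware variants are usually what you want. A small sketch, where stack_columns_nan is a hypothetical helper wrapping the loop above:

```python
import numpy as np

def stack_columns_nan(vectors):
    """Place 1-D vectors side by side as columns, padding with NaN."""
    out = np.full((max(len(v) for v in vectors), len(vectors)), np.nan)
    for i, v in enumerate(vectors):
        out[:len(v), i] = v
    return out

outarr = stack_columns_nan([np.ones(3), np.ones(2)])

col_sums = np.nansum(outarr, axis=0)            # padding is ignored
col_counts = np.sum(~np.isnan(outarr), axis=0)  # valid entries per column
plain_sum = outarr.sum(axis=0)                  # NaN propagates into the short column
```

Note that NaN padding only works for floating-point data; integer arrays have no NaN, which is one reason to prefer a masked array in that case.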

Answer 3

Answer by j08lue

There is a new library for efficiently handling this type of array: https://github.com/scikit-hep/awkward-array

Answer 4

Answer by JustinTime

I know this is a really old post and that there may be a better way of doing this, but why not just use append for such an operation:

import numpy as np
a = np.ones((3,))
b = np.ones((2,))
c = np.append(a, b)
print(c)

Output:

[1. 1. 1. 1. 1.]
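
One caveat worth checking before relying on this: without an axis argument, np.append flattens its inputs and joins them end to end, so the result is a single 1-D array rather than two columns. A short sketch comparing it with the hstack call from the question:

```python
import numpy as np

a = np.ones((3,))
b = np.ones((2,))

c = np.append(a, b)

# For 1-D inputs this is the same end-to-end concatenation hstack performs.
same_as_hstack = np.array_equal(c, np.hstack((a, b)))

print(c.shape)
print(same_as_hstack)
```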
