How do I stack vectors of different lengths in NumPy?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/14916407/
Asked by mac389
How do I stack, column-wise, n vectors of shape (x,), where x could be any number?
For example,
from numpy import *
a = ones((3,))
b = ones((2,))
c = vstack((a,b)) # <-- gives an error
c = vstack((a[:,newaxis],b[:,newaxis])) #<-- also gives an error
hstack works fine but concatenates along the wrong dimension.
Accepted answer by Fred Foo
Short answer: you can't. NumPy does not support jagged arrays natively.
Long answer:
>>> a = ones((3,))
>>> b = ones((2,))
>>> c = array([a, b], dtype=object)  # recent NumPy versions require dtype=object for ragged input
>>> c
array([array([1., 1., 1.]), array([1., 1.])], dtype=object)
gives an array that may or may not behave as you expect. E.g. basic methods like sum or reshape do not work on it the way they would on a regular 2-D array, and you should treat it much as you'd treat the ordinary Python list [a, b] (iterate over it to perform operations instead of using vectorized idioms).
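As a minimal sketch of what "treat it like a list" means in practice, the snippet below builds the object array explicitly and computes per-vector results by iterating. The names row_sums and row_lengths are only illustrative and do not come from the original answer.

```python
import numpy as np

a = np.ones((3,))
b = np.ones((2,))

# Build a 1-D container of dtype=object and fill it element by element,
# so each slot holds an ordinary 1-D float array of its own length.
c = np.empty(2, dtype=object)
c[0], c[1] = a, b

# Iterate as you would over a plain Python list.
row_sums = [float(row.sum()) for row in c]
row_lengths = [len(row) for row in c]
print(row_sums)
print(row_lengths)
```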
Several possible workarounds exist; the easiest is to coerce a and b to a common length, perhaps using masked arrays or NaN to signal that some indices are invalid in some rows. E.g. here's b as a masked array:
>>> ma.array(np.resize(b, a.shape[0]), mask=[False, False, True])
masked_array(data = [1.0 1.0 --],
             mask = [False False  True],
       fill_value = 1e+20)
This can be stacked with a as follows:
>>> ma.vstack([a, ma.array(np.resize(b, a.shape[0]), mask=[False, False, True])])
masked_array(data =
[[1.0 1.0 1.0]
[1.0 1.0 --]],
mask =
[[False False False]
[False False True]],
fill_value = 1e+20)
(For some purposes, scipy.sparse may also be interesting.)
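The masked-array workaround can be generalized to any number of vectors. The helper below, stack_masked, is a hypothetical name introduced here for illustration (it is not a NumPy function); it is a sketch that assumes 1-D numeric input and at least one vector.

```python
import numpy as np

def stack_masked(vectors):
    """Stack 1-D vectors as rows, masking the padded tail of shorter ones."""
    vectors = [np.asarray(v, dtype=float) for v in vectors]
    width = max(len(v) for v in vectors)
    data = np.zeros((len(vectors), width))
    mask = np.ones((len(vectors), width), dtype=bool)  # True means "invalid"
    for i, v in enumerate(vectors):
        data[i, :len(v)] = v      # copy the real values
        mask[i, :len(v)] = False  # and mark them as valid
    return np.ma.array(data, mask=mask)

stacked = stack_masked([np.ones(3), np.ones(2)])
print(stacked.shape)
print(stacked.sum(axis=1))  # masked entries are ignored by reductions
```

Padding with zeros rather than np.resize avoids repeating real data into the padded slots, though either choice is hidden by the mask.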
Answer by Vincenzooo
In general, there is an ambiguity in putting together arrays of different lengths because the alignment of the data might matter. Pandas has several advanced solutions to deal with that, e.g. merging Series into DataFrames.
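One way the pandas route might look, as a sketch: wrapping each vector in a Series gives it a default integer index, and the DataFrame constructor aligns the columns on those indices, filling the gaps with NaN. The column names "a" and "b" are arbitrary choices for this example.

```python
import numpy as np
import pandas as pd

a = np.ones((3,))
b = np.ones((2,))

# Each Series gets a default index 0..len-1; the DataFrame aligns on the
# union of those indices and fills missing positions with NaN.
df = pd.DataFrame({"a": pd.Series(a), "b": pd.Series(b)})
print(df)
```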
If you just want to populate columns starting from the first element, what I usually do is build a matrix and fill in its columns. Of course, you need to fill the empty spaces in the matrix with a null value (in this case np.nan):
import numpy as np

a = np.ones((3,))
b = np.ones((2,))
arraylist = [a, b]
# Allocate a (longest length, number of vectors) array pre-filled with NaN
outarr = np.full((max(len(v) for v in arraylist), len(arraylist)), np.nan)
for i, c in enumerate(arraylist):  # populate columns
    outarr[:len(c), i] = c
In [108]: outarr
Out[108]:
array([[ 1., 1.],
[ 1., 1.],
[ 1., nan]])
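Once the columns are padded with NaN, NaN-aware reductions can recover per-vector results. The sketch below rebuilds the same outarr by hand and assumes the original data contain no NaN of their own, since padding and real NaN values would otherwise be indistinguishable.

```python
import numpy as np

# Same padded layout as above: one column per original vector.
outarr = np.array([[1.0, 1.0],
                   [1.0, 1.0],
                   [1.0, np.nan]])

col_sums = np.nansum(outarr, axis=0)             # NaN padding is treated as 0
col_lengths = np.sum(~np.isnan(outarr), axis=0)  # count of real entries per column
print(col_sums)
print(col_lengths)
```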
Answer by j08lue
There is a new library for efficiently handling this type of array: https://github.com/scikit-hep/awkward-array
Answer by JustinTime
I know this is a really old post and that there may be a better way of doing this, but why not just use append for such an operation:
import numpy as np
a = np.ones((3,))
b = np.ones((2,))
c = np.append(a, b)
print(c)
Output:
[1. 1. 1. 1. 1.]
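Note that np.append on 1-D inputs produces a single flat vector, so the boundary between a and b is lost (the same result the question describes for hstack). If flat storage is acceptable, one possible sketch is to keep the lengths alongside the flat data and split on demand; the names lengths, split_points, and parts are illustrative only.

```python
import numpy as np

a = np.ones((3,))
b = np.ones((2,))

flat = np.append(a, b)                  # shape (5,), boundaries are lost
lengths = [len(a), len(b)]              # remember each vector's length
split_points = np.cumsum(lengths)[:-1]  # indices where each later vector starts
parts = np.split(flat, split_points)    # list holding the original vectors again
print([p.shape for p in parts])
```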

