pandas: convert a numpy array of arrays into a 2d array

Question

Asked by Nate Stemen

I have a pandas Series, features, that has the following values (features.values):

array([array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
       array([0, 0, 0, ..., 0, 0, 0]), ...,
       array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
       array([0, 0, 0, ..., 0, 0, 0])], dtype=object)

Now I really want this to be recognized as a matrix, but if I check the shape I get

>>> features.values.shape
(10000,)

rather than (10000, 3000), which is what I would expect.

How can I get this to be recognized as a 2d array rather than a 1d array with arrays as values? Also, why is it not automatically detected as a 2d array?

Answer 1

Answered by hpaulj

In response to your comment question, let's compare two ways of creating an array.

First, make an array from a list of arrays (all the same length):

In [302]: arr = np.array([np.arange(3), np.arange(1,4), np.arange(10,13)])
In [303]: arr
Out[303]: 
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])

The result is a 2d array of numbers.

If instead we make an object dtype array and fill it with arrays:

In [304]: arr = np.empty(3,object)
In [305]: arr[:] = [np.arange(3), np.arange(1,4), np.arange(10,13)]
In [306]: arr
Out[306]: 
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)

Notice that this display is like yours. This is, by design, a 1d array. Like a list, it contains pointers to arrays elsewhere in memory. Notice that it requires an extra construction step. The default behavior of np.array is to create a multidimensional array where it can.
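
To make the "1d array of pointers" point concrete, here is a minimal sketch that inspects such an object array. The names rows and obj_arr are illustrative only and do not come from the original session.

```python
import numpy as np

# Three separate 1d arrays, all of the same length.
rows = [np.arange(3), np.arange(1, 4), np.arange(10, 13)]

# Preallocate a 1d object array and store one array in each slot.
obj_arr = np.empty(3, dtype=object)
for i, row in enumerate(rows):
    obj_arr[i] = row

print(obj_arr.shape)      # (3,)  -> one axis, three slots
print(obj_arr.dtype)      # object
print(type(obj_arr[0]))   # <class 'numpy.ndarray'>
print(obj_arr[0].shape)   # (3,)  -> each slot holds its own 1d array
```

The outer array only knows that it holds three Python objects. The length of each inner array is a property of that inner array, which is why the outer shape is (3,) rather than (3, 3).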

It takes extra effort to get around that. Likewise, it takes some extra effort to undo that and create the 2d numeric array.

Simply calling np.array on it does not change the structure.

In [307]: np.array(arr)
Out[307]: 
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)

stack does change it to 2d. stack treats it as a list of arrays, which it joins on a new axis.

In [308]: np.stack(arr)
Out[308]: 
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])
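
Applied to the Series from the question, the same idea might look like the sketch below. The features Series here is a small hypothetical stand-in (4 rows of length 5 instead of 10000 by 3000), built only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `features` Series:
# every element is a 1d integer array of the same length.
features = pd.Series([np.zeros(5, dtype=int) for _ in range(4)])

print(features.values.shape)   # (4,)  -> 1d object array of arrays

# np.stack joins the inner arrays along a new leading axis.
matrix = np.stack(features.values)
print(matrix.shape)            # (4, 5)

# Alternative: convert to a plain list of arrays, then let np.array build the 2d array.
matrix_alt = np.array(features.values.tolist())
print(np.array_equal(matrix, matrix_alt))   # True
```

Both routes rely on every inner array having the same length. np.stack raises a ValueError if the shapes differ.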
