Python: How to convert a NumPy array to a pandas DataFrame

Question

Asked by Yannick

I have a NumPy array that looks like this:

[400.31865662]
[401.18514808]
[404.84015554]
[405.14682194]
[405.67735105]
[273.90969447]
[274.0894528]

When I try to convert it to a pandas DataFrame with the following code:

y = pd.DataFrame(data)
print(y)

I get the following output when printing it. Why do I get all those zeros?

            0
0  400.318657
            0
0  401.185148
            0
0  404.840156
            0
0  405.146822
            0
0  405.677351
            0
0  273.909694
            0
0  274.089453

I would like to get a single-column DataFrame that looks like this:

400.31865662
401.18514808
404.84015554
405.14682194
405.67735105
273.90969447
274.0894528

Answer 1

Answered by Dani Mesejo

You could flatten the NumPy array:

import numpy as np
import pandas as pd

data = [[400.31865662],
        [401.18514808],
        [404.84015554],
        [405.14682194],
        [405.67735105],
        [273.90969447],
        [274.0894528]]

arr = np.array(data)

df = pd.DataFrame(data=arr.flatten())

print(df)

Output:

            0
0  400.318657
1  401.185148
2  404.840156
3  405.146822
4  405.677351
5  273.909694
6  274.089453
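
As a variation on the above, here is a small sketch that also names the column, so the header is not the default 0. The column name "value" is an arbitrary choice for illustration, and ravel() is used in place of flatten(); both give the same values here.

```python
import numpy as np
import pandas as pd

# A shortened version of the (n, 1) array from the answer above.
arr = np.array([[400.31865662], [401.18514808], [404.84015554]])

# ravel() returns a 1-D view where possible; flatten() always returns a copy.
# "value" is a made-up column name used only for this example.
df = pd.DataFrame({"value": arr.ravel()})
print(df)
```

Either call yields a one-dimensional array, which pandas turns into a single column.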

Answer 2

Answered by Yannick

I just figured out my mistake: data was a list of arrays:

[array([400.0290173]), array([400.02253235]), array([404.00252113]), array([403.99466754]), array([403.98681395]), array([271.97896036]), array([271.97110677])]

So I used np.vstack(data) to concatenate it:

conc = np.vstack(data)

[[400.0290173 ]
 [400.02253235]
 [404.00252113]
 [403.99466754]
 [403.98681395]
 [271.97896036]
 [271.97110677]]

Then I converted the concatenated array into a pandas DataFrame:

newdf = pd.DataFrame(conc)


            0
0  400.029017
1  400.022532
2  404.002521
3  403.994668
4  403.986814
5  271.978960
6  271.971107

Et voilà!
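
The repeated header in the question is consistent with each one-element array being converted to its own DataFrame, in which case the 0s are simply the default column label and the default row label. A minimal sketch, assuming data is a list of one-element arrays as described above (the list comprehension is a guess at how the original output arose, not the asker's actual code):

```python
import numpy as np
import pandas as pd

# Assumed input: a list of one-element arrays, shortened from the answer above.
data = [np.array([400.0290173]), np.array([400.02253235]), np.array([404.00252113])]

# Converting each array separately gives one 1x1 DataFrame per element,
# each with the default column label 0 and the default row label 0.
frames = [pd.DataFrame(item) for item in data]
print(frames[0])

# Stacking first gives a single (3, 1) array, hence one single-column DataFrame.
conc = np.vstack(data)
newdf = pd.DataFrame(conc)
print(newdf)
```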

Answer 3

Answered by akshayk07

There is another way that isn't mentioned in the other answers. If you have a NumPy array that is essentially a vector, i.e. a one-dimensional array with shape (n,), then you could do the following:

import numpy as np
import pandas as pd

# sample array
x = np.zeros(20)
# empty dataframe
df = pd.DataFrame()
# add the array to df as a column
df['column_name'] = x

This way you can add multiple arrays as separate columns.
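
For instance, here is a sketch with two made-up arrays of equal length (the names x and y and the column labels are illustrative, not from the answer). Passing a dict to the constructor builds an equivalent frame in one step:

```python
import numpy as np
import pandas as pd

# Two hypothetical 1-D arrays of the same length.
x = np.zeros(20)
y = np.ones(20)

# Column-by-column assignment, as described above.
df = pd.DataFrame()
df["x"] = x
df["y"] = y

# Alternative: build the same two columns directly from a dict.
same = pd.DataFrame({"x": x, "y": y})
print(df.head())
```

The arrays need to have the same length for either approach to work as shown.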

Answer 4

Answered by Nicolas Gervais

Since I assume many visitors to this post aren't here for the OP's specific and unreproducible issue, here's a general answer:

df = pd.DataFrame(array)

Here's an example. A strength of pandas is that its output is easy on the eye (like Excel), so it's important to use column names.

import numpy as np
import pandas as pd

array = np.random.rand(5, 5)

array([[0.723, 0.177, 0.659, 0.573, 0.476],
       [0.77 , 0.311, 0.533, 0.415, 0.552],
       [0.349, 0.768, 0.859, 0.273, 0.425],
       [0.367, 0.601, 0.875, 0.109, 0.398],
       [0.452, 0.836, 0.31 , 0.727, 0.303]])

columns = [f'col_{num}' for num in range(5)]
index = [f'index_{num}' for num in range(5)]

Here's where the magic happens:

df = pd.DataFrame(array, columns=columns, index=index)

            col_0     col_1     col_2     col_3     col_4
index_0  0.722791  0.177427  0.659204  0.572826  0.476485
index_1  0.770118  0.311444  0.532899  0.415371  0.551828
index_2  0.348923  0.768362  0.858841  0.273221  0.424684
index_3  0.366940  0.600784  0.875214  0.108818  0.397671
index_4  0.451682  0.836315  0.310480  0.727409  0.302597
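
One practical benefit of labels is that values can be looked up by name rather than by position. The sketch below uses a seeded generator so it is reproducible; the seed and the particular labels looked up are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so repeated runs produce the same array.
rng = np.random.default_rng(0)
array = rng.random((5, 5))

columns = [f"col_{num}" for num in range(5)]
index = [f"index_{num}" for num in range(5)]
df = pd.DataFrame(array, columns=columns, index=index)

# Label-based lookup returns the same value as positional indexing on the array.
value = df.loc["index_2", "col_3"]
print(value == array[2, 3])
```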
