Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/44424594/

Time: 2020-09-14 03:44:53  Source: igfitidea

Converting numpy array into dataframe column?

Tags: python, pandas, numpy, dataframe

Asked by Jane Sully

How do I convert a numpy array into a dataframe column? Let's say I have created an empty dataframe, df, and I loop through code to create 5 numpy arrays. On each iteration of my for loop, I want to convert the numpy array created in that iteration into a column in my dataframe. Just to clarify, I do not want to create a new dataframe on every iteration of my loop; I only want to add a column to the existing one. The code below is sketchy and not syntactically correct, but it illustrates my point.

df = pd.dataframe()
for i in range(5):
   arr = create_numpy_arr(blah) # creates a numpy array
   df[i] = # convert arr to df column

Answered by Casey Van Buren

This will work:

import pandas as pd
import numpy as np

df = pd.DataFrame()

for i in range(5):
    arr = np.random.rand(10)
    df[i] = arr

A simpler way may be to use vectorization: generate all of the values as one 2-D array and pass it to the DataFrame constructor.

arr = np.random.rand(10, 5)
df = pd.DataFrame(arr)
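
When the arrays really do come out of a loop one at a time, the same idea can still apply: collect them first and build the DataFrame once. The following is only a sketch under that assumption; make_array is a hypothetical stand-in for the asker's create_numpy_arr, not something from the original posts.

```python
import numpy as np
import pandas as pd


def make_array(i, n_rows=10):
    # Hypothetical stand-in for the asker's create_numpy_arr:
    # returns a 1-D array of n_rows values, all equal to i.
    return np.full(n_rows, float(i))


# Collect the five 1-D arrays produced by the loop.
arrays = [make_array(i) for i in range(5)]

# column_stack places each 1-D array in its own column, giving shape (10, 5).
stacked = np.column_stack(arrays)

# Build the DataFrame in a single call instead of growing it column by column.
df = pd.DataFrame(stacked, columns=range(5))
```

This assumes every array has the same length; np.column_stack raises an error otherwise.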

Answered by user1596433

This is the simplest way:

df['column_name'] = pd.Series(arr)
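
One caveat worth knowing: a Series is aligned to the DataFrame's index on assignment, while a plain array is assigned by position. The sketch below uses made-up data (the labels "x", "y", "z" and the column names are illustrative only) to show the difference when df does not have the default 0..n-1 index.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30])
df = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])

# A Series built from arr gets the default index 0, 1, 2. None of those
# labels match "x", "y", "z", so alignment fills the new column with NaN.
df["via_series"] = pd.Series(arr)

# Assigning the array directly is positional; only the length has to match.
df["via_array"] = arr

# Giving the Series the DataFrame's own index keeps the values aligned.
df["via_indexed_series"] = pd.Series(arr, index=df.index)
```

With an empty DataFrame or one that has the default integer index, the pd.Series(arr) form and the plain-array form give the same result.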

Answered by Julio Cezar Silva

Since you want to create a column and not an entire DataFrame from your array, you could do

import pandas as pd
import numpy as np

column_series = pd.Series(np.array([0, 1, 2, 3]))

To assign that column to an existing DataFrame:

df = df.assign(column_name=column_series)

The above will add a column named column_name to df.
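
Note that assign returns a new DataFrame rather than modifying the original in place, which is why the result is bound back to df. A small self-contained sketch with made-up data (the column "a" and the name new_df are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4]})
column_series = pd.Series(np.array([0, 1, 2, 3]))

# assign leaves df untouched and returns a copy with the extra column.
new_df = df.assign(column_name=column_series)

# The keyword also accepts a plain numpy array of matching length.
newer_df = new_df.assign(doubled=np.array([0, 2, 4, 6]))
```

Here df still has only the column "a", while new_df and newer_df carry the added columns.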

If, instead, you don't have any DataFrame to assign those values to, you can pass a dict to the constructor to create a named column from your numpy array:

df = pd.DataFrame({ 'column_name': np.array([0, 1, 2, 3]) })
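
The dict form extends naturally to several named columns, which fits the loop in the question. This is a sketch only: the col_0 ... col_4 names and the np.arange data are placeholders, not part of the original answer.

```python
import numpy as np
import pandas as pd

columns = {}
for i in range(5):
    # Placeholder data: each array is [0, 1, 2, 3] scaled by i.
    columns[f"col_{i}"] = np.arange(4) * i

# Each dict key becomes a column name; all arrays must share one length.
df = pd.DataFrame(columns)
```

Building the dict first and calling the constructor once avoids repeatedly inserting columns into an existing DataFrame.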