pandas: converting a numpy array into a dataframe column?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/44424594/
Converting numpy array into dataframe column?
Asked by Jane Sully
How do I convert a numpy array into a dataframe column? Let's say I have created an empty dataframe, df, and I loop through code to create 5 numpy arrays. On each iteration of my for loop, I want to convert the numpy array I have created in that iteration into a column in my dataframe. Just to clarify, I do not want to create a new dataframe on every iteration of my loop; I only want to add a column to the existing one. The code I have below is sketchy and not syntactically correct, but it illustrates my point.
df = pd.DataFrame()
for i in range(5):
    arr = create_numpy_arr(blah)  # creates a numpy array
    df[i] = # convert arr to df column
Answered by Casey Van Buren
That will work:
import pandas as pd
import numpy as np
df = pd.DataFrame()
for i in range(5):
    arr = np.random.rand(10)
    df[i] = arr
A simpler way may be to use vectorization: create all the columns as one 2-D array and pass it to the constructor.
arr = np.random.rand(10, 5)
df = pd.DataFrame(arr)
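If the arrays really do come out of a loop one at a time, one option is to collect them first and build the frame in a single step. This is only a minimal sketch, assuming every array has the same length; `make_array` is a hypothetical stand-in for whatever produces each array and is not part of the original answer.

```python
import numpy as np
import pandas as pd

def make_array(i, n=10):
    # Hypothetical stand-in for the code that produces each array.
    return np.full(n, float(i))

# Collect the 1-D arrays, then stack them side by side as columns.
arrays = [make_array(i) for i in range(5)]
df = pd.DataFrame(np.column_stack(arrays))

print(df.shape)
```

Each array becomes one column, labelled by its position in the list.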
Answered by user1596433
This is the simplest way:
df['column_name'] = pd.Series(arr)
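One thing to watch with this approach: assigning a Series aligns on index labels, while assigning the raw array is positional. The sketch below uses made-up data (the column names and index values are assumptions for illustration) to show the difference when the existing frame does not have the default 0, 1, 2, ... index.

```python
import numpy as np
import pandas as pd

arr = np.array([1.0, 2.0, 3.0])

# A DataFrame whose index is not the default 0, 1, 2.
df = pd.DataFrame({"x": [7, 8, 9]}, index=[10, 11, 12])

# The Series gets labels 0..2, none of which exist in df,
# so label alignment leaves every value as NaN.
df["from_series"] = pd.Series(arr)

# Assigning the raw array is positional, so the values land row by row.
df["from_array"] = arr

# Giving the Series the frame's own index also avoids the mismatch.
df["aligned"] = pd.Series(arr, index=df.index)

print(df)
```

On an empty frame or one with the default index, both forms give the same result.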
Answered by Julio Cezar Silva
Since you want to create a column and not an entire DataFrame from your array, you could do:
import pandas as pd
import numpy as np
column_series = pd.Series(np.array([0, 1, 2, 3]))
To assign that column to an existing DataFrame:
df = df.assign(column_name=column_series)
The above will add a column named column_name to df.
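Note that assign returns a new DataFrame rather than modifying the one it is called on, which is why the result is bound back to df above. A small sketch with made-up data (the column `a` is an assumption for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30, 40]})
column_series = pd.Series(np.array([0, 1, 2, 3]))

# assign builds and returns a new frame; df keeps its original columns.
result = df.assign(column_name=column_series)

print(list(df.columns))
print(list(result.columns))
```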
If, instead, you don't have any DataFrame to assign those values to, you can pass a dict to the constructor to create a named column from your numpy array:
df = pd.DataFrame({ 'column_name': np.array([0, 1, 2, 3]) })
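The same dict form can hold several arrays, which may suit the original five-array case when named columns are wanted. A hedged sketch: the `col_{i}` names and the array contents are made up for illustration, and the arrays are assumed to share one length.

```python
import numpy as np
import pandas as pd

# Build one named array per iteration, then construct the frame once.
columns = {f"col_{i}": np.arange(4) * i for i in range(5)}
df = pd.DataFrame(columns)

print(df)
```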