Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/39618678/

Date: 2020-09-14 02:03:49  Source: igfitidea

Pandas convert columns type from list to np.array

Tags: python, pandas, numpy, dataframe, casting

Asked by LeoCella

I'm trying to apply a function to a pandas DataFrame. The function requires two np.array inputs and fits them using a well-defined model.

The problem is that I can't apply this function directly to the selected columns, because their "rows" contain lists read from a JSON file, not np.array objects.
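To make the situation concrete, here is a minimal sketch. The column names `col1` and `col2` come from the question; the values are illustrative and only stand in for whatever the JSON file contains. It shows that such a column has object dtype and that each cell is a plain Python list.

```python
import pandas as pd

# Illustrative data standing in for what was read from the JSON file.
train_df = pd.DataFrame({
    "col1": [[1, 2, 3], [0, 0, 0]],
    "col2": [[4, 5, 6], [1, 2, 3]],
})

# pandas stores the lists as generic Python objects, so the dtype is object.
col_dtype = train_df["col1"].dtype

# Each cell is a plain list, not a numpy.ndarray.
cell_type = type(train_df["col1"].iloc[0])

print(col_dtype, cell_type)
```

Because the column dtype describes the container and not the cells, changing it would not by itself turn each list into an array; the conversion has to happen per element.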

Now, I've tried different solutions:

#Here is where I discover the problem

train_df['result'] = train_df.apply(my_function(train_df['col1'],train_df['col2']))

#so I've tried to cast the Series before passing them to the function in both these ways:

X_col1_casted = train_df['col1'].dtype(np.array)
X_col2_casted = train_df['col2'].dtype(np.array)

doesn't work.

X_col1_casted = train_df['col1'].astype(np.array)
X_col2_casted = train_df['col2'].astype(np.array)

doesn't work.

X_col1_casted = train_df['col1'].dtype(np.array)
X_col2_casted = train_df['col2'].dtype(np.array)

doesn't work.

What I'm thinking of doing now is a long procedure like this:

Starting from the uncast column Series, convert them with list(), iterate over them, apply the function to the individual elements after converting each one with np.array(), and append the results to a temporary list. Once done, I will convert this list into a new column. (Clearly, I don't know if it will work.)
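The procedure described above could be sketched roughly as follows. This is only an illustration under assumptions: `my_function` here is a hypothetical stand-in (an element-wise sum) for the real model-fitting function, and the data is made up.

```python
import numpy as np
import pandas as pd

def my_function(a, b):
    # Hypothetical stand-in for the real function: it relies on
    # ndarray arithmetic, so it needs arrays rather than lists.
    return a + b

# Illustrative data standing in for the JSON contents.
train_df = pd.DataFrame({
    "col1": [[1, 2, 3], [0, 0, 0]],
    "col2": [[4, 5, 6], [1, 2, 3]],
})

# Walk both columns in parallel, convert each pair of lists to arrays,
# call the function, and collect the results in a temporary list.
results = []
for left, right in zip(train_df["col1"].tolist(), train_df["col2"].tolist()):
    results.append(my_function(np.array(left), np.array(right)))

# Turn the temporary list into a new column, keeping the original index.
train_df["result"] = pd.Series(results, index=train_df.index)
```

The loop leaves the original columns untouched, since the conversion to arrays only happens on the values passed to the function.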

Does anyone know how to help me?

EDIT: I'm adding an example to be clear:

The function assumes it receives two np.arrays as input. At the moment it gets two lists, since they are retrieved from a JSON file. The situation is this:

col1        col2    result
[1,2,3]     [4,5,6]  [5,7,9]
[0,0,0]     [1,2,3]  [1,2,3]

Clearly the function is not a sum but a function of my own. For the moment, assume that this sum only works on arrays and not on lists. What should I do?

Thanks in advance

Answered by Nickil Maveli

Use apply to convert each element to its equivalent array:

df['col1'] = df['col1'].apply(lambda x: np.array(x))

type(df['col1'].iloc[0])
numpy.ndarray
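Building on this, one possible way to cover the two-column case from the question is to convert both columns and then call the function row by row. This is a sketch under assumptions, not part of the original answer: `my_function` is a hypothetical stand-in (the element-wise sum from the question's example table), and `col2` is added to the sample data for illustration.

```python
import numpy as np
import pandas as pd

def my_function(a, b):
    # Hypothetical stand-in for the asker's function; the element-wise
    # sum mirrors the example table in the question.
    return a + b

df = pd.DataFrame({
    "col1": [[1, 2, 3], [0, 0, 0]],
    "col2": [[4, 5, 6], [1, 2, 3]],
})

# Convert every list in both columns to a numpy.ndarray.
for col in ("col1", "col2"):
    df[col] = df[col].apply(lambda x: np.array(x))

# axis=1 passes one row at a time, so the function sees one pair of arrays.
df["result"] = df.apply(
    lambda row: my_function(row["col1"], row["col2"]), axis=1
)
```

With `axis=1` the function is called once per row, so this is a convenience rather than a vectorized operation; for large frames a plain loop over `zip(df["col1"], df["col2"])` may be just as fast.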

Data:

df = pd.DataFrame({'col1': [[1,2,3],[0,0,0]]})
df
