Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow address: http://stackoverflow.com/questions/51870724/

Date: 2020-09-14 05:56:50  Source: igfitidea

How to concatenate two columns containing list (series) in Pandas

python, python-2.7, pandas, numpy, concatenation

Asked by twfx

I'd like to concatenate two columns in pandas. Each cell in both columns holds a list of four floating-point values (a 1x4 vector). I'd like to merge the two columns so that each row of the output is a single 1x8 vector. A snippet of the dataframe is shown below:

ue,bs
"[1.27932459e-01 7.83234197e-02 3.24789420e-02 4.34971932e-01]","[2.97806183e-01 2.32453145e-01 3.10236304e-01 1.69975788e-02]"
"[0.05627587 0.4113416  0.02160842 0.20420576]","[1.64862491e-01 1.35556330e-01 2.59050065e-02 1.42498115e-02]"

To concatenate two columns, I do the following:

df['ue_bs'] = zip(df_join['ue'], df_join['bs'])

This gives me a new column 'ue_bs'. Its first row, df['ue_bs'][0], contains the following:

(array([1.27932459e-01, 7.83234197e-02, 3.24789420e-02, 4.34971932e-01]),
 array([2.97806183e-01, 2.32453145e-01, 3.10236304e-01, 1.69975788e-02]))

However, these are still two separate arrays. To merge them, I did the following:

a = df['ue_bs'][0]
np.concatenate((a[0], a[1]), axis=0)

This gave me:

array([1.27932459e-01, 7.83234197e-02, 3.24789420e-02, 4.34971932e-01,
   2.97806183e-01, 2.32453145e-01, 3.10236304e-01, 1.69975788e-02])

Is there a neat way of doing this in a single line of code, instead of having to loop through df['ue_bs'] and call np.concatenate() on each row?

Accepted answer by Shaido - Reinstate Monica

The easiest way to concatenate two lists in Python is to use +. The same holds when concatenating pandas columns whose cells are lists. You can simply do:

df['ue_bs'] = df['ue'] + df['bs']
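
As a minimal sketch of the line above, assuming every cell already holds a plain Python list (the sample values here are made up for illustration and are not the asker's data):

```python
import pandas as pd

# Hypothetical data: each cell is a plain Python list of four floats.
df = pd.DataFrame({
    "ue": [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]],
    "bs": [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
})

# Adding two object-dtype Series applies + to each pair of cells,
# and + on two Python lists concatenates them.
df["ue_bs"] = df["ue"] + df["bs"]

print(df.loc[0, "ue_bs"])
# [0.1, 0.2, 0.3, 0.4, 1.0, 2.0, 3.0, 4.0]
```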


If the cells are NumPy arrays, + performs element-wise addition rather than concatenation, so first convert them to normal Python lists before concatenating:

df['ue_bs'] = df['ue'].apply(lambda x: x.tolist()) + df['bs'].apply(lambda x: x.tolist())
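
The difference between the two cases can be sketched as follows, assuming each cell is a NumPy array of length four (the values are illustrative, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical data: each cell is a NumPy array of four floats.
df = pd.DataFrame({
    "ue": [np.array([0.1, 0.2, 0.3, 0.4]), np.array([0.5, 0.6, 0.7, 0.8])],
    "bs": [np.array([1.0, 2.0, 3.0, 4.0]), np.array([5.0, 6.0, 7.0, 8.0])],
})

# With array cells, + adds the values pairwise: each cell still has 4 elements.
summed = df["ue"] + df["bs"]
print(len(summed[0]))  # 4

# Converting each cell to a list first turns + into concatenation.
df["ue_bs"] = df["ue"].apply(lambda x: x.tolist()) + df["bs"].apply(lambda x: x.tolist())
print(len(df.loc[0, "ue_bs"]))  # 8
```

Note that the resulting cells are Python lists, not arrays; wrap them in np.array if array cells are needed downstream.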

Answer by jezrael

Build a NumPy array from the two columns, then join its two halves with numpy.hstack:

a = np.array(df[['ue','bs']].values.tolist())
df['ue_bs'] = np.hstack((a[:, 0], a[:, 1])).tolist()

print (df.loc[0, 'ue_bs'])
[0.127932459, 0.0783234197, 0.032478942, 0.434971932, 
 0.297806183, 0.232453145, 0.310236304, 0.0169975788]
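
To make the intermediate shapes in this approach explicit, here is a small sketch on made-up data. It assumes every cell in both columns has the same length; otherwise np.array cannot build a regular numeric array.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two rows, each cell a list of four floats.
df = pd.DataFrame({
    "ue": [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]],
    "bs": [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
})

# Nested lists become one numeric array: (rows, columns, elements per cell).
a = np.array(df[["ue", "bs"]].values.tolist())
print(a.shape)  # (2, 2, 4)

# a[:, 0] and a[:, 1] are both (rows, 4); hstack joins them side by side.
joined = np.hstack((a[:, 0], a[:, 1]))
print(joined.shape)  # (2, 8)

# tolist() gives one 8-element list per row, stored back as a column.
df["ue_bs"] = joined.tolist()
print(df.loc[0, "ue_bs"])
# [0.1, 0.2, 0.3, 0.4, 1.0, 2.0, 3.0, 4.0]
```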