Original question: http://stackoverflow.com/questions/51870724/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
How to concatenate two columns containing list (series) in Pandas
Asked by twfx
I'd like to concatenate two columns in pandas. Each cell holds a vector of four floats (1x4), and I'd like to merge the two columns so that each row of the output is a single 1x8 vector. A snippet of the dataframe is shown below:
ue,bs
"[1.27932459e-01 7.83234197e-02 3.24789420e-02 4.34971932e-01]","[2.97806183e-01 2.32453145e-01 3.10236304e-01 1.69975788e-02]"
"[0.05627587 0.4113416 0.02160842 0.20420576]","[1.64862491e-01 1.35556330e-01 2.59050065e-02 1.42498115e-02]"
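The snippet above looks like CSV text in which each vector is stored as a bracketed, space-separated string. Assuming that is how the data is stored, here is a minimal sketch of one way to load it so that each cell becomes a numpy array. The helper name parse_vector is hypothetical and not part of the original question.

```python
import io

import numpy as np
import pandas as pd

# The two rows shown in the question, kept as raw CSV text.
csv_text = '''ue,bs
"[1.27932459e-01 7.83234197e-02 3.24789420e-02 4.34971932e-01]","[2.97806183e-01 2.32453145e-01 3.10236304e-01 1.69975788e-02]"
"[0.05627587 0.4113416 0.02160842 0.20420576]","[1.64862491e-01 1.35556330e-01 2.59050065e-02 1.42498115e-02]"
'''


def parse_vector(text):
    """Hypothetical helper: turn a string like '[a b c d]' into a float array."""
    return np.array(text.strip("[]").split(), dtype=float)


# Read the cells as plain strings, then parse each column into arrays.
df = pd.read_csv(io.StringIO(csv_text))
for col in ("ue", "bs"):
    df[col] = df[col].apply(parse_vector)

print(df.loc[0, "ue"])
```

If the frame already holds arrays (as the output later in the question suggests), this step is not needed.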
To concatenate two columns, I do the following:
df['ue_bs'] = zip(df_join['ue'], df_join['bs'])
With this, I get a new column 'ue_bs'. The first row of df['ue_bs'] contains:
(array([1.27932459e-01, 7.83234197e-02, 3.24789420e-02, 4.34971932e-01]),
array([2.97806183e-01, 2.32453145e-01, 3.10236304e-01, 1.69975788e-02]))
However, they are still two arrays. In order to merge them, I did it as follows:
a = df['ue_bs'][0]
np.concatenate((a[0], a[1]), axis=0)
Then, I got
array([1.27932459e-01, 7.83234197e-02, 3.24789420e-02, 4.34971932e-01,
2.97806183e-01, 2.32453145e-01, 3.10236304e-01, 1.69975788e-02])
Is there a neat way to do this in a single line of code, instead of looping through df['ue_bs'] and calling np.concatenate() on each row?
Accepted answer by Shaido - Reinstate Monica
To concatenate two lists in Python, the easiest way is to use +. The same works when concatenating pandas columns whose cells are lists. You can simply do:
df['ue_bs'] = df['ue'] + df['bs']
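As a small illustration of the list case, the sketch below uses made-up toy values (not the question's data) in columns whose cells are plain Python lists. It shows + joining the lists row by row.

```python
import pandas as pd

# Toy frame: every cell is a plain Python list of four floats (illustrative values).
df = pd.DataFrame({
    "ue": [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]],
    "bs": [[1.1, 1.2, 1.3, 1.4], [1.5, 1.6, 1.7, 1.8]],
})

# For object columns of lists, + is applied per row and concatenates the lists.
df["ue_bs"] = df["ue"] + df["bs"]

print(df.loc[0, "ue_bs"])
```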
If the columns hold numpy arrays, + adds them element-wise instead of joining them, so first convert the arrays into plain Python lists before concatenating:
df['ue_bs'] = df['ue'].apply(lambda x: x.tolist()) + df['bs'].apply(lambda x: x.tolist())
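To make the difference concrete, here is a minimal sketch using the two rows from the question, assuming each cell is a numpy array. The column name ue_bs_arr and the row-wise np.concatenate variant are additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# The question's two rows, with each cell stored as a numpy array.
df = pd.DataFrame({
    "ue": [
        np.array([1.27932459e-01, 7.83234197e-02, 3.24789420e-02, 4.34971932e-01]),
        np.array([0.05627587, 0.4113416, 0.02160842, 0.20420576]),
    ],
    "bs": [
        np.array([2.97806183e-01, 2.32453145e-01, 3.10236304e-01, 1.69975788e-02]),
        np.array([1.64862491e-01, 1.35556330e-01, 2.59050065e-02, 1.42498115e-02]),
    ],
})

# With array cells, + sums element-wise: each result still has 4 values.
summed = df["ue"] + df["bs"]

# Converting to lists first makes + concatenate: 8 values per row.
df["ue_bs"] = df["ue"].apply(lambda x: x.tolist()) + df["bs"].apply(lambda x: x.tolist())

# Illustrative alternative that keeps numpy arrays in the cells.
df["ue_bs_arr"] = [np.concatenate(pair) for pair in zip(df["ue"], df["bs"])]

print(len(summed[0]), len(df.loc[0, "ue_bs"]), df.loc[0, "ue_bs_arr"].shape)
```

The list-based version yields cells that are Python lists, while the np.concatenate variant yields cells that are arrays. Which one is preferable depends on what the column is used for afterwards.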
Answer by jezrael
Build a numpy array from the two columns, then join the halves with numpy.hstack:
a = np.array(df[['ue','bs']].values.tolist())
df['ue_bs'] = np.hstack((a[:, 0], a[:, 1])).tolist()
print(df.loc[0, 'ue_bs'])
[0.127932459, 0.0783234197, 0.032478942, 0.434971932,
0.297806183, 0.232453145, 0.310236304, 0.0169975788]
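For reference, a self-contained sketch of this approach with the intermediate shapes spelled out, assuming every cell is a numpy array of length 4. The variable name joined is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# The question's two rows, with each cell stored as a numpy array.
df = pd.DataFrame({
    "ue": [
        np.array([1.27932459e-01, 7.83234197e-02, 3.24789420e-02, 4.34971932e-01]),
        np.array([0.05627587, 0.4113416, 0.02160842, 0.20420576]),
    ],
    "bs": [
        np.array([2.97806183e-01, 2.32453145e-01, 3.10236304e-01, 1.69975788e-02]),
        np.array([1.64862491e-01, 1.35556330e-01, 2.59050065e-02, 1.42498115e-02]),
    ],
})

# Stack all cells into one float array of shape (n_rows, 2, 4).
a = np.array(df[["ue", "bs"]].values.tolist())

# a[:, 0] and a[:, 1] each have shape (n_rows, 4); hstack joins them
# side by side into shape (n_rows, 8).
joined = np.hstack((a[:, 0], a[:, 1]))

# Store one 8-element list per row.
df["ue_bs"] = joined.tolist()

print(a.shape, joined.shape)
```

This relies on all vectors having the same length. With ragged cells, np.array would not produce a regular numeric array and the slicing would not work as shown.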