Python Pandas: Multiple columns into one column

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/23410083/

Time: 2020-08-19 02:56:08  Source: igfitidea

Pandas: Multiple columns into one column

Tags: python, pandas

Asked by user2929063

I have the following data (2 columns, 4 rows):

Column 1: A, B, C, D

Column 2: E, F, G, H

I am attempting to combine the columns into one column to look like this (1 column, 8 rows):

Column 3: A, B, C, D, E, F, G, H

I am using pandas DataFrame and have tried using different functions with no success (append, concat, etc.). Any help would be most appreciated!

Accepted answer by EdChum

Update

pandas has a built-in method for this, stack, which does what you want; see the other answer.

This was my first answer, written many years ago before I knew about stack:

In [227]:

df = pd.DataFrame({'Column 1':['A', 'B', 'C', 'D'],'Column 2':['E', 'F', 'G', 'H']})
df
Out[227]:
  Column 1 Column 2
0        A        E
1        B        F
2        C        G
3        D        H

[4 rows x 2 columns]

In [228]:

df['Column 1'].append(df['Column 2']).reset_index(drop=True)
Out[228]:
0    A
1    B
2    C
3    D
4    E
5    F
6    G
7    H
dtype: object
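
Note that Series.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above raises AttributeError on current versions. A minimal sketch of the same idea using pd.concat, assuming the same two-column frame (the name combined is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

# Concatenate the two columns end to end; ignore_index=True discards the
# original row labels, which replaces the reset_index(drop=True) step
combined = pd.concat([df['Column 1'], df['Column 2']], ignore_index=True)
print(combined.tolist())
```

Passing a list of columns also scales to more than two columns, for example pd.concat([df[c] for c in df.columns], ignore_index=True).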

Answer by mechanical_meat

What you appear to be asking is simply for help on creating another view of your data. If there is no reason those data are in two columns in the first place then just create one column. If however you need to combine them for presentation in some other tool you can do something like:

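import itertools as it
import pandas as pd
df = pd.DataFrame({1:['a','b','c','d'],2:['e','f','g','h']})
# df.values is iterated row by row, so chain yields a, e, b, f, ...
# sorted() then puts the values in alphabetical order
sorted(it.chain(*df.values))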
# -> ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
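
The result above is in a-to-h order only because sorted() orders the values alphabetically. If the data are not already sorted within and across columns, a variation that chains the columns themselves keeps the original order. This is a sketch with deliberately unsorted, made-up values, and flat is an illustrative name:

```python
import itertools as it
import pandas as pd

# Made-up values, intentionally not in alphabetical order
df = pd.DataFrame({1: ['d', 'c', 'b', 'a'], 2: ['h', 'g', 'f', 'e']})

# Chain the columns one after another; nothing is sorted, so each
# column's values keep their original top-to-bottom order
flat = list(it.chain.from_iterable(df[col] for col in df.columns))
print(flat)
```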

Answer by Zero

You can flatten the values in column-major order using ravel, which is much faster.

In [1238]: df
Out[1238]:
  Column 1 Column 2
0        A        E
1        B        F
2        C        G
3        D        H

In [1239]: pd.Series(df.values.ravel('F'))
Out[1239]:
0    A
1    B
2    C
3    D
4    E
5    F
6    G
7    H
dtype: object
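
The 'F' argument is what makes this work: it asks NumPy to read the array in column-major (Fortran-style) order, whereas the default 'C' order reads row by row. A small sketch on a plain NumPy array standing in for df.values shows the difference (row_major and col_major are illustrative names):

```python
import numpy as np

# Same layout as df.values for the example frame: 4 rows, 2 columns
arr = np.array([['A', 'E'],
                ['B', 'F'],
                ['C', 'G'],
                ['D', 'H']])

row_major = arr.ravel()      # default order='C': walk each row in turn
col_major = arr.ravel('F')   # order='F': walk each column in turn

print(row_major.tolist())
print(col_major.tolist())
```

Only the column-major result matches the layout requested in the question.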


Details

Medium

In [1245]: df.shape
Out[1245]: (4000, 2)

In [1246]: %timeit pd.Series(df.values.ravel('F'))
10000 loops, best of 3: 86.2 μs per loop

In [1247]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
1000 loops, best of 3: 816 μs per loop

Large

In [1249]: df.shape
Out[1249]: (40000, 2)

In [1250]: %timeit pd.Series(df.values.ravel('F'))
10000 loops, best of 3: 87.5 μs per loop

In [1251]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
100 loops, best of 3: 1.72 ms per loop
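
To repeat a comparison like this outside IPython, something along the following lines should work. It is only a sketch: the answer does not show how the larger frames were built, so the random integer data here is an assumption, and pd.concat stands in for Series.append, which no longer exists in pandas 2.0 and later. The helper names with_ravel and with_concat are made up, and absolute timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test data: two integer columns of 40,000 rows each
rng = np.random.default_rng(0)
df = pd.DataFrame({'Column 1': rng.integers(0, 100, 40000),
                   'Column 2': rng.integers(0, 100, 40000)})

def with_ravel():
    # Flatten the underlying array in column-major order
    return pd.Series(df.values.ravel('F'))

def with_concat():
    # Concatenate the columns end to end (modern stand-in for append)
    return pd.concat([df['Column 1'], df['Column 2']], ignore_index=True)

# Report the average time per call over 100 runs of each approach
for fn in (with_ravel, with_concat):
    seconds = timeit.timeit(fn, number=100) / 100
    print(f"{fn.__name__}: {seconds * 1e6:.1f} µs per call")
```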

Answer by Nickpick

The trick is to use stack()

df.stack().reset_index()

   level_0   level_1  0
0        0  Column 1  A
1        0  Column 2  E
2        1  Column 1  B
3        1  Column 2  F
4        2  Column 1  C
5        2  Column 2  G
6        3  Column 1  D
7        3  Column 2  H
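
Note that stack() walks the frame row by row, so the values come out interleaved (A, E, B, F, ...) rather than in the A-to-H order the question asks for. If column order matters, unstack() or melt() on the same frame gives it. The sketch below assumes the example frame from the question, and the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

# stack(): row by row, so the two columns are interleaved
interleaved = df.stack().reset_index(drop=True)

# unstack() on a frame with a flat index returns a Series ordered column by column
by_column = df.unstack().reset_index(drop=True)

# melt() also lists all of Column 1 first, then all of Column 2
melted = df.melt(value_name='Column 3')['Column 3']

print(interleaved.tolist())
print(by_column.tolist())
print(melted.tolist())
```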