Pandas: Multiple columns into one column
Original source: http://stackoverflow.com/questions/23410083/
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Asked by user2929063
I have the following data (2 columns, 4 rows):
Column 1: A, B, C, D
Column 2: E, F, G, H
I am attempting to combine the columns into one column to look like this (1 column, 8 rows):
Column 3: A, B, C, D, E, F, G, H
I am using a pandas DataFrame and have tried different functions with no success (append, concat, etc.). Any help would be most appreciated!
Accepted answer by EdChum
Update
pandas has a built-in method for this, stack(), which does what you want; see the other answer.
This was my first answer, written many years ago before I knew about stack():
In [227]:
import pandas as pd
df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'], 'Column 2': ['E', 'F', 'G', 'H']})
df
Out[227]:
  Column 1 Column 2
0        A        E
1        B        F
2        C        G
3        D        H

[4 rows x 2 columns]
In [228]:
df['Column 1'].append(df['Column 2']).reset_index(drop=True)
Out[228]:
0 A
1 B
2 C
3 D
4 E
5 F
6 G
7 H
dtype: object
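Note that Series.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above raises an AttributeError on current versions. A minimal sketch of the same idea using pd.concat, assuming the df from the session above (the name combined is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

# Place the two Series end to end; ignore_index=True renumbers the result 0..7,
# which replaces the separate reset_index(drop=True) step.
combined = pd.concat([df['Column 1'], df['Column 2']], ignore_index=True)
print(combined.tolist())  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
```

Passing the columns as a list also extends naturally to more than two columns.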
Answer by mechanical_meat
What you appear to be asking for is simply another view of your data. If there is no reason for the data to be in two columns in the first place, just create one column. If, however, you need to combine them for presentation in some other tool, you can do something like:
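import itertools as it
import pandas as pd

df = pd.DataFrame({1: ['a', 'b', 'c', 'd'], 2: ['e', 'f', 'g', 'h']})
# df.values is iterated row by row; sorted() puts the letters back in a..h order
sorted(it.chain.from_iterable(df.values))
# -> ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']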
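The sorted() call above returns the expected order only because the sample letters happen to sort into column order; chaining df.values walks the frame row by row. A sketch that keeps column order without sorting, assuming the same df (flat is an illustrative name):

```python
import itertools as it
import pandas as pd

df = pd.DataFrame({1: ['a', 'b', 'c', 'd'], 2: ['e', 'f', 'g', 'h']})

# Iterating over the columns yields each column's values in turn,
# so no sorting is needed to get column 1 followed by column 2.
flat = list(it.chain.from_iterable(df[col] for col in df.columns))
print(flat)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```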
Answer by Zero
You can flatten the values in column-major order using ravel, which is much faster.
In [1238]: df
Out[1238]:
  Column 1 Column 2
0        A        E
1        B        F
2        C        G
3        D        H
In [1239]: pd.Series(df.values.ravel('F'))
Out[1239]:
0 A
1 B
2 C
3 D
4 E
5 F
6 G
7 H
dtype: object
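The 'F' argument asks NumPy for column-major (Fortran-style) order, which is what puts all of Column 1 before Column 2; the default 'C' order would walk the array row by row instead. A small sketch of the difference, assuming the same df and using to_numpy() in place of .values (by_column and by_row are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

arr = df.to_numpy()

# Column-major: read down each column before moving to the next one.
by_column = pd.Series(arr.ravel('F'))
# Row-major (the default): read across each row first.
by_row = pd.Series(arr.ravel('C'))

print(by_column.tolist())  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
print(by_row.tolist())     # ['A', 'E', 'B', 'F', 'C', 'G', 'D', 'H']
```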
Details
Medium
In [1245]: df.shape
Out[1245]: (4000, 2)
In [1246]: %timeit pd.Series(df.values.ravel('F'))
10000 loops, best of 3: 86.2 μs per loop
In [1247]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
1000 loops, best of 3: 816 μs per loop
Large
In [1249]: df.shape
Out[1249]: (40000, 2)
In [1250]: %timeit pd.Series(df.values.ravel('F'))
10000 loops, best of 3: 87.5 μs per loop
In [1251]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
100 loops, best of 3: 1.72 ms per loop
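The figures above come from the answerer's machine and an older pandas release. A rough sketch for repeating the comparison locally, assuming a frame built by repeating the sample rows (the original does not show how the larger frames were made) and using pd.concat in place of Series.append, which no longer exists in pandas 2.x. The function names are illustrative:

```python
import timeit
import pandas as pd

# Hypothetical test frame: the four sample rows repeated to 4000 rows.
df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'] * 1000,
                   'Column 2': ['E', 'F', 'G', 'H'] * 1000})

def with_ravel():
    # Flatten the underlying array in column-major order.
    return pd.Series(df.values.ravel('F'))

def with_concat():
    # Place the two columns end to end and renumber the index.
    return pd.concat([df['Column 1'], df['Column 2']], ignore_index=True)

# Both approaches should produce the same values before they are timed.
assert with_ravel().tolist() == with_concat().tolist()

for func in (with_ravel, with_concat):
    seconds = timeit.timeit(func, number=1000)
    print(f'{func.__name__}: {seconds / 1000 * 1e6:.1f} microseconds per call')
```

Absolute numbers will differ by machine and library version, so only the relative ordering is worth comparing.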
Answer by Nickpick
The trick is to use stack()
df.stack().reset_index()
   level_0   level_1  0
0        0  Column 1  A
1        0  Column 2  E
2        1  Column 1  B
3        1  Column 2  F
4        2  Column 1  C
5        2  Column 2  G
6        3  Column 1  D
7        3  Column 2  H
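Note that stack() walks the frame row by row, so the values come out interleaved (A, E, B, F, ...) rather than in the A to H order the question asks for. A sketch of two ways to get column-by-column order, assuming the same df; the name Column 3 is taken from the question, and the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

# stack() pairs up each row's values, giving A, E, B, F, ...
interleaved = df.stack().reset_index(drop=True)

# melt() unpivots one column at a time, giving A, B, C, D, E, F, G, H.
melted = df.melt(value_name='Column 3')['Column 3']

# Transposing first makes stack() walk the original columns in order.
stacked_by_column = df.T.stack().reset_index(drop=True)

print(interleaved.tolist())        # ['A', 'E', 'B', 'F', 'C', 'G', 'D', 'H']
print(melted.tolist())             # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
print(stacked_by_column.tolist())  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
```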