Opposite of melt in python pandas
Original question: http://stackoverflow.com/questions/22127569/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Boris Gorelik
I cannot figure out how to do a "reverse melt" using pandas in Python. This is my starting data:
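import pandas as pd
from io import StringIO  # Python 3 location; the Python 2 StringIO module is gone
# The columns are separated by whitespace, so pass sep explicitly
origin = pd.read_table(StringIO('''label type value
x a 1
x b 2
x c 3
y a 4
y b 5
y c 6
z a 7
z b 8
z c 9'''), sep=r'\s+')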
origin
Out[5]:
  label type  value
0     x    a      1
1     x    b      2
2     x    c      3
3     y    a      4
4     y    b      5
5     y    c      6
6     z    a      7
7     z    b      8
8     z    c      9
This is the output I would like to have:
label  a  b  c
x      1  2  3
y      4  5  6
z      7  8  9
I'm sure there is an easy way to do this, but I don't know how.
Accepted answer by behzad.nouri
There are a few ways.
Using .pivot:
>>> origin.pivot(index='label', columns='type')['value']
type   a  b  c
label
x      1  2  3
y      4  5  6
z      7  8  9

[3 rows x 3 columns]
Using pivot_table:
>>> origin.pivot_table(values='value', index='label', columns='type')
       value
type       a  b  c
label
x          1  2  3
y          4  5  6
z          7  8  9

[3 rows x 3 columns]
Or .groupby followed by .unstack:
>>> origin.groupby(['label', 'type'])['value'].aggregate('mean').unstack()
type   a  b  c
label
x      1  2  3
y      4  5  6
z      7  8  9

[3 rows x 3 columns]
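The three approaches above can be compared side by side. The following is a minimal sketch, assuming a recent pandas on Python 3; the variable names via_pivot, via_pivot_table, and via_unstack are only illustrative. Note that pivot_table and the groupby/mean route aggregate with a mean, so they may return floats where pivot keeps the original integers.

```python
from io import StringIO

import pandas as pd

# Same data as in the question, parsed as whitespace-separated text.
origin = pd.read_csv(StringIO("""label type value
x a 1
x b 2
x c 3
y a 4
y b 5
y c 6
z a 7
z b 8
z c 9"""), sep=r"\s+")

# 1) pivot: pure reshaping, fails if a (label, type) pair repeats.
via_pivot = origin.pivot(index="label", columns="type", values="value")

# 2) pivot_table: reshaping plus aggregation (mean by default).
via_pivot_table = origin.pivot_table(values="value", index="label", columns="type")

# 3) groupby + unstack: explicit aggregation, then move 'type' to the columns.
via_unstack = origin.groupby(["label", "type"])["value"].mean().unstack()

print(via_pivot)
print(via_pivot_table)
print(via_unstack)
```

With one value per label/type pair, as here, all three hold the same numbers; they start to differ only when pairs repeat.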
Answer by ansev
DataFrame.set_index + DataFrame.unstack
df.set_index(['label','type'])['value'].unstack()
type   a  b  c
label
x      1  2  3
y      4  5  6
z      7  8  9
Simplifying the passing of pivot arguments:
df.pivot(*df)
type   a  b  c
label
x      1  2  3
y      4  5  6
z      7  8  9
[*df]
#['label', 'type', 'value']
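As far as I know, pandas 2.0 and later accept the arguments of DataFrame.pivot only as keywords, so the positional unpacking df.pivot(*df) may raise a TypeError there. A sketch of the same shortcut with keywords, assuming the columns are ordered label, type, value as above (pivot_kwargs is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# Iterating a DataFrame yields its column names: 'label', 'type', 'value'.
# Pair them with pivot's keyword names instead of passing them positionally.
pivot_kwargs = dict(zip(["index", "columns", "values"], df))

wide = df.pivot(**pivot_kwargs)
print(wide)
```

This still depends on the column order, so spelling out index=, columns=, and values= is the safer choice whenever the frame has extra or reordered columns.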
For the expected output we also need DataFrame.reset_index and DataFrame.rename_axis:
df.pivot(*df).rename_axis(columns = None).reset_index()
  label  a  b  c
0     x  1  2  3
1     y  4  5  6
2     z  7  8  9
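One way to check that this really is the opposite of melt is to melt the wide table back and compare it with the starting data. A minimal sketch, assuming the same nine rows as above and using keyword arguments for pivot; the names wide and long_again are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# Long -> wide: pivot, drop the 'type' name on the columns, restore 'label' as a column.
wide = (df.pivot(index="label", columns="type", values="value")
          .rename_axis(columns=None)
          .reset_index())

# Wide -> long: melt stacks column by column, so sort to get the original row order back.
long_again = (wide.melt(id_vars="label", var_name="type", value_name="value")
                  .sort_values(["label", "type"])
                  .reset_index(drop=True))

print(wide)
print(long_again)
```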
If the same label/type combination occurs more than once, we could lose information, so we need GroupBy.cumcount to number the repeated rows:
print(df)
  label type  value
0     x    a      1
1     x    b      2
2     x    c      3
3     y    a      4
4     y    b      5
5     y    c      6
6     z    a      7
7     z    b      8
8     z    c      9
0     x    a      1
1     x    b      2
2     x    c      3
3     y    a      4
4     y    b      5
5     y    c      6
6     z    a      7
7     z    b      8
8     z    c      9
df.pivot_table(index=['label',
                      df.groupby(['label', 'type']).cumcount()],
               columns='type',
               values='value')
type     a  b  c
label
x     0  1  2  3
      1  1  2  3
y     0  4  5  6
      1  4  5  6
z     0  7  8  9
      1  7  8  9
Or:
(df.assign(type_2=df.groupby(['label', 'type']).cumcount())
   .set_index(['label', 'type', 'type_2'])['value']
   .unstack('type'))
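The duplicate-handling idea can be tried end to end with a small self-contained sketch. It assumes the duplicated frame is built by concatenating the nine rows with themselves; ignore_index=True is my own choice here and gives a fresh 0..17 index, unlike the repeated 0..8 index in the printout above. The name occurrence is illustrative.

```python
import pandas as pd

base = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# Every (label, type) pair now appears twice.
df = pd.concat([base, base], ignore_index=True)

# cumcount numbers the repeats within each (label, type) group: 0, 1, ...
occurrence = df.groupby(["label", "type"]).cumcount()

# Adding the counter to the index makes every row unique again,
# so unstacking 'type' keeps all values instead of collapsing them.
wide = (df.assign(occurrence=occurrence)
          .set_index(["label", "type", "occurrence"])["value"]
          .unstack("type"))
print(wide)
```

The result has one row per label and occurrence, so nothing is averaged away as it would be with a plain pivot_table on label alone.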

