Opposite of melt in python pandas

Question

Asked by Boris Gorelik

I cannot figure out how to do a "reverse melt" using pandas in Python. This is my starting data:

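import pandas as pd

# Python 3 location of StringIO (the Python 2 module was `StringIO`)
from io import StringIO

# sep=r'\s+' splits on any run of whitespace, so tabs or spaces both parse
origin = pd.read_table(StringIO('''label    type    value
x   a   1
x   b   2
x   c   3
y   a   4
y   b   5
y   c   6
z   a   7
z   b   8
z   c   9'''), sep=r'\s+')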

origin
Out[5]: 
  label type  value
0     x    a      1
1     x    b      2
2     x    c      3
3     y    a      4
4     y    b      5
5     y    c      6
6     z    a      7
7     z    b      8
8     z    c      9

This is the output I would like to have:

    label   a   b   c
        x   1   2   3
        y   4   5   6
        z   7   8   9

I'm sure there is an easy way to do this, but I don't know how.

Answer 1

Accepted answer by behzad.nouri

There are a few ways.
Using .pivot:

>>> origin.pivot(index='label', columns='type')['value']
type   a  b  c
label         
x      1  2  3
y      4  5  6
z      7  8  9

[3 rows x 3 columns]

Using pivot_table:

>>> origin.pivot_table(values='value', index='label', columns='type')
       value      
type       a  b  c
label             
x          1  2  3
y          4  5  6
z          7  8  9

[3 rows x 3 columns]

Or .groupby followed by .unstack:

>>> origin.groupby(['label', 'type'])['value'].aggregate('mean').unstack()
type   a  b  c
label         
x      1  2  3
y      4  5  6
z      7  8  9

[3 rows x 3 columns]
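
The three approaches above can be checked against each other on Python 3 with a recent pandas. The following is a minimal sketch, not part of the original answer: it rebuilds the question's data from an inline string and assumes whitespace-separated columns. The variable names `by_pivot`, `by_pivot_table` and `by_groupby` are illustrative only.

```python
from io import StringIO

import pandas as pd

raw = """label type value
x a 1
x b 2
x c 3
y a 4
y b 5
y c 6
z a 7
z b 8
z c 9
"""

# Parse the long-format data; sep=r"\s+" splits on any whitespace.
origin = pd.read_csv(StringIO(raw), sep=r"\s+")

# 1. pivot: a pure reshape, no aggregation involved.
by_pivot = origin.pivot(index="label", columns="type", values="value")

# 2. pivot_table: aggregates each (label, type) cell, by mean unless told otherwise.
by_pivot_table = origin.pivot_table(values="value", index="label", columns="type")

# 3. groupby + unstack: aggregate explicitly, then move 'type' into the columns.
by_groupby = origin.groupby(["label", "type"])["value"].aggregate("mean").unstack()
```

Because every (label, type) pair occurs exactly once here, the mean of each cell is just its single value, so all three results hold the same numbers. Only `pivot` is guaranteed to keep the original dtype, since the other two pass through an aggregation.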

Answer 2

Answer by ansev

Using `DataFrame.set_index` + `DataFrame.unstack` (here `df` is the same frame as `origin` in the question):

df.set_index(['label','type'])['value'].unstack()

type   a  b  c
label         
x      1  2  3
y      4  5  6
z      7  8  9

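Simplifying the passing of pivot arguments:

# Unpacks the column names positionally as index, columns, values.
# Works only on pandas < 2.0; later versions accept keyword arguments only.
df.pivot(*df)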

type   a  b  c
label         
x      1  2  3
y      4  5  6
z      7  8  9

[*df]
#['label', 'type', 'value']
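
On pandas 2.0 and later, where `pivot` takes keyword arguments only, the same shortcut can be approximated by zipping the parameter names with the column names. This is a sketch under the assumption that the columns appear in the order label, type, value, as in the question. The name `pivot_kwargs` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abc") * 3,
    "value": list(range(1, 10)),
})

# Iterating a DataFrame yields its column names, so zip pairs
# pivot's parameter names with the columns in order.
pivot_kwargs = dict(zip(("index", "columns", "values"), df))

wide = df.pivot(**pivot_kwargs)
```

This relies entirely on column order, so spelling out `index=`, `columns=` and `values=` is usually the safer choice once the frame has more than these three columns.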

For the expected output we also need DataFrame.reset_index and DataFrame.rename_axis:

df.pivot(*df).rename_axis(columns = None).reset_index()

  label  a  b  c
0     x  1  2  3
1     y  4  5  6
2     z  7  8  9
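
The pivot, rename and reset steps can be bundled into one function. The sketch below uses keyword arguments so that it also runs on pandas 2.0 and later. `unmelt` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd


def unmelt(long_df, index, columns, values):
    """Hypothetical helper: pivot long data to wide, then flatten the result."""
    return (
        long_df.pivot(index=index, columns=columns, values=values)
        .rename_axis(columns=None)  # drop the 'type' name from the column axis
        .reset_index()              # turn the 'label' index back into a column
    )


df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abc") * 3,
    "value": list(range(1, 10)),
})

wide = unmelt(df, index="label", columns="type", values="value")
```

The result is a plain frame with columns `label`, `a`, `b`, `c` and a default integer index, which matches the layout requested in the question.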

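If the same (`label`, `type`) pair occurs more than once, pivoting cannot keep every value: `pivot` raises an error and `pivot_table` aggregates the duplicates into one cell. To keep all rows we can number the repeats with `GroupBy.cumcount` and add that counter to the index: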

print(df)

  label type  value
0     x    a      1
1     x    b      2
2     x    c      3
3     y    a      4
4     y    b      5
5     y    c      6
6     z    a      7
7     z    b      8
8     z    c      9
0     x    a      1
1     x    b      2
2     x    c      3
3     y    a      4
4     y    b      5
5     y    c      6
6     z    a      7
7     z    b      8
8     z    c      9

df.pivot_table(index = ['label',
                        df.groupby(['label','type']).cumcount()],
               columns = 'type',
               values = 'value')


type     a  b  c
label           
x     0  1  2  3
      1  1  2  3
y     0  4  5  6
      1  4  5  6
z     0  7  8  9
      1  7  8  9

Or:

(df.assign(type_2 = df.groupby(['label','type']).cumcount())
   .set_index(['label','type','type_2'])['value']
   .unstack('type'))
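
To see why the counter is needed, the following sketch builds the duplicated frame, confirms that a plain `pivot` refuses it, and then applies the `cumcount` approach. It is an illustration under assumptions: the data is constructed inline, `ignore_index=True` is used so the concatenated frame has a unique row index, and the names `occurrence` and `pivot_failed` are introduced here.

```python
import pandas as pd

origin = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abc") * 3,
    "value": list(range(1, 10)),
})

# Every (label, type) pair now appears twice.
df = pd.concat([origin, origin], ignore_index=True)

# A plain pivot cannot place two values in one cell.
try:
    df.pivot(index="label", columns="type", values="value")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# Number each repeat of a (label, type) pair: 0 for the first, 1 for the second.
occurrence = df.groupby(["label", "type"]).cumcount()

# Including the counter in the index makes every row key unique again.
wide = (
    df.assign(occurrence=occurrence)
    .set_index(["label", "occurrence", "type"])["value"]
    .unstack("type")
)
```

Each label then gets one output row per occurrence, so no value is dropped or averaged away.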
