Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/49943627/

Time: 2020-09-14 05:29:55  Source: igfitidea

Pivoting DataFrame with multiple columns for the index

Tags: python, pandas, dataframe, pivot-table

Asked by ProgSky

I have a DataFrame and I want to turn only a few of its rows into columns.

This is what I have now.

   Entity   Name        Date  Value
0     111  Name1  2018-03-31    100
1     111  Name2  2018-02-28    200
2     222  Name3  2018-02-28   1000
3     333  Name1  2018-01-31   2000

I want each date to become a column that holds the corresponding value. Something like this:

   Entity   Name  2018-01-31  2018-02-28  2018-03-31
0     111  Name1         NaN         NaN       100.0
1     111  Name2         NaN       200.0         NaN
2     222  Name3         NaN      1000.0         NaN
3     333  Name1      2000.0         NaN         NaN

The same Name can appear under two different Entity values. Here is an updated dataset.

Code:

import datetime

import pandas as pd

# Sample data: 'Name1' appears under two different Entity values (111 and 333).
data1 = {
    'Entity': [111, 111, 222, 333],
    'Name': ['Name1', 'Name2', 'Name3', 'Name1'],
    'Date': [datetime.date(2018, 3, 31), datetime.date(2018, 2, 28),
             datetime.date(2018, 2, 28), datetime.date(2018, 1, 31)],
    'Value': [100, 200, 1000, 2000],
}
df1 = pd.DataFrame(data1, columns=['Entity', 'Name', 'Date', 'Value'])

How do I achieve this? Any pointers? Thanks all.

Answered by cs95

Based on your update, you need pivot_table with two index columns:

v = df1.pivot_table(
    index=['Entity', 'Name'],
    columns='Date',
    values='Value'
).reset_index()
v.index.name = v.columns.name = None

v
   Entity   Name  2018-01-31  2018-02-28  2018-03-31
0     111  Name1         NaN         NaN       100.0
1     111  Name2         NaN       200.0         NaN
2     222  Name3         NaN      1000.0         NaN
3     333  Name1      2000.0         NaN         NaN
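
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the pivot_table approach that rebuilds the question's df1. It also illustrates one thing worth keeping in mind: pivot_table aggregates rows that share the same (Entity, Name, Date) combination, using the mean unless another aggfunc is given. The extra row and the names extra, df_dup and wide are made up for this illustration and are not part of the original question or answer.

```python
import datetime

import pandas as pd

# Rebuild the question's sample data.
df1 = pd.DataFrame({
    'Entity': [111, 111, 222, 333],
    'Name': ['Name1', 'Name2', 'Name3', 'Name1'],
    'Date': [datetime.date(2018, 3, 31), datetime.date(2018, 2, 28),
             datetime.date(2018, 2, 28), datetime.date(2018, 1, 31)],
    'Value': [100, 200, 1000, 2000],
})

# Hypothetical extra row (NOT in the question): a second value for the
# same Entity/Name/Date combination as the first row of df1.
extra = pd.DataFrame({
    'Entity': [111],
    'Name': ['Name1'],
    'Date': [datetime.date(2018, 3, 31)],
    'Value': [300],
})
df_dup = pd.concat([df1, extra], ignore_index=True)

# Pivot with two index columns; duplicates are combined by aggfunc.
wide = df_dup.pivot_table(
    index=['Entity', 'Name'],
    columns='Date',
    values='Value',
    aggfunc='mean',  # spelled out for clarity; this is also the default
).reset_index()
wide.columns.name = None  # drop the leftover 'Date' label on the columns

print(wide)
```

With the hypothetical duplicate in place, the 2018-03-31 cell for (111, Name1) holds the mean of 100 and 300 rather than either original value. If averaging is not what you want, pass a different aggfunc such as 'sum' or 'first'.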

Answered by YOBEN_S

Using unstack:

df1.set_index(['Entity','Name','Date']).Value.unstack().reset_index()

Date  Entity   Name  2018-01-31 00:00:00  2018-02-28 00:00:00  \
0        111  Name1                  NaN                  NaN   
1        111  Name2                  NaN                200.0   
2        222  Name3                  NaN               1000.0   
3        333  Name1               2000.0                  NaN   

Date  2018-03-31 00:00:00  
0                   100.0  
1                     NaN  
2                     NaN  
3                     NaN
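
A minimal self-contained sketch of the unstack approach, again rebuilding the question's df1. Unlike pivot_table, unstack does no aggregation, so it expects every (Entity, Name, Date) combination to be unique and raises a ValueError when it is not. The duplicated row and the names wide, df_dup and unstack_failed are assumptions added for this illustration only.

```python
import datetime

import pandas as pd

# Rebuild the question's sample data.
df1 = pd.DataFrame({
    'Entity': [111, 111, 222, 333],
    'Name': ['Name1', 'Name2', 'Name3', 'Name1'],
    'Date': [datetime.date(2018, 3, 31), datetime.date(2018, 2, 28),
             datetime.date(2018, 2, 28), datetime.date(2018, 1, 31)],
    'Value': [100, 200, 1000, 2000],
})

# Move all three key columns into the index, then turn the innermost
# index level (Date) into columns.
wide = (
    df1.set_index(['Entity', 'Name', 'Date'])['Value']
       .unstack()
       .reset_index()
)
wide.columns.name = None  # drop the leftover 'Date' label on the columns
print(wide)

# Hypothetical duplicate key (NOT in the question's data): repeat row 0.
df_dup = pd.concat([df1, df1.iloc[[0]]], ignore_index=True)
try:
    df_dup.set_index(['Entity', 'Name', 'Date'])['Value'].unstack()
    unstack_failed = False
except ValueError as exc:
    # unstack cannot reshape when index entries are duplicated.
    unstack_failed = True
    print('unstack failed:', exc)
```

If your real data can contain repeated key combinations, the pivot_table approach is probably the safer choice, since it lets you decide how the duplicates are combined.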