Original source: http://stackoverflow.com/questions/37150248/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Pandas KeyError using pivot
Asked by trob
I'm new to Python and would like to use it to replicate a common Excel task. If this question has already been answered, please let me know; I've been unable to find it. I have the following pandas DataFrame (data):
Date Stage SubStage Value
12/31/2015 1.00 a 0.896882891
1/1/2016 1.00 a 0.0458843
1/2/2016 1.00 a 0.126805588
1/3/2016 1.00 b 0.615824461
1/4/2016 1.00 b 0.245092069
1/5/2016 1.00 c 0.121936318
1/6/2016 1.00 c 0.170198128
1/7/2016 1.00 c 0.735872415
1/8/2016 1.00 c 0.542361912
1/4/2016 2.00 a 0.723769247
1/5/2016 2.00 a 0.305570257
1/6/2016 2.00 b 0.47461605
1/7/2016 2.00 b 0.173702623
1/8/2016 2.00 c 0.969260251
1/9/2016 2.00 c 0.017170798
In Excel, I can use a pivot table to produce the following:
[Image: Excel pivot table built from 'data']
It seems reasonable to do the following in Python:
data.pivot(index='Date',columns = ['Stage','SubStage'],values = 'Value')
But that produces:
KeyError: 'Level Stage not found'
What gives?
Answered by Paul H
You want .pivot_table, not .pivot.
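For context, here is a minimal sketch of one practical difference between the two methods: `pivot` only reshapes and raises a `ValueError` when an index/column pair occurs more than once, while `pivot_table` aggregates such duplicates (using the mean by default). The small frame `demo` is made up for illustration and is not the asker's data. As far as I'm aware, newer pandas releases (1.1 and later) also let `pivot` take a list for `columns`, so the original call may work unchanged there.

```python
import pandas as pd

# Hypothetical toy data: the pair ('1/1/2016', 1.0) appears twice.
demo = pd.DataFrame({
    "Date": ["1/1/2016", "1/1/2016", "1/2/2016"],
    "Stage": [1.0, 1.0, 2.0],
    "Value": [0.2, 0.4, 0.9],
})

# pivot_table aggregates duplicates; the default aggfunc is the mean.
mean_tab = demo.pivot_table(index="Date", columns="Stage", values="Value")

# pivot only reshapes, so duplicate index/column pairs raise ValueError.
try:
    demo.pivot(index="Date", columns="Stage", values="Value")
    pivot_failed = False
except ValueError:
    pivot_failed = True

print(mean_tab)
print("pivot raised ValueError:", pivot_failed)
```

If the data has no duplicate pairs, as in the question, both methods should yield the same values.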
import pandas
from io import StringIO
x = StringIO("""\
Date Stage SubStage Value
12/31/2015 1.00 a 0.896882891
1/1/2016 1.00 a 0.0458843
1/2/2016 1.00 a 0.126805588
1/3/2016 1.00 b 0.615824461
1/4/2016 1.00 b 0.245092069
1/5/2016 1.00 c 0.121936318
1/6/2016 1.00 c 0.170198128
1/7/2016 1.00 c 0.735872415
1/8/2016 1.00 c 0.542361912
1/4/2016 2.00 a 0.723769247
1/5/2016 2.00 a 0.305570257
1/6/2016 2.00 b 0.47461605
1/7/2016 2.00 b 0.173702623
1/8/2016 2.00 c 0.969260251
1/9/2016 2.00 c 0.017170798
""")
# Raw string so that '\s' is not treated as an (invalid) string escape.
df = pandas.read_table(x, sep=r'\s+')
xtab = df.pivot_table(index='Date', columns=['Stage','SubStage'], values='Value')
print(xtab.to_string(na_rep='--'))
And that gives me:
Stage            1.0                           2.0
SubStage           a         b         c         a         b         c
Date
1/1/2016    0.045884        --        --        --        --        --
1/2/2016    0.126806        --        --        --        --        --
1/3/2016          --  0.615824        --        --        --        --
1/4/2016          --  0.245092        --  0.723769        --        --
1/5/2016          --        --  0.121936  0.305570        --        --
1/6/2016          --        --  0.170198        --  0.474616        --
1/7/2016          --        --  0.735872        --  0.173703        --
1/8/2016          --        --  0.542362        --        --  0.969260
1/9/2016          --        --        --        --        --  0.017171
12/31/2015  0.896883        --        --        --        --        --
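Note that 12/31/2015 sorts last because `Date` was read as plain strings. Below is a minimal sketch, assuming the dates are month/day/year, that parses them with `pd.to_datetime` before pivoting so the rows come out in chronological order. It uses only a few rows of the question's data to stay short; the names `sample` and `xtab_sorted` are mine.

```python
import pandas as pd
from io import StringIO

# A few rows from the question's data, enough to show the ordering.
sample = StringIO("""\
Date Stage SubStage Value
12/31/2015 1.00 a 0.896882891
1/1/2016 1.00 a 0.0458843
1/4/2016 1.00 b 0.245092069
1/4/2016 2.00 a 0.723769247
""")
df = pd.read_csv(sample, sep=r"\s+")

# Assumption: dates are month/day/year.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Same pivot_table call as above; the index is now a sorted DatetimeIndex.
xtab_sorted = df.pivot_table(
    index="Date", columns=["Stage", "SubStage"], values="Value"
)
print(xtab_sorted.to_string(na_rep="--"))
```

With a datetime index, the result can also be sliced by date range or resampled, which is not possible with the string dates.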