pandas pivot table error: > 1 ndim Categorical are not supported at this time

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must comply with the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/38663150/

Time: 2020-09-14 01:42:13  Source: igfitidea

Pivot table error: > 1 ndim Categorical are not supported at this time

python pandas pivot

Asked by Lisa

My goal is to box-plot 'Score' by 'Label'; I don't care about 'date' and 'Cusip'. I want to use pivot to reshape the data so that each Label is in its own column and I can box-plot it.


              date     Cusip Label   Score
663182  2015-07-31  00846UAG   AAA  138.15
663183  2015-07-31  00846UAH   AAA  171.93
663184  2015-07-31  00846UAJ   AAA  175.67
663185  2015-07-31  023767AA    BB  187.92
663186  2015-07-31  023770AA    BB  176.25

t.pivot(index=['date','Cusip'],columns='Label',values='Score')

The error shown is:


NotImplementedError: > 1 ndim Categorical are not supported at this time

More details:


C:\Anaconda3\lib\site-packages\pandas\core\categorical.py in __init__(self, values, categories, ordered, name, fastpath, levels)
    285             try:
--> 286                 codes, categories = factorize(values, sort=True)
    287             except TypeError:

C:\Anaconda3\lib\site-packages\pandas\core\algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    184     uniques = vec_klass()
--> 185     labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
    186 

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_labels (pandas\hashtable.c:13921)()

ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

Accepted answer by Nickil Maveli

You really should be using pivot_table, as you have duplicate entries in your date column.


pd.pivot_table(df, values='Score', index=['date', 'Cusip'], columns=['Label']).boxplot()
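
For anyone who wants to try this, here is a minimal sketch that rebuilds the five sample rows from the question and applies the same pivot_table call. The DataFrame construction and the variable name wide are my own assumptions for illustration; the plotting call is left commented out so the snippet runs without matplotlib.

```python
import pandas as pd

# Sample rows copied from the question; building them by hand is an assumption.
df = pd.DataFrame({
    "date": ["2015-07-31"] * 5,
    "Cusip": ["00846UAG", "00846UAH", "00846UAJ", "023767AA", "023770AA"],
    "Label": ["AAA", "AAA", "AAA", "BB", "BB"],
    "Score": [138.15, 171.93, 175.67, 187.92, 176.25],
})

# One column per Label. If several rows shared the same (date, Cusip, Label),
# pivot_table would average them, because aggfunc defaults to the mean.
wide = pd.pivot_table(df, values="Score", index=["date", "Cusip"], columns=["Label"])
print(wide)

# Box plot with one box per label (needs matplotlib installed):
# wide.boxplot()
```

Each row of wide holds a score in exactly one label column and NaN in the other, which boxplot simply skips.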


Answer by citynorman

As an alternative to .pivot_table(), which might do unwanted aggregations, you can set the index and unstack:


df.set_index(['date', 'Cusip','Label'])['Score'].unstack()
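
A small sketch of the same idea on the question's sample rows, again with a hand-built DataFrame as an assumption. It moves the three key columns into the index and then pivots the innermost level (Label) into columns, so no aggregation takes place.

```python
import pandas as pd

# Sample rows copied from the question; building them by hand is an assumption.
df = pd.DataFrame({
    "date": ["2015-07-31"] * 5,
    "Cusip": ["00846UAG", "00846UAH", "00846UAJ", "023767AA", "023770AA"],
    "Label": ["AAA", "AAA", "AAA", "BB", "BB"],
    "Score": [138.15, 171.93, 175.67, 187.92, 176.25],
})

# unstack() with no arguments pivots the last index level, here Label.
wide = df.set_index(["date", "Cusip", "Label"])["Score"].unstack()
print(wide)
```

Unlike pivot_table, unstack raises a ValueError when the (date, Cusip, Label) combinations are not unique, so duplicates show up as an error instead of being silently averaged.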