Python pivot_table: No numeric types to aggregate

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/39229005/

Time: 2020-08-19 22:00:45  Source: igfitidea

pivot_table No numeric types to aggregate

python pandas

Asked by DougKruger

I want to make a pivot table from the following dataframe with the columns sales and rep. The pivot table shows sales but not rep. When I tried with only rep, I got the error DataError: No numeric types to aggregate. How can I fix this so that I see both the numeric field sales and the string field rep?

import pandas as pd

data = {'year': ['2016', '2016', '2015', '2014', '2013'],
        'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
        'sales': [10, 21, 20, 10, 12],
        'rep': ['john', 'john', 'claire', 'kyle', 'kyle']
        }

# Requesting both columns: only the numeric column sales shows up (output as reported by the asker)
print(pd.DataFrame(data).pivot_table(index='country', columns='year', values=['rep','sales']))

        sales               
year     2013 2014 2015 2016
country                     
fr        NaN   10   20  NaN
uk         12  NaN  NaN   10
usa       NaN  NaN  NaN   21


# Requesting only the string column rep raises an error
print(pd.DataFrame(data).pivot_table(index='country', columns='year', values=['rep']))
DataError: No numeric types to aggregate
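
For context, the default aggfunc of pivot_table is the mean, which is only defined for numeric columns. A quick way to see which columns qualify is to check their dtypes. The sketch below rebuilds the question's dataframe; the variable name numeric is only illustrative.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

data = {'year': ['2016', '2016', '2015', '2014', '2013'],
        'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
        'sales': [10, 21, 20, 10, 12],
        'rep': ['john', 'john', 'claire', 'kyle', 'kyle']}
df = pd.DataFrame(data)

# sales holds integers, rep holds strings, so only sales can be averaged
numeric = {col: bool(is_numeric_dtype(df[col])) for col in ['sales', 'rep']}
print(numeric)
```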

Answered by unutbu

You could use set_index and unstack:

df = pd.DataFrame(data)
df.set_index(['year','country']).unstack('year')

yields

          rep                     sales                  
year     2013  2014    2015  2016  2013  2014  2015  2016
country                                                  
fr       None  kyle  claire  None   NaN  10.0  20.0   NaN
uk       kyle  None    None  john  12.0   NaN   NaN  10.0
usa      None  None    None  john   NaN   NaN   NaN  21.0
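
One caveat worth illustrating: set_index followed by unstack only reshapes the data, so it needs every (year, country) pair to be unique. The sketch below adds a hypothetical sixth row (a second uk/2016 entry, not part of the question's data) to show that unstack then raises a ValueError.

```python
import pandas as pd

# Hypothetical data: the question's rows plus a duplicate ('2016', 'uk') pair
data = {'year': ['2016', '2016', '2015', '2014', '2013', '2016'],
        'country': ['uk', 'usa', 'fr', 'fr', 'uk', 'uk'],
        'sales': [10, 21, 20, 10, 12, 99],
        'rep': ['john', 'john', 'claire', 'kyle', 'kyle', 'maria']}
df = pd.DataFrame(data)

try:
    df.set_index(['year', 'country']).unstack('year')
    reshaped = True
except ValueError as exc:
    # unstack cannot decide which of the duplicate rows to place in the cell
    reshaped = False
    print('unstack failed:', exc)
```

With duplicates like this, an aggregating approach such as pivot_table is needed instead.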

Or, use pivot_table with aggfunc='first':

df.pivot_table(index='country', columns='year', values=['rep','sales'], aggfunc='first')

yields

          rep                     sales                  
year     2013  2014    2015  2016  2013  2014  2015  2016
country                                                  
fr       None  kyle  claire  None  None    10    20  None
uk       kyle  None    None  john    12  None  None    10
usa      None  None    None  john  None  None  None    21

With aggfunc='first', each (country, year, rep) or (country, year, sales) group is aggregated by taking the first value found. In your case there appear to be no duplicates, so the first value is the same as the only value.
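
To make the behavior of aggfunc='first' concrete when duplicates do exist, here is a small sketch. It assumes a hypothetical extra row (a second uk/2016 entry with rep 'maria' and sales 99) that is not in the question's data; the names first_rep and first_sales are only illustrative.

```python
import pandas as pd

# Hypothetical data: the question's rows plus a duplicate ('uk', '2016') group
data = {'year': ['2016', '2016', '2015', '2014', '2013', '2016'],
        'country': ['uk', 'usa', 'fr', 'fr', 'uk', 'uk'],
        'sales': [10, 21, 20, 10, 12, 99],
        'rep': ['john', 'john', 'claire', 'kyle', 'kyle', 'maria']}
df = pd.DataFrame(data)

table = df.pivot_table(index='country', columns='year',
                       values=['rep', 'sales'], aggfunc='first')

# For the duplicated group, the value from the earlier row is kept
first_rep = table.loc['uk', ('rep', '2016')]
first_sales = table.loc['uk', ('sales', '2016')]
print(first_rep, first_sales)
```

The later row ('maria', 99) is silently dropped, so 'first' is a safe choice only when the groups are known to be unique or when discarding the extra rows is acceptable.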

Answered by Psidom

It seems that the problem comes from the different types of the columns rep and sales. If you convert sales to str type and specify the aggfunc as sum, it works fine:

df.sales = df.sales.astype(str)

pd.pivot_table(df, index=['country'], columns=['year'], values=['rep', 'sales'], aggfunc='sum')

#        rep                            sales
#  year 2013    2014    2015    2016    2013    2014    2015    2016
# country                               
# fr    None    kyle    claire  None    None      10      20    None
# uk    kyle    None    None    john      12    None    None    10
# usa   None    None    None    john    None    None    None    21
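
A possible variation, not part of the answer above, is to leave sales numeric and pass a dict to aggfunc so that each column gets its own aggregation. This is a minimal sketch, assuming 'first' is acceptable for rep and 'sum' for sales; the variable names are illustrative.

```python
import pandas as pd

data = {'year': ['2016', '2016', '2015', '2014', '2013'],
        'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
        'sales': [10, 21, 20, 10, 12],
        'rep': ['john', 'john', 'claire', 'kyle', 'kyle']}
df = pd.DataFrame(data)

# Per-column aggregation: strings take the first value, numbers are summed
table = pd.pivot_table(df, index='country', columns='year',
                       values=['rep', 'sales'],
                       aggfunc={'rep': 'first', 'sales': 'sum'})
print(table)

uk_2013_sales = table.loc['uk', ('sales', '2013')]
usa_2016_rep = table.loc['usa', ('rep', '2016')]
```

This keeps sales usable for later arithmetic, which the str conversion does not.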