pandas: How to apply quantile to a pandas groupby object?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/14125428/

Date: 2020-09-13 20:33:54  Source: igfitidea

How to apply quantile to pandas groupby object?

python, pandas

Question

I have a pandas groupby object called grouped. I can get grouped.mean() and other simple functions to work, but I cannot get grouped.quantile() to work. I get the following error when attempting to run grouped.quantile():

ValueError: ('invalid literal for float(): groupA', u'occurred at index groups')

I am grouping by text labels, so I am not sure why the function tries to convert it to a float. It should be computing the quantile using the floats within each group. Can someone help to point out what I am doing wrong?

Accepted answer by Zelazny7

It looks like quantile() doesn't ignore the nuisance columns and is trying to find quantiles for your text columns. Here's a trivial example:

In [74]: from pandas import DataFrame

In [75]: df = DataFrame({'col1': ['A', 'A', 'B', 'B'], 'col2': [1, 2, 3, 4]})

In [76]: df
Out[76]:
  col1  col2
0    A     1
1    A     2
2    B     3
3    B     4

In [77]: df.groupby('col1').quantile()
ValueError: ('could not convert string to float: A', u'occurred at index col1')

However, when I subset out only the numeric columns, I get:

In [78]: df.groupby('col1')['col2'].quantile()
Out[78]:
col1
A       1.5
B       3.5
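
When a frame has several numeric columns, listing them by hand gets tedious. The following is a minimal sketch of the same idea, picking the numeric columns automatically with select_dtypes before calling quantile(). The extra "note" column and the variable names are assumptions added for illustration, and the behavior of quantile() on text columns may differ between pandas versions, so treat this as one possible approach rather than the definitive fix.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "A", "B", "B"],   # text labels used as the group key
    "col2": [1, 2, 3, 4],           # numeric values
    "note": ["w", "x", "y", "z"],   # hypothetical extra text (nuisance) column
})

# Collect the numeric column names; text columns (including the key) are left out.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Restrict the groupby to the numeric columns, then compute the 0.5 quantile.
medians = df.groupby("col1")[numeric_cols].quantile(0.5)
print(medians)
```

Selecting columns on the groupby object (rather than dropping them from df first) keeps the text key available for grouping while hiding the other text columns from quantile().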