Python Pandas Groupby Range of Values

Question

Asked by BJEBN

Is there an easy way in pandas to invoke groupby on a range of value increments? For instance, given the example below, can I bin and group column B with a 0.155 increment so that, for example, the first couple of groups in column B cover the ranges '0 - 0.155, 0.155 - 0.31, ...'?

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': np.random.random(20), 'B': np.random.random(20)})

          A         B
0  0.383493  0.250785
1  0.572949  0.139555
2  0.652391  0.401983
3  0.214145  0.696935
4  0.848551  0.516692

Alternatively, could I first categorize the data by those increments into a new column and then use groupby to compute any relevant statistics on column A?

Answer 1

Accepted answer by DSM

You might be interested in pd.cut:

>>> df.groupby(pd.cut(df["B"], np.arange(0, 1.0+0.155, 0.155))).sum()
                      A         B
B                                
(0, 0.155]     2.775458  0.246394
(0.155, 0.31]  1.123989  0.471618
(0.31, 0.465]  2.051814  1.882763
(0.465, 0.62]  2.277960  1.528492
(0.62, 0.775]  1.577419  2.810723
(0.775, 0.93]  0.535100  1.694955
(0.93, 1.085]       NaN       NaN

[7 rows x 2 columns]
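
The question's second idea, storing the bin in a new column and then grouping on it, can also be done with pd.cut. Below is a minimal sketch, assuming a seeded random frame in place of the question's data. The column name B_bin and the choice of statistics are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical reproducible stand-in for the question's random frame.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.random(20), "B": rng.random(20)})

# Bin edges 0, 0.155, ..., 1.085, so every value in (0, 1) lands in a bin.
edges = np.arange(0, 1.0 + 0.155, 0.155)

# Store each row's interval in a new column ("B_bin" is an illustrative name).
df["B_bin"] = pd.cut(df["B"], edges)

# observed=False keeps empty bins in the result instead of dropping them.
stats = df.groupby("B_bin", observed=False)["A"].agg(["count", "mean", "sum"])
print(stats)
```

pd.cut uses right-closed intervals by default, so a value of exactly 0 would fall outside the first bin; passing include_lowest=True would include it.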

Answer 2

Answered by Alvaro Fuentes

Try this:

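# DataFrame.sort was removed from pandas; sort_values is its replacement
df = df.sort_values('B')
bins = np.arange(0, 1.0, 0.155)
# 1-based bin index for each row: bins[i-1] <= B < bins[i]
ind = np.digitize(df['B'], bins)

print(df.groupby(ind).head())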

Of course, you can apply any function to the groups, not just head.
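
As an illustration of using an aggregation other than head, here is a minimal sketch, again assuming a seeded random frame in place of the question's data. The names left_edge and summary are hypothetical and introduced only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical reproducible stand-in for the question's random frame.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.random(20), "B": rng.random(20)})

bins = np.arange(0, 1.0, 0.155)  # edges 0, 0.155, ..., 0.93

# np.digitize returns i with bins[i-1] <= x < bins[i];
# values at or above the last edge (0.93) get index len(bins).
ind = np.digitize(df["B"].to_numpy(), bins)

# Translate each 1-based index back to the left edge of its bin.
left_edge = bins[ind - 1]

# Group on the index array and compute statistics for column A.
summary = df.groupby(ind)["A"].agg(["count", "mean"])
print(summary)
```

Unlike the default behaviour of pd.cut, np.digitize with its default right=False treats each bin as closed on the left and open on the right.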
