Pandas groupby: how to compute counts in ranges

Question

Asked by user2366975

Say I have a huge list of numbers between 0 and 100. I compute ranges from the maximum value and a fixed number of bins, say 10. So my ranges are, for example:

ranges = [0,10,20,30,40,50,60,70,80,90,100]

Now I count the occurrences in each range: 0-10, 10-20, and so on. I iterate over every number in the list and check which range it falls into. I assume this is not the best way in terms of runtime speed.

Can I speed it up by using pandas, e.g. pandas.groupby, and if so, how?

Answer 1

Answered by EdChum

We can use pd.cut to bin the values into ranges, then groupby these ranges, and finally call count to count the values now binned into these ranges:

In [82]:

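import numpy as np
import pandas as pd

# np.random.random_integers is deprecated; randint's upper bound is exclusive, so 101 keeps 100 possible
df = pd.DataFrame({"a": np.random.randint(0, 101, size=100)})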
ranges = [0,10,20,30,40,50,60,70,80,90,100]
df.groupby(pd.cut(df.a, ranges)).count()
Out[82]:
            a
a            
(0, 10]    10
(10, 20]    6
(20, 30]   12
(30, 40]    9
(40, 50]   11
(50, 60]   12
(60, 70]    9
(70, 80]   13
(80, 90]    9
(90, 100]   9
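
One detail worth noting about the result above: pd.cut builds right-closed intervals by default, so a value of exactly 0 falls outside (0, 10] and is not counted. A minimal sketch of one way to keep it, assuming you want the lowest edge included; the names rng, values, binned and counts are illustrative and not part of the original answer:

```python
import numpy as np
import pandas as pd

# Illustrative data: 1000 integers in [0, 100] from a seeded generator
rng = np.random.default_rng(0)
values = pd.Series(rng.integers(0, 101, size=1000), name="a")
ranges = list(range(0, 101, 10))

# include_lowest=True closes the first interval on the left, so 0 is binned too
binned = pd.cut(values, bins=ranges, include_lowest=True)

# Count per bin, then order the result by interval instead of by frequency
counts = binned.value_counts().sort_index()
print(counts)
```

Here value_counts is used in place of groupby(...).count(); for a single column both approaches give the per-bin counts.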
