Is there a function to retrieve the histogram counts of a Series in pandas?

Question

Asked by Rafael S. Calsaverini

There is a method to plot Series histograms, but is there a function to retrieve the histogram counts so I can do further calculations on them?

I keep using NumPy's functions for this and converting the result to a DataFrame or Series when I need it. It would be nice to stay with pandas objects the whole time.

Answer 1

Accepted answer by Andy Hayden

If your Series is discrete, you can use value_counts:

In [11]: s = pd.Series([1, 1, 2, 1, 2, 2, 3])

In [12]: s.value_counts()
Out[12]:
2    3
1    3
3    1
dtype: int64

You can see that s.hist() is essentially equivalent to s.value_counts().plot().
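As a small sketch of the discrete case: value_counts orders its result by count, so if you want the counts in value order, the way a histogram would show them, you can sort by the index afterwards. The variable names here are only illustrative.

```python
import pandas as pd

s = pd.Series([1, 1, 2, 1, 2, 2, 3])

# value_counts() sorts by frequency; sort_index() reorders by the values themselves
counts = s.value_counts().sort_index()

# normalize=True returns relative frequencies instead of raw counts
freqs = s.value_counts(normalize=True).sort_index()
```

Both results are ordinary Series, so they can be combined with other pandas objects directly.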

If the Series holds floats, an awfully hacky solution is to use groupby:

s.groupby(lambda i: np.floor(2*s[i]) / 2).count()
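The lambda above is called once per index label and looks each value up with s[i], so it assumes the labels are unique. A vectorised sketch of the same idea, assuming the same 0.5-wide bins and some made-up sample values, computes the left edge of each bin first and groups on that:

```python
import numpy as np
import pandas as pd

s = pd.Series([0.1, 0.4, 0.6, 0.9, 1.2, 1.3, 1.7])

# Label every value with the left edge of its 0.5-wide bin
left_edges = np.floor(2 * s) / 2

# Group on those labels and count the members of each bin
counts = s.groupby(left_edges).count()
```

Bins that contain no values simply do not appear in the result, which is one reason this remains a hack rather than a real histogram.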

Answer 2

Answered by Dan Allan

Since hist and value_counts don't use the Series' index, you may as well treat the Series like an ordinary array and use np.histogram directly. Then build a Series from the result.

In [2]: import numpy as np

In [3]: import pandas as pd

In [4]: s = pd.Series(np.random.randn(100))

In [5]: counts, bins = np.histogram(s)

In [6]: pd.Series(counts, index=bins[:-1])
Out[6]: 
-2.968575     1
-2.355032     4
-1.741488     5
-1.127944    26
-0.514401    23
 0.099143    23
 0.712686    12
 1.326230     5
 1.939773     0
 2.553317     1
dtype: int32

This is a really convenient way to organize the result of a histogram for subsequent computation.

To index by the center of each bin instead of the left edge, you could use bins[:-1] + np.diff(bins)/2.
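If you do this often, the two indexing choices can be wrapped in a small helper. This is only a sketch: hist_series and its centers flag are hypothetical names, not part of pandas or NumPy, and dropping NaN values first is my assumption about what you would want.

```python
import numpy as np
import pandas as pd

def hist_series(s, bins=10, centers=False):
    """Return histogram counts of `s` as a Series indexed by bin position."""
    # np.histogram cannot infer a range from NaN, so drop missing values first
    counts, edges = np.histogram(s.dropna().to_numpy(), bins=bins)
    if centers:
        index = edges[:-1] + np.diff(edges) / 2  # midpoint of each bin
    else:
        index = edges[:-1]                       # left edge of each bin
    return pd.Series(counts, index=index)

s = pd.Series([0.0, 0.5, 1.0, 1.5, 2.0])
by_center = hist_series(s, bins=2, centers=True)
```

With two bins over the range 0 to 2, the edges are 0, 1 and 2, so the result is indexed by 0.5 and 1.5.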

Answer 3

Answered by IanS

If you know the number of bins you want, you can use pandas' cut function, which is now accessible via value_counts. Using the same random example:

s = pd.Series(np.random.randn(100))
s.value_counts(bins=5)

Out[55]: 
(-0.512, 0.311]     40
(0.311, 1.133]      25
(-1.335, -0.512]    14
(1.133, 1.956]      13
(-2.161, -1.335]     8
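Note that the output above is ordered by count, not by bin. A minimal sketch of getting the bins back in order, assuming a fixed seed purely so the example is repeatable:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the example is repeatable
s = pd.Series(rng.standard_normal(100))

# value_counts(bins=...) orders bins by count; sort_index() restores bin order
counts = s.value_counts(bins=5).sort_index()

# Each index label is a pandas Interval, so the bin edges stay available
left_edges = [interval.left for interval in counts.index]
```

The result is still a Series, indexed by intervals, so it can feed further pandas computations without going through NumPy.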
