Pandas: group by rounded floating-point numbers

Question

Asked by Tom Bennett

I have a dataframe with a column of floating numbers. For example:

df = pd.DataFrame({'A' : np.random.randn(100), 'B': np.random.randn(100)})

What I want to do is to group by column A after rounding column A to 2 decimal places.

The way I do it is highly inefficient:

df.groupby(df.A.map(lambda x: "%.2f" % x))

I particularly don't want to convert everything to a string, as speed becomes a huge problem. But I don't feel it is safe to do the following:

df.groupby(np.around(df.A, 2))

I am not sure, but I suspect there are cases where two float64 numbers have the same string representation after rounding to 2 decimal places, yet come out slightly different after np.around(., 2). For example, could a value whose string representation is 1.52 come out of np.around(., 2) as 1.52000001 in some cases and 1.51999999 in others?

My question is: what is a better and more efficient way to do this?

Answer 1

Accepted answer by xmduhan

I don't think you need to convert the floats to strings.

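import pandas as pd
from random import random

# Build the columns as lists; in Python 3, map() returns a lazy iterator.
n = 100000
df = pd.DataFrame({'A': [random() for _ in range(n)], 'B': [random() for _ in range(n)]})
# Group on the rounded float directly (1 decimal here; use round(x, 2) for the question's case).
df.groupby(df['A'].apply(lambda x: round(x, 1))).count()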
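As a sketch of how the string conversion and the per-row lambda might both be avoided, the example below groups on Series.round(2) and, alternatively, on integer bucket keys obtained by scaling by 100. The sample values and the names bucket, by_float, and by_int are made up for illustration. As far as I know, NumPy rounds to n decimals by scaling, rounding to the nearest integer, and scaling back, so values that land in the same bucket should produce the same float64 key. The integer keys sidestep the question entirely, because integer comparison is exact.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three values near 1.52, two near 0.30, one near -0.70.
df = pd.DataFrame({'A': [1.518, 1.521, 1.5249, 0.304, 0.296, -0.702],
                   'B': [10, 20, 30, 40, 50, 60]})

# Option 1: vectorized rounding; the group keys stay float64.
by_float = df.groupby(df['A'].round(2))['B'].sum()

# Option 2: integer bucket keys (hundredths), so equal buckets are exactly equal.
bucket = np.rint(df['A'] * 100).astype('int64')
by_int = df.groupby(bucket)['B'].sum()

print(by_float)
print(by_int.to_dict())  # {-70: 60, 30: 90, 152: 60}
```

Dividing the integer key by 100 recovers a readable label if one is needed for display. One difference from the string approach is worth knowing: "%.2f" % -0.001 gives '-0.00', so string keys separate values just below zero from those just above, while the integer key puts both in bucket 0.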
