Pandas: Group by rounded floating number
Source: http://stackoverflow.com/questions/34683963/
Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors on Stack Overflow (not to the maintainer of this page).
Asked by Tom Bennett
I have a dataframe with a column of floating numbers. For example:
df = pd.DataFrame({'A' : np.random.randn(100), 'B': np.random.randn(100)})
What I want to do is to group by column A after rounding column A to 2 decimal places.
The way I do it is highly inefficient:
df.groupby(df.A.map(lambda x: "%.2f" % x))
I particularly don't want to convert everything to a string, as speed becomes a huge problem. But I don't feel it is safe to do the following:
df.groupby(np.around(df.A, 2))
I am not sure, but I suspect there may be cases where two float64 numbers have the same string representation after rounding to 2 decimal places, yet end up with slightly different values after np.around(., 2). For example, could a value whose string representation is 1.52 come out of np.around(., 2) as 1.52000001 in some cases and 1.51999999 in others?
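One way to probe this concern is to check directly whether inputs that share a two-decimal string also share a single float64 value after np.around. The following is a minimal sketch, not a proof: the sample values are made up for illustration, and it assumes the usual NumPy behaviour of scaling, rounding to the nearest integer, and dividing back.

```python
import numpy as np

# Hypothetical sample values that should all land in the 1.52 bucket.
values = np.array([1.5151, 1.5200001, 1.5249])

rounded = np.around(values, 2)

# If the rounded values are bit-identical, np.unique collapses them to one key.
unique_keys = np.unique(rounded)

# The string-based key from the question, for comparison.
string_keys = {"%.2f" % v for v in values}

print(unique_keys)   # distinct float keys after rounding
print(string_keys)   # distinct string keys after formatting
```

For these inputs both approaches produce a single key. NumPy's documentation does note that its rounding algorithm is fast but can be inexact for some inputs, so a value sitting almost exactly on a bucket boundary may be assigned to a neighbouring bucket compared with string formatting; the result for any given input is still deterministic.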
My question is: what is a better and more efficient way to do this?
Accepted answer by xmduhan
I don't think you need to convert the floats to strings.
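import pandas as pd
from random import random

# Build 100,000 random floats per column as plain lists
df = pd.DataFrame({'A': [random() for _ in range(100000)], 'B': [random() for _ in range(100000)]})
# Group by column A rounded to 1 decimal place; the keys stay floats
df.groupby(df['A'].apply(lambda x: round(x, 1))).count()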
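The per-element apply in the answer can probably be replaced by vectorized rounding. The sketch below uses Series.round for that, and also shows an integer-bucket key (hundredths stored as int64). The integer variant is an assumption added here, not part of the original answer; it avoids float keys entirely. The small frame and its values are hypothetical, chosen so the buckets are easy to verify by eye.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with easily checked buckets.
df = pd.DataFrame({
    "A": [1.5151, 1.5249, 1.5200001, 0.101, 0.099, 2.3449],
    "B": [1, 2, 3, 4, 5, 6],
})

# Vectorized rounding: no Python-level lambda, keys stay float64.
float_key = df["A"].round(2)
by_float = df.groupby(float_key)["B"].sum()

# Alternative (an assumption, not from the answer): integer bucket ids,
# i.e. the value in hundredths, which sidesteps float equality altogether.
int_key = np.rint(df["A"] * 100).astype("int64")
by_int = df.groupby(int_key)["B"].sum()

print(by_float)
print(by_int)
```

Both keys yield the same three groups here. The integer key may be preferable when the grouped result is later joined or compared against other data, since integer equality is unambiguous; dividing the index by 100 recovers the rounded value for display.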