pandas: How to create a pandas Series of Decimal?

Question

Asked by cjm2671

I'm calculating some standard deviations which are giving FloatingPointErrors. I wanted to try converting the data series to Decimal (using https://docs.python.org/3/library/decimal.html), to see if this fixes my issue.

I can't seem to make a pandas Series of Decimal values.

How can I take a normal pd.Series of float64 and convert it to a pd.Series of Decimal, such that I can do:

Series.pct_change().ewm(span=35, min_periods=35).std()

Answer 1

Answered by SerialDev

Would something like this work?

from functools import partial
from pandas import Series

def column_round(decimals):
    return partial(Series.round, decimals=decimals)

df.apply(column_round(2))
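
As a minimal sketch of how the snippet above behaves on concrete data (the frame `df` and its column names are made up for illustration), note that Series.round only rounds the stored floats; the columns stay float64 and do not become Decimal:

```python
from functools import partial

import pandas as pd


def column_round(decimals):
    # Returns a function that rounds a whole Series to `decimals` places.
    return partial(pd.Series.round, decimals=decimals)


# Hypothetical example data, not taken from the question.
df = pd.DataFrame({"a": [1.23456, 2.34567], "b": [3.14159, 2.71828]})

rounded = df.apply(column_round(2))  # apply() passes each column as a Series
print(rounded)
print(rounded.dtypes)  # the columns are still floating point
```

So this approach changes the displayed precision but not the underlying binary floating-point representation.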

Alternatively, use np.vectorize so that decimal.Decimal.quantize can do the rounding; this leaves the values as Decimal instead of np.float64.

npquantize = np.vectorize(decimal.Decimal.quantize)
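
A possible way to use the vectorized quantize, sketched under the assumption that the Series already holds Decimal objects (the sample values, the `0.01` exponent, and the explicit `otypes=[object]` argument are my additions, not part of the answer):

```python
import decimal

import numpy as np
import pandas as pd

# otypes=[object] keeps the results as Python Decimal objects.
npquantize = np.vectorize(decimal.Decimal.quantize, otypes=[object])

# Hypothetical sample data.
s = pd.Series([decimal.Decimal("1.2345"), decimal.Decimal("2.3456")], dtype=object)

# Quantize every element to two decimal places (default rounding mode).
quantized = pd.Series(npquantize(s.to_numpy(), decimal.Decimal("0.01")), index=s.index)
print(quantized)
print(type(quantized.iloc[0]))
```

np.vectorize is essentially a Python-level loop, so this is a convenience rather than a speed-up.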

I have been looking into it, and this seems to solve the issue with pct_change:

ts.diff().div(ts.shift(1))

Answer 2

Answered by ChesuCR

I think you can create the DataFrame directly with Decimal values and operate on them:

import pandas as pd
from decimal import Decimal

df = pd.DataFrame({
    'DECIMAL_1': [Decimal('2342.2345234'), Decimal('564.5678'), Decimal('76867.8923892')],
    'DECIMAL_2': [Decimal('67867.43534534323'), Decimal('67876.345345'), Decimal('234234.2345345')]
})
# Element-wise addition is carried out with Decimal arithmetic
df['DECIMAL_3'] = df['DECIMAL_1'] + df['DECIMAL_2']
df.dtypes  # every column is reported as object

The drawback is that the columns' dtype will be object, so I am afraid performance will suffer. In any case, I think any operation on Decimal values will require more computation than the same operation on floats.

Maybe the best solution is to keep two copies of the DataFrame: one with floats and one with Decimal. If you need fast operations, use the float DataFrame; if you need to compare values or assign new values to some cells with a specific precision, use the Decimal DataFrame.
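
One way the two-copy idea might look in practice is sketched below. The column name `price` and the values are hypothetical, and deriving the float copy with astype is my assumption about how to keep the two frames in sync:

```python
from decimal import Decimal

import pandas as pd

# Exact copy: object-dtype column holding Decimal values.
decimal_df = pd.DataFrame({"price": [Decimal("19.99"), Decimal("5.01")]})

# Fast copy: the same data converted to float64 for vectorized numeric work.
float_df = decimal_df.astype("float64")

# Fast, approximate statistics on the float copy.
mean_price = float_df["price"].mean()

# Exact arithmetic on the Decimal copy.
exact_total = decimal_df["price"].iloc[0] + decimal_df["price"].iloc[1]

print(float_df.dtypes)
print(mean_price, exact_total)
```

The cost of this design is that any update has to be written to both frames, or the float copy has to be regenerated from the Decimal one.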

Tell me what you think about my suggestions.

Note: I made my example with a DataFrame, but a DataFrame is built from Series, so the same applies to a single Series.
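
For the Series case asked about in the question, a minimal sketch of converting an existing float64 Series could look like this. Going through str() before Decimal() is my assumption about the desired behavior: it uses the short printed form of each float rather than its full binary expansion.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical float64 input.
floats = pd.Series([0.1, 0.2, 0.3])

# Convert each float to Decimal via its string representation.
decimals = floats.map(lambda x: Decimal(str(x)))

print(decimals.dtype)  # object, since pandas has no native Decimal dtype
print(sum(decimals))   # summed with Decimal arithmetic
```

Calling Decimal(x) directly on a float is also valid, but it preserves the exact binary value of the float, which is usually not what is wanted here.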
