pandas: How does pandas calculate skew?

Question

Asked by piRSquared

I'm calculating a coskew matrix and wanted to double-check my calculation against pandas' built-in skew method. I could not reconcile my result with how pandas performs the calculation.

I define my series as:

import pandas as pd

series = pd.Series(
    {0: -0.051917457635120283,
     1: -0.070071606515280632,
     2: -0.11204865874074735,
     3: -0.14679988245503134,
     4: -0.088062467095565145,
     5: 0.17579741198527793,
     6: -0.10765856028420773,
     7: -0.11971470229167547,
     8: -0.15169210769159247,
     9: -0.038616800990881606,
     10: 0.16988162977411481,
     11: 0.092999418364443032}
)

I compared the following calculations and expected them to be the same.

pandas

series.skew()

1.1119637586658944

me

(((series - series.mean()) / series.std(ddof=0)) ** 3).mean()

0.967840223081231

me - take 2

This is significantly different. I thought pandas might be using the adjusted Fisher-Pearson coefficient, so I tried:

n = len(series)
skew = series.sub(series.mean()).div(series.std(ddof=0)).apply(lambda x: x ** 3).mean()
skew * (n * (n - 1)) ** 0.5 / (n - 1)

1.0108761442417222

Still off by quite a bit.

Question

How does pandas calculate skew?

Answer 1

Accepted answer by jezrael

I found that scipy.stats.skew with the parameter bias=False returns the same output, so I think pandas' skew behaves like bias=False by default:

bias : bool
If False, then the calculations are corrected for statistical bias.

import pandas as pd
from scipy import stats  # the scipy.stats.stats submodule path is deprecated in newer SciPy

series = pd.Series(
    {0: -0.051917457635120283,
     1: -0.070071606515280632,
     2: -0.11204865874074735,
     3: -0.14679988245503134,
     4: -0.088062467095565145,
     5: 0.17579741198527793,
     6: -0.10765856028420773,
     7: -0.11971470229167547,
     8: -0.15169210769159247,
     9: -0.038616800990881606,
     10: 0.16988162977411481,
     11: 0.092999418364443032}
)

print(series.skew())
1.11196375867

print(stats.skew(series, bias=False))
1.1119637586658944
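
As a rough check of what the bias switch changes, the sketch below compares both settings of scipy.stats.skew against the hand-rolled formula from the question and against pandas. The variable names (biased, unbiased, manual) are made up for illustration, and it assumes a SciPy version where `from scipy import stats` exposes skew.

```python
import pandas as pd
from scipy import stats

# Same data as in the question.
series = pd.Series(
    {0: -0.051917457635120283,
     1: -0.070071606515280632,
     2: -0.11204865874074735,
     3: -0.14679988245503134,
     4: -0.088062467095565145,
     5: 0.17579741198527793,
     6: -0.10765856028420773,
     7: -0.11971470229167547,
     8: -0.15169210769159247,
     9: -0.038616800990881606,
     10: 0.16988162977411481,
     11: 0.092999418364443032}
)

values = series.to_numpy()

# bias=True is SciPy's default: the plain (biased) sample skewness.
biased = float(stats.skew(values, bias=True))
# bias=False applies the small-sample correction.
unbiased = float(stats.skew(values, bias=False))

# The question's first attempt: mean of cubed z-scores with ddof=0.
manual = (((series - series.mean()) / series.std(ddof=0)) ** 3).mean()

print("biased   :", biased)
print("manual   :", manual)
print("unbiased :", unbiased)
print("pandas   :", series.skew())
```

If the reading above is right, the first two printed values should agree with each other and the last two should agree with each other.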

I'm not 100% sure, but I think I found it in the code.

EDIT (piRSquared)

From the scipy skew code:

if not bias:
    can_correct = (n > 2) & (m2 > 0)
    if can_correct.any():
        m2 = np.extract(can_correct, m2)
        m3 = np.extract(can_correct, m3)
        nval = ma.sqrt((n-1.0)*n)/(n-2.0)*m3/m2**1.5
        np.place(vals, can_correct, nval)
return vals

So the adjustment is (n * (n - 1)) ** 0.5 / (n - 2), not (n * (n - 1)) ** 0.5 / (n - 1) as in my second attempt.
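
To make the corrected adjustment concrete, here is a minimal sketch that computes the biased skewness from the second and third central moments and then applies the (n - 2) factor. The names m2, m3, g1, and adjusted are chosen for illustration (m2 and m3 mirror the scipy excerpt); this is a by-hand reproduction, not pandas' internal code.

```python
import math

import pandas as pd

# Same data as in the question.
series = pd.Series(
    {0: -0.051917457635120283,
     1: -0.070071606515280632,
     2: -0.11204865874074735,
     3: -0.14679988245503134,
     4: -0.088062467095565145,
     5: 0.17579741198527793,
     6: -0.10765856028420773,
     7: -0.11971470229167547,
     8: -0.15169210769159247,
     9: -0.038616800990881606,
     10: 0.16988162977411481,
     11: 0.092999418364443032}
)

n = len(series)
centered = series - series.mean()

# Biased central moments (divide by n, i.e. ddof=0).
m2 = (centered ** 2).mean()
m3 = (centered ** 3).mean()

# Biased sample skewness, the question's first attempt.
g1 = m3 / m2 ** 1.5

# Bias-corrected skewness: note the (n - 2) in the denominator.
adjusted = g1 * math.sqrt(n * (n - 1)) / (n - 2)

print("g1       :", g1)
print("adjusted :", adjusted)
print("pandas   :", series.skew())
```

With n = 12 the correction factor is sqrt(132) / 10, roughly 1.149, which accounts for the gap between 0.9678 and the 1.1120 that pandas reports.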
