Python: how is SciPy's normaltest used?

Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/12838993/

Date: 2020-08-18 11:58:11  Source: igfitidea

How is SciPy's normaltest used?

python scipy

Asked by The Demz

I need to use normaltest in scipy to test whether a dataset is normally distributed, but I can't seem to find any good examples of how to use scipy.stats.normaltest.

My dataset has more than 100 values.

Accepted answer by unutbu

In [12]: import scipy.stats as stats

In [13]: x = stats.norm.rvs(size = 100)

In [14]: stats.normaltest(x)
Out[14]: (1.627533590094232, 0.44318552909231262)

normaltest returns a 2-tuple of the chi-squared statistic and the associated p-value. Given the null hypothesis that x came from a normal distribution, the p-value represents the probability that a chi-squared statistic that large (or larger) would be seen.

If the p-value is very small, it is unlikely that the data came from a normal distribution. For example:

In [15]: y = stats.uniform.rvs(size = 100)

In [16]: stats.normaltest(y)
Out[16]: (31.487039026711866, 1.4543748291516241e-07)
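The decision rule described above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name looks_normal, the seed, the sample size of 500, and the default threshold of 0.05 are choices made for this example and are not part of the original answer.

```python
import numpy as np
from scipy import stats

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
normal_sample = rng.normal(size=500)
uniform_sample = rng.uniform(size=500)


def looks_normal(sample, alpha=0.05):
    """Hypothetical helper: True if normaltest does NOT reject normality.

    alpha is the significance level; 0.05 is a common convention,
    not something mandated by scipy.
    """
    statistic, pvalue = stats.normaltest(sample)
    # A small p-value means the data are unlikely under the null
    # hypothesis of normality, so normality is rejected.
    return bool(pvalue >= alpha)


print(looks_normal(normal_sample))
print(looks_normal(uniform_sample))
```

Note that failing to reject the null hypothesis does not prove the data are normal; it only means the test found no strong evidence against normality.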

Answer by The Demz

First, I found out that scipy.stats.normaltest and scipy.stats.mstats.normaltest are almost the same. The mstats library is used for masked arrays: arrays in which you can mark values as invalid so that they are left out of the calculation.

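import numpy as np
import numpy.ma as ma
from scipy.stats import mstats

# Illustrative data only; the test needs a reasonably large sample (more than 20 values).
x = np.array([1, 2, 3, -1, 5, 7, 3, 4, 6, 2, 5, 3, 4, 8, 1, 6, 4, 5, 3, 2, 7, 4])
mx = ma.masked_array(x, mask=(x == -1))  # mask the invalid value -1
z, pval = mstats.normaltest(mx)

# Reject the null hypothesis of normality at the conventional 0.05 level.
if pval < 0.05:
    print("Not normal distribution")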

"Traditionally, in statistics, you need a p-value of less than 0.05 to reject the null hypothesis." - http://mathforum.org/library/drmath/view/72065.html

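To make the masked-array idea more concrete, here is a minimal sketch comparing mstats.normaltest on a masked array with scipy.stats.normaltest on the same data after the masked entries are dropped. The sentinel value -999.0, the seed, and the sample parameters are assumptions made up for this example. The two results are expected to agree closely, since both tests should see the same valid values.

```python
import numpy as np
import numpy.ma as ma
from scipy import stats
from scipy.stats import mstats

rng = np.random.default_rng(1)  # arbitrary seed for reproducibility
data = rng.normal(loc=10.0, scale=2.0, size=200)
data[[5, 50, 150]] = -999.0  # hypothetical sentinel marking invalid readings

# Mask the sentinel entries so they are excluded from the test.
masked = ma.masked_values(data, -999.0)

m_stat, m_p = mstats.normaltest(masked)            # masked-array version
c_stat, c_p = stats.normaltest(masked.compressed())  # valid values only
u_stat, u_p = stats.normaltest(data)               # sentinels left in

print("masked:    ", float(m_p))
print("compressed:", float(c_p))
print("unmasked:  ", float(u_p))
```

Leaving the sentinel values in the data, as in the last call, distorts the result badly, which is the situation masking is meant to avoid.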