How to perform a two-sample one-tailed t-test with numpy/scipy in Python

Question

Asked by Timo

In R, it is possible to perform a two-sample one-tailed t-test simply by using

> A = c(0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.48197798, 0.07532846)
> B = c(0.6383447, 0.5271385, 1.7721380, 1.7817880)
> t.test(A, B, alternative="greater")

    Welch Two Sample t-test

data:  A and B 
t = -0.4189, df = 6.409, p-value = 0.6555
alternative hypothesis: true difference in means is greater than 0 
95 percent confidence interval:
 -1.029916       Inf 
sample estimates:
mean of x mean of y 
0.9954942 1.1798523

In the Python world, scipy provides a similar function, ttest_ind, but it can only do two-tailed t-tests. The closest information on the topic I found is this link, but it seems to be more of a discussion of the policy of implementing one-tailed vs. two-tailed tests in scipy.

Therefore, my question is: does anyone know of any examples or instructions on how to perform a one-tailed version of the test using numpy/scipy?

Answer 1

Accepted answer by lvc

From your mailing list link:

because the one-sided tests can be backed out from the two-sided tests. (With symmetric distributions one-sided p-value is just half of the two-sided pvalue)

It goes on to say that scipy always gives the test statistic as signed. This means that given p and t values from a two-tailed test, you would reject the null hypothesis of a greater-than test when p/2 < alpha and t > 0, and of a less-than test when p/2 < alpha and t < 0.
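The rule above can be sketched as a small helper that turns the two-tailed output of ttest_ind into a one-tailed p-value. The function name one_tailed_p and its alternative argument are illustrative assumptions, not part of scipy. Passing equal_var=False requests Welch's test, which is what R's t.test uses by default, so the result should agree with the R output in the question (t = -0.4189, p-value = 0.6555).

```python
import numpy as np
from scipy import stats

A = np.array([0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.48197798, 0.07532846])
B = np.array([0.6383447, 0.5271385, 1.7721380, 1.7817880])


def one_tailed_p(t_stat, p_two_sided, alternative="greater"):
    """Hypothetical helper: convert a two-sided p-value to a one-sided one.

    Relies on the t distribution being symmetric and on the sign of t.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = p_two_sided / 2.0
    # The sign of t tells us whether the data lean toward the alternative.
    leans_toward_alt = t_stat > 0 if alternative == "greater" else t_stat < 0
    return half if leans_toward_alt else 1.0 - half


# equal_var=False selects Welch's t-test (unequal variances).
t_stat, p_two = stats.ttest_ind(A, B, equal_var=False)
p_greater = one_tailed_p(t_stat, p_two, "greater")
print(f"t = {t_stat:.4f}, one-tailed p (A > B) = {p_greater:.4f}")
```

Rejecting when this one-tailed p-value is below alpha is equivalent to the "p/2 < alpha and t > 0" condition, because a statistic with the wrong sign always yields a p-value of at least 0.5.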

Answer 2

Answer by evapanda

When the null hypothesis is Ho: P1>=P2 and the alternative hypothesis is Ha: P1<P2, you test it in Python by writing ttest_ind(P2,P1). (Notice that P2 comes first.)

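import numpy as np
from scipy import stats
# 400 draws per sample, with means 3 and 6 and standard deviation 2
first = np.random.normal(3,2,400)
second = np.random.normal(6,2,400)
stats.ttest_ind(first, second, axis=0, equal_var=True)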

You will get a result like the following:

Ttest_indResult(statistic=-20.442436213923845, pvalue=5.0999336686332285e-75)

In Python, when statistic < 0, your real p-value is actually real_pvalue = 1 - output_pvalue/2 = 1 - 5.0999336686332285e-75/2, which is approximately 1. As this p-value is larger than 0.05, you cannot reject the null hypothesis that 6>=3. When statistic > 0, the real test statistic (for P1 - P2) is equal to -statistic, and the real p-value is equal to pvalue/2.
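The p/2 and 1 - p/2 conversions can be checked against the t distribution directly. The sketch below is only an illustration: the seed is arbitrary, the variable names are assumptions, and df = n1 + n2 - 2 applies to the pooled-variance test (equal_var=True) used here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
first = rng.normal(3, 2, 400)
second = rng.normal(6, 2, 400)

t_stat, p_two = stats.ttest_ind(first, second, equal_var=True)
df = len(first) + len(second) - 2  # pooled-variance degrees of freedom

# One-sided p-values taken directly from the t distribution.
p_first_greater = stats.t.sf(t_stat, df)  # P(T >= t): alternative "first > second"
p_first_less = stats.t.cdf(t_stat, df)    # P(T <= t): alternative "first < second"

print("t < 0:", t_stat < 0)
print("greater matches 1 - p/2:", np.isclose(p_first_greater, 1 - p_two / 2))
print("less matches p/2:", np.isclose(p_first_less, p_two / 2, rtol=1e-6, atol=0.0))
```

Since the statistic is negative here, the alternative "first > second" gets 1 - p/2 and the alternative "first < second" gets p/2.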

lvc's answer should say that you can reject the less-than hypothesis when (1-p/2) < alpha and t < 0.

Answer 3

Answer by bpirvu

After trying to add some insights as comments to the accepted answer but not being able to properly write them down due to general restrictions upon comments, I decided to put my two cents in as a full answer.

First let's formulate our investigative question properly. The data we are investigating is

import numpy as np
A = np.array([0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.48197798, 0.07532846])
B = np.array([0.6383447, 0.5271385, 1.7721380, 1.7817880])

with the sample means

A.mean() = 0.99549419
B.mean() = 1.1798523

I assume that since the mean of B is obviously greater than the mean of A, you would like to check if this result is statistically significant.

So we have the Null Hypothesis

H0: A >= B

that we would like to reject in favor of the Alternative Hypothesis

H1: B > A

Now when you call scipy.stats.ttest_ind(x, y), this performs a hypothesis test on the value of x.mean()-y.mean(), which means that in order to get positive values throughout the calculation (which simplifies all considerations) we have to call

stats.ttest_ind(B,A)

instead of stats.ttest_ind(A,B). We get as an answer

t-value = 0.42210654140239207
p-value = 0.68406235191764142

and since, according to the documentation, this is the output for a two-tailed t-test, we must divide p by 2 for our one-tailed test. So depending on the significance level alpha you have chosen, you need

p/2 < alpha

in order to reject the null hypothesis H0. For alpha=0.05 this is clearly not the case, so you cannot reject H0.
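As a cross-check on the halving step, newer SciPy releases can compute the one-tailed p-value directly. The sketch below assumes SciPy 1.6 or later, where ttest_ind accepts an alternative keyword; on older versions that argument is not available and the manual p/2 rule is the way to go. The variable names are illustrative.

```python
import numpy as np
from scipy import stats

A = np.array([0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.48197798, 0.07532846])
B = np.array([0.6383447, 0.5271385, 1.7721380, 1.7817880])

# Assumption: SciPy >= 1.6, which added the `alternative` keyword.
# alternative="greater" tests H1: mean(B) > mean(A).
res = stats.ttest_ind(B, A, alternative="greater")

# Default two-sided call, for comparison with the halving rule.
t_two, p_two = stats.ttest_ind(B, A)

alpha = 0.05
print(f"one-tailed p = {res.pvalue:.4f}, p/2 = {p_two / 2:.4f}")
print("reject H0" if res.pvalue < alpha else "cannot reject H0")
```

Both numbers should agree (about 0.342 for this data), which is well above alpha=0.05.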

An alternative way to decide whether to reject H0, without having to do any algebra on t or p, is to look at the t-value and compare it with the critical t-value t_crit at the desired level of confidence (e.g. 95%) for the number of degrees of freedom df that applies to your problem. Since we have

df = sample_size_1 + sample_size_2 - 2 = 8

we get from a statistical table like this one that

t_crit(df=8, confidence_level=95%) = 1.860

We clearly have

t < t_crit

so we obtain again the same result, namely that we cannot reject H0.
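Instead of reading t_crit from a printed table, it can be computed from the inverse CDF of the t distribution. This is a minimal sketch using scipy.stats.t.ppf; the variable names are illustrative, and the t-value is copied from the stats.ttest_ind(B, A) result quoted above.

```python
from scipy import stats

df = 6 + 4 - 2                   # sample_size_1 + sample_size_2 - 2
t_crit = stats.t.ppf(0.95, df)   # one-tailed critical value at 95% confidence
t_value = 0.42210654140239207    # statistic reported for stats.ttest_ind(B, A)

print(f"t_crit = {t_crit:.3f}")
print("reject H0" if t_value > t_crit else "cannot reject H0")
```

The computed critical value rounds to 1.860, in agreement with the table.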

Answer 4

Answer by Jorge

Did you look at this: How to calculate the statistics "t-test" with numpy

I think that is exactly what this question is looking at.

Basically:

import scipy.stats
x = [1,2,3,4]
scipy.stats.ttest_1samp(x, 0)

Ttest_1sampResult(statistic=3.872983346207417, pvalue=0.030466291662170977)

is the same result as this example in R. https://stats.stackexchange.com/questions/51242/statistical-difference-from-zero
