Hypothesis Testing

[Figure: logic map of this article]

A hypothesis test is a procedure that allows us to "confidently" reject a hypothesis if it is clearly statistically inconsistent with data.

1. Basic Concepts

1.1 The Four Elements and Basic Procedure of a Hypothesis Test

The four elements:

  • Null hypothesis $H_0$
  • Alternative hypothesis $H_1$ (also written $H_a$)
  • Test statistic
  • Rejection region

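Basic procedure:

  • State the null and alternative hypotheses according to the needs of the experiment.
  • Collect the experimental data and compute the test statistic.
  • If the statistic falls in the rejection region, reject the null hypothesis; otherwise, fail to reject it.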

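Note 1: On $H_0$ and $H_1$

  • $H_0$ and $H_1$ need not be complementary, and they are not treated symmetrically. $H_1$ may consist of any statements about the population distribution under which $H_0$ fails.
  • In practice, the hypothesis one hopes to reject is usually taken as $H_0$, and the hypothesis one hopes to support is taken as $H_1$.

This is because hypothesis tests are designed to avoid rejecting $H_0$ when it is true. Therefore, when the test rejects $H_0$, one can be quite sure that $H_0$ is false. This is connected to the two types of error in hypothesis testing discussed below.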

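Note 2: On the test statistic

  • Different test statistics are used depending on the target of the test (mean, variance, ...).
  • Their theoretical basis comes from the central limit theorem, properties of the normal distribution, the likelihood ratio test, Pearson's $\chi^2$ test, and so on; see the sections below.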

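Note 3: On the rejection region

  • The rejection region is a set of values of the test statistic (an interval or a union of intervals), computed under the assumption that $H_0$ holds, from the pre-specified significance level $\alpha$ and the form of $H_1$.
  • Under $H_0$ it represents a low-probability event: the statistic falls in it with probability at most $\alpha$.
  • If this low-probability event nevertheless occurs, the data are hard to reconcile with $H_0$, so we reject the null hypothesis.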

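1.2 Two Types of Error

  • Type I Error: rejecting $H_0$ when it is true
    Avoiding this error is the first priority.
    $\alpha$ denotes the probability of committing it.
    $\alpha$ is also called the significance level.

  • Type II Error: not rejecting $H_0$ when it is false
    $\beta$ denotes the probability of committing it.
    $1 - \beta$ is called the power of the test.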

Relationship between Type I and Type II errors:
We can always reduce the Type I error by making the rejection region smaller. This typically comes at the expense of a larger Type II error.
In practice, we want tests that are as powerful as possible for a given Type I error.
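The trade-off can be made concrete with a small calculation. The sketch below assumes a one-sided z-test of $H_0: \mu = \mu_0$ against $H_1: \mu > \mu_0$ with known $\sigma$, evaluated at one specific alternative $\mu_1$; the function name `type_ii_error` and all numbers are illustrative assumptions, not part of the original notes.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Beta of the one-sided z-test (H1: mu > mu0) at the alternative mu1.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)    # upper-alpha critical value
    shift = (mu1 - mu0) / (sigma / n ** 0.5)   # mean of Z when mu = mu1
    # H0 is not rejected when Z <= z_alpha; under mu1, Z ~ N(shift, 1).
    return std_normal.cdf(z_alpha - shift)


# Shrinking alpha shrinks the rejection region and raises beta.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=0.0, mu1=0.5, sigma=1.0, n=25)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

With these illustrative numbers, each reduction in $\alpha$ produces a larger $\beta$, which is the trade-off described above.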

1.3 P-values

The P-value is the smallest $\alpha$ for which the observed data (once you have done the random experiment) suggest rejection of $H_0$.

A smaller P-value is stronger evidence against the null hypothesis; $H_0$ is rejected at level $\alpha$ when the P-value is at most $\alpha$.
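As an illustration, the following sketch computes a P-value under the assumption that the test statistic is standard normal under $H_0$ (as in a z-test). The helper name `z_p_value` and the `alternative` labels are hypothetical choices made for this example.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """P-value of an observed z statistic, assuming Z ~ N(0, 1) under H0."""
    std_normal = NormalDist()
    if alternative == "greater":      # H1: parameter > null value
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # H1: parameter < null value
        return std_normal.cdf(z)
    if alternative == "two-sided":    # H1: parameter != null value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


alpha = 0.05
p = z_p_value(2.3, alternative="two-sided")
print(f"p-value = {p:.4f}, reject H0 at alpha={alpha}: {p <= alpha}")
```

Comparing the P-value with $\alpha$ gives the same decision as checking whether the statistic falls in the rejection region.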

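2. Common Hypothesis Tests and Their Principles

2.1 Central Limit Theorem

Let $X_1, \dots, X_n$ be independently and identically distributed, with $E[X_i] = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$ known. Then, as $n \to \infty$,

    $\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1)$, where $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$.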
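A quick simulation can make the theorem tangible. The sketch below draws samples from an Exponential(1) distribution (an arbitrary non-normal choice made for this illustration, with mean 1 and standard deviation 1) and checks how often the standardized sample mean lands in $[-1.96, 1.96]$; if the normal approximation is good, the fraction should be close to 0.95.

```python
import random


def coverage_of_standardized_mean(n=200, reps=2000, seed=0):
    """Fraction of simulated standardized means with |Z| <= 1.96."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and sd of the Exponential(rate=1) distribution
    inside = 0
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        z = (xbar - mu) / (sigma / n ** 0.5)
        if abs(z) <= 1.96:
            inside += 1
    return inside / reps


print(coverage_of_standardized_mean())
```

The sample size, number of repetitions, and seed are arbitrary; smaller `n` would show a visibly worse approximation for a skewed distribution like this one.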

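2.1.1 Large-Sample Test for the Mean
  • Hypotheses: To test the hypothesis

    $H_0: \mu = \mu_0$

    against one of these alternative hypotheses:
    $H_1: \mu > \mu_0$; or
    $H_1: \mu < \mu_0$; or
    $H_1: \mu \neq \mu_0$

  • Test statistic: $Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$

  • Rejection region (RR):
    Define $z_\alpha$ as $P(Z > z_\alpha) = \alpha$ where $Z \sim N(0, 1)$. Then
    (1) for $H_1: \mu > \mu_0$, the RR is $\{Z > z_\alpha\}$
    (2) for $H_1: \mu < \mu_0$, the RR is $\{Z < -z_\alpha\}$
    (3) for $H_1: \mu \neq \mu_0$, the RR is $\{|Z| > z_{\alpha/2}\}$

Note: If the population variance $\sigma^2$ is unknown, you can replace it by the sample variance $S^2$, since $n$ is large.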
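The large-sample test can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `large_sample_z_test`, its parameters, and the toy data are assumptions introduced here. When `sigma` is not supplied, the sample standard deviation is used in its place, following the note above.

```python
from statistics import NormalDist, mean, stdev


def large_sample_z_test(data, mu0, alpha=0.05, alternative="two-sided", sigma=None):
    """Return (z, reject) for H0: mu = mu0 using the normal approximation."""
    n = len(data)
    s = sigma if sigma is not None else stdev(data)  # S replaces sigma if unknown
    z = (mean(data) - mu0) / (s / n ** 0.5)
    std_normal = NormalDist()
    if alternative == "greater":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject


toy_data = [1.0, 3.0] * 20  # 40 made-up observations with sample mean 2.0
z, reject = large_sample_z_test(toy_data, mu0=1.5)
print(f"z = {z:.3f}, reject H0: {reject}")
```

`NormalDist().inv_cdf(1 - alpha)` plays the role of $z_\alpha$, the upper-$\alpha$ critical value of the standard normal distribution.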

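2.1.2 Small-Sample Test for the Mean

For small samples the normal approximation above is replaced by the $t$ distribution. When the population is (approximately) normal and $\sigma$ is estimated by the sample standard deviation $S$,

    $\frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}$

  • Hypotheses: $H_0: \mu = \mu_0$ against $H_1: \mu > \mu_0$, $H_1: \mu < \mu_0$, or $H_1: \mu \neq \mu_0$ (the same as in the large-sample test)

  • Test statistic: $T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$

  • Rejection region:
    Define $t_\alpha$ as $P(T > t_\alpha) = \alpha$ where $T \sim t_{n-1}$. Then
    (1) for $H_1: \mu > \mu_0$, the RR is $\{T > t_\alpha\}$
    (2) for $H_1: \mu < \mu_0$, the RR is $\{T < -t_\alpha\}$
    (3) for $H_1: \mu \neq \mu_0$, the RR is $\{|T| > t_{\alpha/2}\}$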
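A minimal sketch of the small-sample test follows. Python's standard library has no $t$ distribution, so this example hard-codes critical values for 9 degrees of freedom taken from a standard $t$ table; the table, the function names, and the ten measurements are assumptions made for the illustration.

```python
from statistics import mean, stdev

# Upper-tail critical values of t with 9 degrees of freedom at alpha = 0.05:
# t_{0.05} for one-sided tests, t_{0.025} for the two-sided test.
T_CRIT_DF9 = {"one-sided": 1.833, "two-sided": 2.262}


def t_statistic(data, mu0):
    """T = (xbar - mu0) / (S / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / n ** 0.5)


def reject_at_5_percent(t, alternative="two-sided"):
    """Decision at alpha = 0.05 for a sample of size 10 (df = 9) only."""
    if alternative == "greater":
        return t > T_CRIT_DF9["one-sided"]
    if alternative == "less":
        return t < -T_CRIT_DF9["one-sided"]
    if alternative == "two-sided":
        return abs(t) > T_CRIT_DF9["two-sided"]
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Ten made-up measurements (sample mean 5.5).
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7, 5.2, 5.9]
t = t_statistic(sample, mu0=5.0)
print(f"t = {t:.3f}, reject H0: {reject_at_5_percent(t)}")
```

For other sample sizes or significance levels, the critical value would have to come from a fuller table or a statistics library.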

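2.2 Properties of the Normal Distribution

If $X_1, \dots, X_n \overset{iid}{\sim} N(\mu, \sigma^2)$, then

    $\bar{X} \sim N(\mu, \sigma^2 / n)$ and $\frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}$,

where $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ is the sample variance.

2.2.1 Test for the Mean of a Normal Distribution

The procedure is the same as in 2.1.1 (large-sample test for the mean). Because $\bar{X}$ is exactly normal here, the test with known $\sigma$ is valid for any sample size.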

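2.2.2 Test for the Variance of a Normal Distribution
  • Hypotheses: To test the hypothesis

    $H_0: \sigma^2 = \sigma_0^2$

    against one of these alternatives:
    $H_1: \sigma^2 > \sigma_0^2$; or
    $H_1: \sigma^2 < \sigma_0^2$; or
    $H_1: \sigma^2 \neq \sigma_0^2$

  • Test statistic: $\chi^2 = \frac{(n-1) S^2}{\sigma_0^2}$

  • Rejection region:
    Define $\chi^2_\alpha$ and $\chi^2_{1-\alpha}$ as

    $P(\chi^2 > \chi^2_\alpha) = \alpha$, $P(\chi^2 > \chi^2_{1-\alpha}) = 1 - \alpha$

    where $\chi^2 \sim \chi^2_{n-1}$. Then
    (1) for $H_1: \sigma^2 > \sigma_0^2$, the RR is $\{\chi^2 > \chi^2_\alpha\}$
    (2) for $H_1: \sigma^2 < \sigma_0^2$, the RR is $\{\chi^2 < \chi^2_{1-\alpha}\}$
    (3) for $H_1: \sigma^2 \neq \sigma_0^2$, the RR is $\{\chi^2 > \chi^2_{\alpha/2}\}$ or $\{\chi^2 < \chi^2_{1-\alpha/2}\}$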
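The variance test can be sketched the same way. As with the $t$ test, the standard library has no $\chi^2$ quantile function, so the critical values below are copied from a standard $\chi^2$ table for 9 degrees of freedom; the dictionary, the function names, and the data are illustrative assumptions.

```python
from statistics import variance

# Chi-square critical values for df = 9 (sample size 10), alpha = 0.05.
# "upper_a" means P(chi2_9 > value) = a; "lower_a" means P(chi2_9 < value) = a.
CHI2_DF9 = {
    "upper_0.05": 16.919,
    "lower_0.05": 3.325,
    "upper_0.025": 19.023,
    "lower_0.025": 2.700,
}


def variance_chi2_statistic(data, sigma0_sq):
    """chi2 = (n - 1) * S^2 / sigma0^2, assuming normally distributed data."""
    return (len(data) - 1) * variance(data) / sigma0_sq


def reject_variance_at_5_percent(chi2, alternative="two-sided"):
    """Decision at alpha = 0.05 for df = 9 only."""
    if alternative == "greater":
        return chi2 > CHI2_DF9["upper_0.05"]
    if alternative == "less":
        return chi2 < CHI2_DF9["lower_0.05"]
    if alternative == "two-sided":
        return chi2 > CHI2_DF9["upper_0.025"] or chi2 < CHI2_DF9["lower_0.025"]
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7, 5.2, 5.9]  # made-up data
chi2 = variance_chi2_statistic(sample, sigma0_sq=0.05)
print(f"chi2 = {chi2:.2f}, reject H0: {reject_variance_at_5_percent(chi2, 'greater')}")
```

Note that the two-sided test splits $\alpha$ between the two tails, which is why it uses the 0.025 critical values.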

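2.3 Likelihood Ratio Tests

Let $X_1, \dots, X_n \overset{iid}{\sim} f(x \mid \theta)$. Then we have

(1) The likelihood of $\theta$ is

    $L(\theta) = \prod_{i=1}^{n} f(X_i \mid \theta)$

(2) Suppose $H_0: \theta \in \Omega_0$, $H_1: \theta \in \Omega_1$,
where $\Omega_0, \Omega_1$ are some disjoint sets of possible parameter values and $\Omega = \Omega_0 \cup \Omega_1$.

Define the generalized likelihood ratio as

    $\Lambda = \frac{\max_{\theta \in \Omega_0} L(\theta)}{\max_{\theta \in \Omega} L(\theta)}$

Under $H_0$ and suitable regularity conditions, $-2 \ln \Lambda$ is approximately $\chi^2_{d - d_0}$ distributed for large $n$, where $d$ is the dimension of parameter space $\Omega$ and $d_0$ is the dimension of parameter space $\Omega_0$.

Note: Computing $\Lambda$ involves the maximum likelihood estimator (MLE).

  • Hypotheses: $H_0: \theta \in \Omega_0$, $H_1: \theta \in \Omega_1$

  • Test statistic: $-2 \ln \Lambda$

  • Rejection region: $\{-2 \ln \Lambda > \chi^2_{\alpha, d - d_0}\}$, where $P(\chi^2_{d - d_0} > \chi^2_{\alpha, d - d_0}) = \alpha$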
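As a worked instance, consider Bernoulli data with $H_0: p = p_0$ against $H_1: p \neq p_0$ (an example chosen for this sketch, not taken from the original notes). Here $\Omega_0 = \{p_0\}$ has dimension 0 and $\Omega = [0, 1]$ has dimension 1, so $-2 \ln \Lambda$ is compared with a $\chi^2_1$ distribution. The P-value uses the fact that a $\chi^2_1$ variable is the square of a standard normal variable.

```python
from math import log
from statistics import NormalDist


def bernoulli_lrt(successes, n, p0):
    """Return (-2 ln Lambda, approximate p-value) for H0: p = p0, 0 < p0 < 1."""
    p_hat = successes / n  # unrestricted MLE over Omega = [0, 1]

    def log_lik(p):
        # Bernoulli log-likelihood with the convention 0 * log(0) = 0.
        total = 0.0
        if successes > 0:
            total += successes * log(p)
        if successes < n:
            total += (n - successes) * log(1 - p)
        return total

    # -2 ln Lambda = 2 * (max over Omega - max over Omega_0); clip rounding noise.
    stat = max(0.0, 2 * (log_lik(p_hat) - log_lik(p0)))
    # P(chi2_1 > stat) = P(|Z| > sqrt(stat)) for Z ~ N(0, 1).
    p_value = 2 * (1 - NormalDist().cdf(stat ** 0.5))
    return stat, p_value


stat, p_value = bernoulli_lrt(successes=60, n=100, p0=0.5)
print(f"-2 ln Lambda = {stat:.3f}, p-value = {p_value:.4f}")
```

The $\chi^2$ approximation is asymptotic, so for small $n$ the P-value from this sketch should be read with caution.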

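2.4 Pearson's $\chi^2$ Test

2.4.1 Test of Multinomial Data

Suppose each individual's category is a multinomial draw with probability $p = (p_1, \dots, p_k)$.

Let $X = (X_1, \dots, X_k)$ be the number of observed individuals in each category. Then $X \sim \mathrm{Multinomial}(n, p)$.

Let $\Omega$ be the simplex, i.e. $\Omega = \{p : p_i \geq 0, \sum_{i=1}^{k} p_i = 1\}$.

The maximum likelihood estimator (MLE) over all $\Omega$ is: $\hat{p}_i = X_i / n$

$H_0: p \in \Omega_0$ vs $H_1: p \notin \Omega_0$, where $\Omega_0 \subset \Omega$

Under $H_0$ and using the MLE $\hat{p}^{(0)}$ restricted to $\Omega_0$, we can get the expected number for each category as $E_i = n \hat{p}^{(0)}_i$. Then, for large $n$, $\sum_{i=1}^{k} \frac{(X_i - E_i)^2}{E_i}$ is approximately $\chi^2_{k - 1 - d_0}$ distributed, where $d_0$ is the dimension of $\Omega_0$.

  • Hypotheses: $H_0: p \in \Omega_0$, $H_1: p \notin \Omega_0$

  • Test statistic: $X^2 = \sum_{i=1}^{k} \frac{(X_i - E_i)^2}{E_i}$

  • Rejection region: $\{X^2 > \chi^2_{\alpha, k - 1 - d_0}\}$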

Note: While we could apply a likelihood ratio test here, Pearson's test has a bit more power.
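A minimal sketch of the goodness-of-fit computation, for the simplest case where $H_0$ fully specifies the category probabilities (so nothing is estimated under $H_0$ and the degrees of freedom are $k - 1$). The function name and the counts are made up for illustration. With $k = 3$ categories the reference distribution is $\chi^2_2$, whose upper-tail probability has the closed form $e^{-x/2}$; that shortcut applies only to 2 degrees of freedom.

```python
from math import exp


def pearson_statistic(observed, null_probs):
    """Return (X^2, df) for H0: p = null_probs, a fully specified multinomial."""
    n = sum(observed)
    stat = 0.0
    for count, prob in zip(observed, null_probs):
        expected = n * prob  # expected count E_i under H0
        stat += (count - expected) ** 2 / expected
    return stat, len(observed) - 1


observed = [45, 35, 20]       # made-up counts for three categories
null_probs = [0.5, 0.3, 0.2]  # probabilities claimed by H0
stat, df = pearson_statistic(observed, null_probs)
p_value = exp(-stat / 2)      # P(chi2_2 > stat); valid only because df == 2
print(f"X^2 = {stat:.3f}, df = {df}, p-value = {p_value:.3f}")
```

A common rule of thumb is to use the $\chi^2$ approximation only when every expected count is reasonably large (often taken as at least 5).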

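2.4.2 Test of Independence

This tests whether two categorical variables are independent.

Suppose we have observed an $r \times c$ contingency table of counts $n_{ij}$, with total $n = \sum_{i,j} n_{ij}$.

          Col 1      Col 2      ...   Col c
Row 1     $n_{11}$   $n_{12}$   ...   $n_{1c}$
Row 2     $n_{21}$   $n_{22}$   ...   $n_{2c}$
...       ...        ...        ...   ...
Row r     $n_{r1}$   $n_{r2}$   ...   $n_{rc}$

  • Hypotheses:

$H_0$: row and column variables are independent.
$H_1$: row and column variables are dependent.

Under $H_0$ the cell probabilities factor as $p_{ij} = p_{i\cdot} p_{\cdot j}$, so we have the following table of probabilities:

          Col 1                        Col 2                        ...   Col c
Row 1     $p_{1\cdot} p_{\cdot 1}$     $p_{1\cdot} p_{\cdot 2}$     ...   $p_{1\cdot} p_{\cdot c}$
Row 2     $p_{2\cdot} p_{\cdot 1}$     $p_{2\cdot} p_{\cdot 2}$     ...   $p_{2\cdot} p_{\cdot c}$
...       ...                          ...                          ...   ...
Row r     $p_{r\cdot} p_{\cdot 1}$     $p_{r\cdot} p_{\cdot 2}$     ...   $p_{r\cdot} p_{\cdot c}$

The MLEs for $p_{i\cdot}$ and $p_{\cdot j}$ are $\hat{p}_{i\cdot} = n_{i\cdot} / n$ and $\hat{p}_{\cdot j} = n_{\cdot j} / n$, where $n_{i\cdot}$ and $n_{\cdot j}$ are the row and column totals.

Then we can get the expected number of individuals for each category: $E_{ij} = n \hat{p}_{i\cdot} \hat{p}_{\cdot j} = n_{i\cdot} n_{\cdot j} / n$.

  • Test statistic: $X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - E_{ij})^2}{E_{ij}}$

  • Rejection region: $\{X^2 > \chi^2_{\alpha, df}\}$
    with $df = (r - 1)(c - 1)$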
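The independence test can be sketched as follows. The function name `chi2_independence` and the 2 x 2 table are illustrative assumptions. For a 2 x 2 table the degrees of freedom are $(2-1)(2-1) = 1$, and the P-value can again be obtained from the standard normal distribution because a $\chi^2_1$ variable is the square of a standard normal variable; that shortcut does not extend to larger tables.

```python
from statistics import NormalDist


def chi2_independence(table):
    """Return (X^2, df) for an r x c table of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: n_i. * n_.j / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


table = [[30, 20],
         [20, 30]]  # made-up counts
stat, df = chi2_independence(table)
p_value = 2 * (1 - NormalDist().cdf(stat ** 0.5))  # valid only because df == 1
print(f"X^2 = {stat:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The function assumes every row and column total is positive; a zero total would make an expected count zero and the statistic undefined.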
