
Hypothesis Test and Confidence Interval

Procedures for Hypothesis Test

  1. Define context
    • $H_0: parameter = \ldots$
    • $H_a: parameter\ condition\ \ldots$
    • $parameter$ is the ...
    • $\alpha = \ldots$
  2. Verify inference conditions
  3. Find p-value
  4. Conclusion
    • $p\ condition\ \alpha$
    • $H_0$ should/should not be rejected
    • The data provides sufficient evidence that ...

Procedures for Confidence Interval

  1. Verify confidence interval conditions
  2. Find confidence interval
    • $CI = statistic \pm (critical\ value) \times (standard\ error\ of\ statistic)$
  3. We can be $C\%$ confident that the interval from $lower\ limit$ to $upper\ limit$ captures the actual value of the $parameter$
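The interval recipe in step 2 can be sketched in a few lines of Python; the statistic (50), critical value (1.96, the 95% value for a z-based interval), and standard error (2) below are hypothetical placeholders, not values from the notes:

```python
def confidence_interval(statistic, critical_value, standard_error):
    """CI = statistic +/- (critical value) x (standard error of statistic)."""
    margin = critical_value * standard_error
    return statistic - margin, statistic + margin

# Hypothetical numbers: statistic 50, 95% z* = 1.96, standard error 2.
lower, upper = confidence_interval(50.0, 1.96, 2.0)
```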

Population Proportions

One-sample z-test.

Conditions:

  • Independence condition
    • Random sampling/assignment
    • If sampling without replacement, $n < 0.1N$
  • Approximately normally distributed
    • $\left\{\begin{aligned} np_0 &\geq 10 \\ n(1-p_0) &\geq 10 \end{aligned}\right.$

Test statistic:

  • $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$
  • standard error $= \sqrt{\dfrac{p_0(1-p_0)}{n}}$
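A minimal sketch of this statistic in Python, with hypothetical data (124 successes in $n = 200$, tested against $H_0: p = 0.5$); the two-sided p-value uses the normal identity $2(1 - \Phi(|z|)) = \operatorname{erfc}(|z|/\sqrt{2})$:

```python
import math

def one_prop_z(successes, n, p0):
    """One-sample z statistic for a proportion, using the SE under H0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)      # standard error under H0
    return (p_hat - p0) / se

z = one_prop_z(124, 200, 0.5)              # p_hat = 0.62 vs p0 = 0.5
p_value = math.erfc(abs(z) / math.sqrt(2)) # two-sided normal p-value
```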

Differences in Population Proportions

Two-sample z-test.

Conditions:

  • Independence condition
    • Random sampling/assignment
    • If sampling without replacement, $n < 0.1N$
  • Approximately normally distributed
    • Combined (pooled) proportion $\hat{p}_c = \dfrac{X_1 + X_2}{n_1 + n_2}$, where $X = n\hat{p}$
    • $\left\{\begin{aligned} n_1\hat{p}_c &\geq 10 \\ n_1(1-\hat{p}_c) &\geq 10 \\ n_2\hat{p}_c &\geq 10 \\ n_2(1-\hat{p}_c) &\geq 10 \end{aligned}\right.$

Test statistic:

  • $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$
  • standard error $= \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
  • Note: when testing $H_0: p_1 = p_2$, some texts instead use the pooled standard error $\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
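A sketch of the statistic as written above (unpooled standard error), with hypothetical counts: 60/100 successes in group 1 versus 45/100 in group 2:

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Two-sample z statistic with the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2 - 0) / se          # H0: p1 - p2 = 0

# Hypothetical data: 60/100 vs 45/100 successes.
z = two_prop_z(60, 100, 45, 100)
```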

Population Means

One-sample t-test.

Conditions:

  • Independence condition
    • Random sampling/assignment
    • If sampling without replacement, $n < 0.1N$
  • Approximately normally distributed
    • Approximately symmetric
    • No outliers
  • If very skewed, $n \geq 30$

Test statistic:

  • $t = \dfrac{\overline{x} - \mu}{s/\sqrt{n}}$
  • standard error $= \dfrac{s}{\sqrt{n}}$
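From summary statistics this is a one-liner; the numbers below ($\overline{x} = 4.8$, $s = 0.6$, $n = 36$, tested against $H_0: \mu = 5.0$) are hypothetical:

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """One-sample t statistic and its degrees of freedom."""
    se = s / math.sqrt(n)              # standard error of the mean
    return (x_bar - mu0) / se, n - 1

# Hypothetical summaries: x_bar = 4.8, s = 0.6, n = 36, H0: mu = 5.0
t, dof = one_sample_t(4.8, 5.0, 0.6, 36)
```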

Differences in Population Means

Two-sample t-test.

Conditions:

  • Independence condition
    • Random sampling/assignment
    • If sampling without replacement, $n < 0.1N$
  • Approximately normally distributed
    • Approximately symmetric
    • No outliers
  • If very skewed, $n \geq 30$

Test statistic:

  • $t = \dfrac{\overline{x}_1 - \overline{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  • standard error $= \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
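A sketch of the statistic (this is the unpooled, Welch-style form); the group summaries below are hypothetical:

```python
import math

def two_sample_t(x1_bar, s1, n1, x2_bar, s2, n2):
    """Two-sample t statistic with unpooled standard error."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1_bar - x2_bar) / se

# Hypothetical summaries: means 10 vs 8, sds 3 vs 4, n = 50 each.
t = two_sample_t(10.0, 3.0, 50, 8.0, 4.0, 50)
```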

Differences in Matched Pairs

One-sample t-test.

Conditions:

  • Two measures come from the same items within the population
  • Independence condition
    • Random sampling/assignment
    • If sampling without replacement, $n < 0.1N$
  • Approximately normally distributed
    • Approximately symmetric
    • No outliers
  • If very skewed, $n \geq 30$

Test statistic:

  • $t = \dfrac{\overline{x}_d - \mu_d}{s_d/\sqrt{n}}$
  • standard error $= \dfrac{s_d}{\sqrt{n}}$
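The matched-pairs test reduces to a one-sample t on the differences, as sketched below with hypothetical before/after scores on the same five items and $H_0: \mu_d = 0$:

```python
import math
import statistics

def paired_t(x, y, mu_d=0.0):
    """Matched-pairs t statistic on the differences d_i = y_i - x_i."""
    d = [yi - xi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)          # sample sd of the differences
    return (d_bar - mu_d) / (s_d / math.sqrt(n))

# Hypothetical before/after measurements on the same five items.
before = [5, 7, 6, 8, 9]
after  = [6, 9, 7, 8, 10]
t = paired_t(before, after)
```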

Goodness of Fit

$\chi^2$-test.

Conditions:

  • Independence condition
    • Random sampling
    • If sampling without replacement, $n < 0.1N$
  • Large counts condition
    • Each expected value ≥ 5

Test statistic:

  • $\chi^2 = \sum \dfrac{(observed - expected)^2}{expected}$
  • $dof = (\text{number of categories}) - 1$
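The goodness-of-fit statistic is a direct sum over categories; the counts below (60 observations over three equally likely categories) are hypothetical:

```python
# Hypothetical data: 60 observations over three categories,
# tested against a uniform model (expected 20 each).
observed = [30, 20, 10]
expected = [20, 20, 20]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1
```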

Independence

One-sample $\chi^2$-test.

Conditions:

  • Independence condition
    • Random sampling
    • If sampling without replacement, $n < 0.1N$
  • Large counts condition
    • Each expected value ≥ 5
    • or at least 80% of expected values ≥ 5 and all expected values ≥ 1

Test statistic:

  • $\chi^2 = \sum \dfrac{(observed - expected)^2}{expected}$
  • $expected\ value = \dfrac{row\ total \times column\ total}{total\ number\ in\ sample}$
  • $dof = (n_{row}-1)(n_{col}-1)$
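A sketch of computing the expected counts from the margins and summing the $\chi^2$ contributions, using a hypothetical 2×2 table:

```python
# Hypothetical 2x2 table of counts (rows: groups, columns: outcomes).
table = [[30, 20], [20, 30]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        # expected = row total x column total / table total
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (observed - expected) ** 2 / expected

dof = (len(table) - 1) * (len(table[0]) - 1)
```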

Homogeneity

Multi-sample $\chi^2$-test.

Conditions:

  • Independence condition
    • Random sampling
    • If sampling without replacement, $n < 0.1N$
  • Large counts condition
    • Each expected value ≥ 5

Test statistic:

  • $\chi^2 = \sum \dfrac{(observed - expected)^2}{expected}$
  • $dof = (n_{row}-1)(n_{col}-1)$

Regression Line

One-sample t-test.

Conditions:

  • The relationship between $x$ and $y$ is linear
  • $\sigma_y$ does not vary with $x$ (equal spread of $y$ at every $x$)
  • Independence condition
    • Random sampling
    • If sampling without replacement, $n < 0.1N$
  • For a given value of $x$, the $y$-values follow an approximately normal distribution
  • If $n < 30$, the distribution of $y$-values has no strong skew and no outliers

Test statistic:

  • $t = \dfrac{b - \beta}{s_b}$
  • standard error $s_b = \dfrac{s}{s_x \sqrt{n-1}}$
    • $s = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}}$
    • $s_x = \sqrt{\dfrac{\sum (x_i - \overline{x})^2}{n-1}}$
  • $t$-distribution with $dof = n - 2$
| Predictor    | Coef | SE Coef | T | P |
| ------------ | ---- | ------- | - | - |
| Constant     | $a$  |         |   |   |
| $x$-variable | $b$  | $s_b$   |   |   |
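The slope $b$, its standard error $s_b$, and the $t$ statistic for $H_0: \beta = 0$ can be computed directly from the formulas above; the $(x, y)$ data below are hypothetical:

```python
import math

# Hypothetical (x, y) data with a roughly linear trend.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

b = sxy / sxx                       # least-squares slope
a = y_bar - b * x_bar               # intercept
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(sse / (n - 2))        # residual standard deviation
s_x = math.sqrt(sxx / (n - 1))      # sample sd of x
s_b = s / (s_x * math.sqrt(n - 1))  # standard error of the slope
t = (b - 0) / s_b                   # compare to t with dof = n - 2
```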