Inference for Regression Slopes

Sampling Distributions for Sample Slopes

Population Least-Squares Regression Line

The theoretical best-fitting straight line that describes the true linear relationship between two variables for an entire population

$\hat{y} = \alpha + \beta x$

  • $\hat{y}$: predicted response
  • $\alpha$: population $y$-intercept
  • $\beta$: population slope

"Least-squares" indicates that the sum of squared residuals is minimized

Sample Least-Squares Regression Line

The line of best fit for a set of data points: the line that minimizes the sum of squared residuals

$\hat{y} = a + bx$

  • Different samples produce different sample least-squares regression lines, each with its own sample slope $b$. This means $b$ has a sampling distribution.
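To make the idea of a sampling distribution for $b$ concrete, here is a minimal simulation sketch. The population values `ALPHA`, `BETA`, and `SIGMA`, the sample size, and the helper names `least_squares` and `draw_sample` are all assumptions chosen for illustration; they are not from the notes.

```python
import random
import statistics

def least_squares(xs, ys):
    """Return (a, b) for the sample line y-hat = a + b*x."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # sample slope
    a = y_bar - b * x_bar    # sample intercept
    return a, b

# Hypothetical population model (illustrative values only)
ALPHA, BETA, SIGMA = 2.0, 0.5, 1.0

def draw_sample(rng, n=20):
    """Draw n points from y = ALPHA + BETA*x + normal noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [ALPHA + BETA * x + rng.gauss(0, SIGMA) for x in xs]
    return xs, ys

rng = random.Random(0)  # fixed seed so the run is repeatable
slopes = [least_squares(*draw_sample(rng))[1] for _ in range(2000)]

print("mean of sample slopes:", round(statistics.fmean(slopes), 3))
print("sd of sample slopes:  ", round(statistics.stdev(slopes), 3))
```

Each simulated sample gives a slightly different slope, and the slopes should cluster around the assumed population slope `BETA`.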

Mean and Standard Deviation of Sampling Distribution for Sample Slopes

For the sample slopes $b$:

  • $\mu_b = \beta$
  • $\sigma_b = \dfrac{\sigma}{\sigma_x \sqrt{n}}$
    • $n$: sample size
    • $\sigma$: standard deviation of the population residuals
    • $\sigma_x = \sqrt{\dfrac{\sum (x_i - \overline{x})^2}{n}}$: standard deviation of the $x$-values only
  • $\sigma_b = \dfrac{\sigma}{\sigma_x \sqrt{n}}$ is unknown in practice and must be estimated from the sample; the estimate is called the standard error
    • $s_b = \dfrac{s}{s_x \sqrt{n-1}}$: the standard error of the sample slope
    • $s = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}}$: standard deviation of the sample residuals
      • Divided by $n-2$ because two parameters, $\alpha$ and $\beta$, are estimated
    • $s_x = \sqrt{\dfrac{\sum (x_i - \overline{x})^2}{n-1}}$: sample standard deviation of the $x$-values
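The standard error formulas above can be sketched in a few lines of Python. The function name `slope_standard_error` and the five data points are made up for illustration.

```python
import math
import statistics

def slope_standard_error(xs, ys):
    """Return (b, s, s_x, s_b) for the sample least-squares line."""
    n = len(xs)
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # residual SD
    s_x = math.sqrt(sxx / (n - 1))                           # SD of x-values
    s_b = s / (s_x * math.sqrt(n - 1))                       # SE of the slope
    return b, s, s_x, s_b

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, s, s_x, s_b = slope_standard_error(xs, ys)
print(f"b = {b:.4f}, s = {s:.4f}, s_x = {s_x:.4f}, s_b = {s_b:.4f}")
```

Because $s_x \sqrt{n-1} = \sqrt{\sum (x_i - \overline{x})^2}$, the last step is equivalent to dividing $s$ by the square root of the sum of squared $x$-deviations.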

Sampling Distributions for Standardized Sample Slopes

  • $t = \dfrac{b - \beta}{s_b}$: standardized sample slope
  • Follows a $t$-distribution with $df = n-2$
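As a rough illustration of how the standardized slope is used, the sketch below computes $t$ and an approximate two-sided p-value from the $t$-distribution with $n-2$ degrees of freedom. The numbers `b`, `beta_0`, `s_b`, and `n` are hypothetical, and the p-value is approximated by Simpson's rule on the $t$ density rather than taken from a statistics library, so treat it as an approximation only.

```python
import math

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=10_000):
    """Approximate P(|T| >= |t_stat|) using Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t_stat|)
    return 1 - 2 * central_half

# Hypothetical values: sample slope, hypothesized slope, standard error, sample size
b, beta_0, s_b, n = 0.6, 0.0, 0.2828, 5
t_stat = (b - beta_0) / s_b
p_value = two_sided_p_value(t_stat, df=n - 2)
print(f"t = {t_stat:.3f}, df = {n - 2}, two-sided p ~ {p_value:.3f}")
```

Here `beta_0` plays the role of $\beta$ in the formula: it is the slope value assumed under the null hypothesis.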

Hypothesis Tests for Slopes of Regression Lines

Conditions for a t-Test for a Slope

  • The relationship between $x$ and $y$ is linear
  • The standard deviation of $y$, $\sigma_y$, does not vary with $x$
  • Residuals are independent
    • Data are collected by random sampling or random assignment
    • If sampling without replacement, $n < 0.1N$, where $N$ is the population size
  • For a given value of $x$, the $y$-values follow an approximately normal distribution
  • If $n < 30$, the distribution of $y$-values shows no strong skew and no outliers

Computer Output Table

| Predictor    | Coef | SE Coef | T | P |
|--------------|------|---------|---|---|
| Constant     | $a$  |         |   |   |
| $x$-variable | $b$  | $s_b$   |   |   |
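In regression output of this kind, the T entry in the slope row is typically the coefficient divided by its standard error, which is the test statistic for the null hypothesis $\beta = 0$. A small sketch, using a made-up slope row and a hypothetical sample size:

```python
# Hypothetical slope row from a regression output table (made-up numbers)
row = {"Predictor": "x-variable", "Coef": 0.6, "SE Coef": 0.2828}
n = 5  # hypothetical sample size

t_stat = row["Coef"] / row["SE Coef"]  # (b - 0) / s_b, testing H0: beta = 0
df = n - 2                             # degrees of freedom for the slope test

print(f"T = {t_stat:.3f} with df = {df}")
```

The P column would then report the p-value for this statistic under the $t$-distribution with $n-2$ degrees of freedom.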