Sampling Distributions¶

Sampling Distributions for Sample Means¶

Take all possible samples of size \(n\) from a population and calculate the sample mean \(\overline{x}\) for each; the distribution of all possible values of the sample mean is the sampling distribution for sample means

Parameters of Sampling Distributions for Sample Means¶

Mean = \(\mu\)
Standard deviation = \(\dfrac{\sigma}{\sqrt{n}}\)
Standardized \(z\)-score = \(\dfrac{\overline{x} - \mu}{\frac{\sigma}{\sqrt{n}}}\)
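
As a minimal sketch of the three formulas above, the snippet below computes the standard deviation of the sample mean and the \(z\)-score of an observed \(\overline{x}\). The function name `sample_mean_z` and all numbers are hypothetical, chosen only for illustration, and the tail probability assumes the sampling distribution is (approximately) normal.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_z(x_bar, mu, sigma, n):
    """Return (standard deviation of the sample mean, z-score of x_bar)."""
    sd_mean = sigma / sqrt(n)        # sigma / sqrt(n)
    z = (x_bar - mu) / sd_mean       # standardized z-score
    return sd_mean, z


# Hypothetical values: population mean 100, population sd 15, sample of 36.
sd_mean, z = sample_mean_z(x_bar=104, mu=100, sigma=15, n=36)

# Upper-tail probability P(sample mean > 104), assuming normality.
p_upper = 1 - NormalDist().cdf(z)

print(sd_mean, z, round(p_upper, 4))
```

With these inputs the standard deviation of \(\overline{x}\) is 2.5 and the \(z\)-score is 1.6, so a sample mean this large or larger would occur roughly 5.5% of the time.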

Normality of Sampling Distributions for Sample Means¶

If the population is normally distributed, the sampling distribution for sample means is normally distributed

Central Limit Theorem¶

If a population is not normally distributed

And a large enough random sample is taken (commonly \(n \geq 30\)), with independent sample values

Then the sampling distribution for sample means is approximately normally distributed
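
The theorem can be explored with a small simulation. The sketch below assumes a right-skewed exponential population with \(\mu = 1\) and \(\sigma = 1\) (an arbitrary choice for illustration, not part of these notes), draws many samples of size 30, and compares the centre and spread of the sample means with \(\mu\) and \(\dfrac{\sigma}{\sqrt{n}}\).

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

n = 30              # sample size
num_samples = 5000  # number of simulated samples

# Exponential population with rate 1: mu = 1, sigma = 1, clearly not normal.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

center = mean(sample_means)      # should be close to mu = 1
spread = pstdev(sample_means)    # should be close to sigma / sqrt(n)
predicted_spread = 1 / sqrt(n)

# For a normal distribution, about 68% of values lie within one sd of the mean.
within_one_sd = sum(
    abs(m - 1) <= predicted_spread for m in sample_means
) / num_samples

print(round(center, 3), round(spread, 3), round(predicted_spread, 3))
print(round(within_one_sd, 3))
```

A simulation only approximates the sampling distribution, since it uses a finite number of samples rather than all possible ones, but the centre, spread, and one-standard-deviation coverage should land near the values the theorem predicts.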

Sampling Distributions for Differences in Sample Means¶

One-Sample Problem¶

When one random sample of size \(n\) has been taken from one population

Two-Sample Problem¶

When one random sample of size \(n_1\) is taken from one population and a different random sample of size \(n_2\) is taken from a second population that is independent of the first

Parameters of Sampling Distribution for Differences in Sample Means¶

Mean = \(\mu_1 - \mu_2\)
Standard deviation = \(\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}\)
- Valid when sampling with replacement, or when each sample size is less than 10% of its population size
- Otherwise, the actual standard deviation is smaller than this formula gives
Standardized \(z\)-score = \(\dfrac{(\overline{x_1} - \overline{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\)
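
A minimal sketch of the two-sample formulas above follows. The helper `diff_means_z` and every number passed to it are hypothetical, and the calculation assumes the independence and sample-size conditions listed above hold.

```python
from math import sqrt


def diff_means_z(x_bar1, x_bar2, mu1, mu2, sigma1, sigma2, n1, n2):
    """Return (sd of the difference in sample means, z-score of the observed difference)."""
    sd_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((x_bar1 - x_bar2) - (mu1 - mu2)) / sd_diff
    return sd_diff, z


# Hypothetical populations: means 70 and 65, standard deviations 8 and 6.
sd_diff, z = diff_means_z(
    x_bar1=72, x_bar2=65,
    mu1=70, mu2=65,
    sigma1=8, sigma2=6,
    n1=64, n2=36,
)

print(round(sd_diff, 4), round(z, 4))
```

Here each variance term \(\dfrac{\sigma^2}{n}\) equals 1, so the standard deviation of the difference is \(\sqrt{2}\), and the observed difference of 7 sits \(\dfrac{7 - 5}{\sqrt{2}} = \sqrt{2}\) standard deviations above the expected difference of 5.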

Normality of Sampling Distributions for Differences in Sample Means¶

If two independent populations are normally distributed, the sampling distribution for differences in sample means is also normally distributed

Sampling Distributions for Sample Proportions¶

Population proportion \(p\) is the proportion of successes in the population
\(n\) is the sample size
\(X\) is the number of successes in a sample, which follows a binomial distribution
Sample proportion \(\hat{p} = \dfrac{X}{n}\)

If

\(np \geq 10\)
\(n(1-p) \geq 10\)

Then \(\hat{p}\)

Approximately normally distributed
Mean = \(p\)
Standard deviation = \(\sqrt{\dfrac{p(1-p)}{n}}\)
Standardized \(z\)-score = \(\dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}\)
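
The sketch below checks the two conditions and then standardizes an observed sample proportion, dividing by the standard deviation \(\sqrt{\dfrac{p(1-p)}{n}}\). The function name `sample_proportion_z` and the numbers are hypothetical and serve only as an illustration.

```python
from math import sqrt


def sample_proportion_z(p_hat, p, n):
    """Return (sd of the sample proportion, z-score of p_hat).

    Raises ValueError when np >= 10 or n(1 - p) >= 10 fails, because the
    normal approximation is then not justified.
    """
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("normal approximation needs np >= 10 and n(1-p) >= 10")
    sd = sqrt(p * (1 - p) / n)
    return sd, (p_hat - p) / sd


# Hypothetical values: p = 0.40, sample of 100, observed proportion 0.45.
sd, z = sample_proportion_z(p_hat=0.45, p=0.40, n=100)
print(round(sd, 4), round(z, 4))
```

With \(p = 0.40\) and \(n = 100\), both conditions hold (40 and 60 expected successes and failures), the standard deviation is about 0.049, and an observed \(\hat{p} = 0.45\) is about one standard deviation above \(p\).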

Sampling Distributions for Differences in Sample Proportions¶

If

\(n_1p_1 \geq 10\)
\(n_1(1-p_1) \geq 10\)
\(n_2p_2 \geq 10\)
\(n_2(1-p_2) \geq 10\)

Then \(\hat{p_1} - \hat{p_2}\)

Approximately normally distributed
Mean = \(p_1 - p_2\)
Standard deviation = \(\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}\)
Standardized \(z\)-score = \(\dfrac{(\hat{p_1} - \hat{p_2}) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}\)
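
As a rough illustration, the following sketch verifies the four conditions and evaluates the standard deviation and \(z\)-score for a difference in sample proportions. The function `diff_proportions_z` and all input values are hypothetical.

```python
from math import sqrt


def diff_proportions_z(p_hat1, p_hat2, p1, p2, n1, n2):
    """Return (sd of the difference in sample proportions, z-score of the observed difference)."""
    expected_counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    if min(expected_counts) < 10:
        raise ValueError("all four expected counts must be at least 10")
    sd_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = ((p_hat1 - p_hat2) - (p1 - p2)) / sd_diff
    return sd_diff, z


# Hypothetical values: p1 = 0.5 with n1 = 100, p2 = 0.4 with n2 = 150.
sd_diff, z = diff_proportions_z(
    p_hat1=0.56, p_hat2=0.38,
    p1=0.5, p2=0.4,
    n1=100, n2=150,
)
print(round(sd_diff, 4), round(z, 4))
```

For these inputs the expected counts are 50, 50, 60, and 90, the standard deviation of the difference is \(\sqrt{0.0041} \approx 0.064\), and the observed difference of 0.18 is about 1.25 standard deviations above the expected difference of 0.10.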

Biased & Unbiased Estimators¶

An estimator is a statistic used to estimate a population parameter

To know if an estimator is a good predictor

All possible estimates from all possible samples of size \(n\) must be generated
Check to see if, on average, estimates are centered around the value of the population parameter
An estimator is said to be unbiased if the mean of its sampling distribution equals the population parameter being estimated
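
For a very small population, the check described above can be carried out exactly by listing every possible sample. The sketch below assumes a hypothetical four-value population and samples of size 2 drawn with replacement. As an added illustration that is not part of these notes, it also compares the variance computed with divisor \(n - 1\) against the one computed with divisor \(n\).

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [2, 4, 6, 8]   # hypothetical tiny population
n = 2                       # sample size

mu = mean(population)             # population mean
sigma_sq = pvariance(population)  # population variance

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Average each estimator over all possible samples.
avg_of_means = mean([mean(s) for s in samples])
avg_var_n_minus_1 = mean([variance(s) for s in samples])  # divisor n - 1
avg_var_n = mean([pvariance(s) for s in samples])         # divisor n

print(mu, avg_of_means)             # equal: the sample mean is unbiased
print(sigma_sq, avg_var_n_minus_1)  # equal: divisor n - 1 is unbiased
print(sigma_sq, avg_var_n)          # not equal: divisor n underestimates
```

Over the 16 possible samples, the sample means average to 5, matching \(\mu\), and the divisor \(n - 1\) variances average to 5, matching the population variance, while the divisor \(n\) variances average to only 2.5.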

Unbiased Estimators¶

Sample mean \(\overline{x}\)
Sample proportion \(\hat{p}\)
Sample variance \(s^2\) (the sample standard deviation \(s\) itself slightly underestimates \(\sigma\) on average)

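Effects on Unbiased Estimators¶

\(n\) does not affect the center of the sampling distribution of \(\overline{x}\): its mean is \(\mu\) for every \(n\)
Greater \(n\) gives a smaller standard deviation of \(\overline{x}\), \(\dfrac{\sigma}{\sqrt{n}}\), so estimates cluster more tightly around \(\mu\)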
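
To make the effect of \(n\) on \(\dfrac{\sigma}{\sqrt{n}}\) concrete, the short sketch below tabulates it for a few sample sizes. The value \(\sigma = 12\) and the helper name `sd_of_sample_mean` are assumptions made only for this example.

```python
from math import sqrt

sigma = 12  # hypothetical population standard deviation


def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / sqrt(n)


# Each sample size is four times the previous one.
table = {n: sd_of_sample_mean(sigma, n) for n in (4, 16, 64)}

for n, sd in table.items():
    print(n, sd)
```

Quadrupling the sample size halves the standard deviation (6, then 3, then 1.5 here), while the mean of the sampling distribution stays at \(\mu\) throughout.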