Sampling Methods & Bias¶
Introduction to Sampling¶
A population is the set of all the possible elements that could be studied
A sample is a smaller subset of the population chosen to be studied
A sampling frame is a list (or database) of all the elements in the population
When selecting elements from the population to use in a sample:
- Sampling with replacement means elements can be selected more than once
- Sampling without replacement means elements can be selected only once
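The difference between the two modes can be sketched with Python's standard `random` module. The population of ten labelled elements and the helper names are assumptions made for illustration only.

```python
import random

def sample_with_replacement(population, n, rng):
    """Draw n elements; the same element may be drawn more than once."""
    return rng.choices(population, k=n)

def sample_without_replacement(population, n, rng):
    """Draw n elements; each element is drawn at most once."""
    return rng.sample(population, n)

rng = random.Random(0)           # fixed seed so the sketch is repeatable
population = list(range(1, 11))  # hypothetical population of 10 labelled elements

# With replacement, the sample may even be larger than the population
with_repl = sample_with_replacement(population, 15, rng)
# Without replacement, the sample can never exceed the population size
without_repl = sample_without_replacement(population, 5, rng)

print(with_repl)
print(without_repl)
```

Drawing 15 elements from a population of 10 is only possible with replacement, so `with_repl` must contain repeats, while `without_repl` never does.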
A sample is representative of its population if the elements in it reflect similar patterns and behaviors to the population
One way to make a sample representative is to randomly select its elements from the population
Sampling variability is the natural, expected fluctuation in a statistic (such as a mean) calculated from different random samples drawn from the same population
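A small simulation can make sampling variability visible. This is a minimal sketch: the population values, the sample size of 30 and the 200 repeats are all assumed purely for illustration.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical population: 1,000 values generated once and then held fixed
population = [rng.gauss(50, 10) for _ in range(1000)]
population_mean = statistics.mean(population)

def sample_means(population, n, repeats, rng):
    """Mean of each of `repeats` independent simple random samples of size n."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

means = sample_means(population, n=30, repeats=200, rng=rng)

print(round(population_mean, 2))                    # the fixed population mean
print(round(min(means), 2), round(max(means), 2))   # range of the sample means
print(round(statistics.stdev(means), 2))            # spread of the sample means
```

Every sample comes from the same population, yet the sample means differ from one another. That spread is the sampling variability.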
A census is when you study the entire population
| | Pros | Cons |
|---|---|---|
| Census | Gives fully accurate results | Time-consuming; expensive to run; impractical; may destroy elements (e.g. testing single-use machines) |
| Sample | Quicker; cheaper; less data to analyze | May not be representative of the population |
Random Sampling Methods¶
Simple Random Sampling¶
A simple random sample (SRS) of size \(n\) is one in which every possible sample of size \(n\) taken from a population has an equal chance of being chosen
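The "every possible sample is equally likely" condition can be checked empirically on a tiny example. The sketch below assumes a hypothetical four-element population and uses `random.sample` as the SRS mechanism.

```python
import itertools
import random
from collections import Counter

population = ["A", "B", "C", "D"]  # hypothetical population
n = 2

# Every possible sample of size n (order ignored): 6 of them here
possible_samples = list(itertools.combinations(population, n))

rng = random.Random(2)
# Draw many simple random samples and count how often each one occurs
counts = Counter(
    tuple(sorted(rng.sample(population, n))) for _ in range(6000)
)

for s in possible_samples:
    print(s, counts[s])  # each count should be close to 6000 / 6 = 1000
```

All six possible samples appear with roughly the same frequency, which is what the definition requires.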
Systematic Sampling¶
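A systematic sample is one in which elements are selected at regular intervals from an ordered list, for example every \(k\)-th element after a random starting point
A systematic sample is not a type of SRS: for any interval greater than 1, two elements next to each other in the list can never be selected together, so not every possible sample has an equal chance of being chosen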
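One common way to carry out systematic sampling is to choose a random start within the first interval and then take every \(k\)-th element. The sketch below assumes a hypothetical sampling frame of 100 numbered elements. The function name is illustrative.

```python
import random

def systematic_sample(frame, n, rng):
    """Pick a random start in the first interval, then every k-th element."""
    k = len(frame) // n       # sampling interval (assumes len(frame) >= n)
    start = rng.randrange(k)  # random start position in 0..k-1
    return [frame[start + i * k] for i in range(n)]

rng = random.Random(3)
frame = list(range(1, 101))   # hypothetical sampling frame of 100 elements
sample = systematic_sample(frame, 10, rng)
print(sample)
```

With an interval of 10 there are only 10 distinct samples this procedure can produce, one per starting position, and none of them contains two neighbouring elements.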
Stratified Sampling¶
In a stratified sample, the population is first divided into distinct groups called strata (singular: stratum)
- The strata cover the whole population but are non-overlapping
- Strata do not need to be of the same size
Elements are then chosen for the sample by simple random sampling within each stratum
- The number of elements taken from each stratum may or may not be proportional to that stratum's share of the population
- If the population can be split into obvious non-overlapping groups, a stratified sample is generally more representative of the population structure than a simple random sample of the same size
- However, if the sample size is very small, it may not be worth the time to split into groups
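A minimal sketch of stratified sampling with proportional allocation follows. The school year groups, their sizes and the function name are hypothetical. Non-proportional allocation would simply use different per-stratum sizes.

```python
import random

def stratified_sample(strata, n, rng):
    """Proportional allocation: SRS within each stratum, sized by its share."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounding can shift the overall total slightly for awkward proportions
        size = round(n * len(members) / total)
        sample[name] = rng.sample(members, size)
    return sample

rng = random.Random(4)
# Hypothetical school population split by year group (60%, 30%, 10%)
strata = {
    "year_10": [f"y10_{i}" for i in range(60)],
    "year_11": [f"y11_{i}" for i in range(30)],
    "year_12": [f"y12_{i}" for i in range(10)],
}
sample = stratified_sample(strata, 20, rng)
print({name: len(chosen) for name, chosen in sample.items()})
```

The sample of 20 keeps the 60/30/10 split of the population by construction, which a simple random sample would only match on average.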
Cluster Sampling¶
In a cluster sample, the population is first divided into groups called clusters
- Different clusters should have similar overall compositions
Then a simple random sample is used to select one or more clusters out of the possible clusters
- All elements in the selected clusters are used in the sample
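Cluster sampling can be sketched as a simple random sample of whole groups. The classes and students below are hypothetical, as is the function name.

```python
import random

def cluster_sample(clusters, num_clusters, rng):
    """SRS of whole clusters; every element of a chosen cluster is included."""
    chosen = rng.sample(sorted(clusters), num_clusters)
    elements = [element for name in chosen for element in clusters[name]]
    return chosen, elements

rng = random.Random(5)
# Hypothetical population: 5 classes (clusters) of 4 students each
clusters = {
    f"class_{c}": [f"class_{c}_student_{i}" for i in range(4)] for c in "ABCDE"
}
chosen, sample = cluster_sample(clusters, 2, rng)
print(chosen)
print(sample)
```

The randomness applies to the clusters rather than to individual elements: once a class is chosen, all of its students enter the sample.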
Comparing Stratified & Cluster Sampling¶
- Stratified samples are generally more representative of the population than cluster samples
- Cluster samples can be faster and cheaper than stratified samples
Types of Bias¶
A sample is biased if the method used consistently under-represents or favors certain groups in the population
Biased samples can overestimate or underestimate a property of the population
- Undercoverage bias means some part of the population being sampled either has a lower chance of being selected or is fully excluded
- Voluntary response bias means only those who volunteer will be heard
- Nonresponse bias means some of those selected for the sample never give a response
- Response bias is a systematic trend in the responses that suggests respondents are not answering a question fairly or truthfully
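The effect of undercoverage can be illustrated with a small simulation. Everything here is assumed for the sake of the sketch: two hypothetical groups (people with and without a landline) whose typical values differ, and a sampling frame that excludes one of them.

```python
import random
import statistics

rng = random.Random(6)
# Hypothetical population: 700 landline owners and 300 people without one,
# with different typical values of the quantity being measured
with_landline = [rng.gauss(60, 5) for _ in range(700)]
without_landline = [rng.gauss(40, 5) for _ in range(300)]
population = with_landline + without_landline
true_mean = statistics.mean(population)

def mean_of_samples(frame, n, repeats, rng):
    """Average of the sample means over many repeated random samples."""
    return statistics.mean(
        [statistics.mean(rng.sample(frame, n)) for _ in range(repeats)]
    )

full_frame = mean_of_samples(population, 50, 200, rng)       # everyone can be selected
undercovered = mean_of_samples(with_landline, 50, 200, rng)  # one group is excluded

print(round(true_mean, 1), round(full_frame, 1), round(undercovered, 1))
```

Random samples from the full frame average out close to the true mean, while samples from the undercovered frame miss it in the same direction every time. Repeating the sampling does not remove the bias.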
Non-Random/Biased Sampling Methods¶
Quota Sampling¶
Quota sampling is when a population is first divided into distinct groups. Because selection within the groups is non-random, it is prone to bias.
Elements are then chosen for the sample from the different groups
- The number of elements from each group in the sample (the quotas) often correspond to the proportion of the population in that group
Elements are chosen from each group for the sample using a non-random method until each quota is filled
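The quota-filling procedure can be sketched as follows, assuming an interviewer who simply takes people in the order they are met. The passers-by, the age groups and the quota sizes are all hypothetical.

```python
def quota_sample(stream, quotas, group_of):
    """Take people in the order they are encountered until every quota is full."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    sample = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return sample

# Hypothetical passers-by as (name, age_group), in the order the interviewer meets them
passers_by = [
    ("p1", "adult"), ("p2", "adult"), ("p3", "child"), ("p4", "adult"),
    ("p5", "adult"), ("p6", "child"), ("p7", "child"), ("p8", "adult"),
]
quotas = {"adult": 3, "child": 2}
sample = quota_sample(passers_by, quotas, group_of=lambda person: person[1])
print(sample)
```

No random selection is involved: who ends up in the sample depends entirely on who the interviewer happens to meet first, which is where bias can enter.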
Pros:
- Useful with small samples that need to be representative of the population structure
- Quick & Cheap
- Does not require a list of all elements in the population
Cons:
- Non-random sampling method
- Nonresponse bias
- Interviewer may have their own biases
Convenience Sampling/Opportunity Sampling¶
Convenience sampling/Opportunity sampling is when a researcher samples whoever happens to be easily accessible at that particular time and/or place
Researchers decide who to include
Pros:
- Useful with small samples
- Quick & Cheap
- Does not require a list of all elements in the population
Cons:
- Non-random sampling method
- Nonresponse bias
- Interviewer may have their own biases
- Not likely to be representative
Voluntary Response Sampling¶
A voluntary response sample is one in which individuals choose (volunteer) whether or not to participate
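A rough simulation shows how self-selection can distort results. The response probabilities below are assumptions chosen only to illustrate the mechanism (people with extreme opinions responding far more often). They are not measured values.

```python
import random

rng = random.Random(7)
# Hypothetical population: satisfaction scores from 1 (very unhappy) to 5 (very happy)
population = [rng.randint(1, 5) for _ in range(10000)]

def responds(score, rng):
    """Assumed behaviour: extreme scores respond 30% of the time, others 3%."""
    probability = 0.30 if score in (1, 5) else 0.03
    return rng.random() < probability

responses = [score for score in population if responds(score, rng)]

# Share of extreme scores (1 or 5) in the population and among the responses
extreme_share_population = sum(s in (1, 5) for s in population) / len(population)
extreme_share_responses = sum(s in (1, 5) for s in responses) / len(responses)

print(round(extreme_share_population, 2), round(extreme_share_responses, 2))
print(round(len(responses) / len(population), 2))  # overall response rate
```

Under these assumptions, extreme opinions are heavily over-represented among the responses and the overall response rate is low. These are the two drawbacks usually associated with this method.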
Pros:
- Quick & Cheap
- Does not require a list of all elements in the population
- Good for platforms that have a high volume of users (to compensate for the low response rate)
Cons:
- Non-random sampling method
- Voluntary response bias
- Low response rate