Glossary

Random Sampling

Definition

Random sampling is a probability sampling method in which every member of the population has an equal and independent chance of being selected for the study, with selection left to a chance mechanism rather than to the researcher's judgement. It does not guarantee that any single sample is representative, but it removes systematic selection bias and makes samples representative of the population on average, which provides the statistical foundation for generalising findings from the sample to the broader population.
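
As a minimal sketch of the definition, the Python snippet below draws a simple random sample without replacement using only the standard library. The list of integer IDs, the function name draw_simple_random_sample, and the seed value are illustrative assumptions, not part of any particular study.

```python
import random


def draw_simple_random_sample(population, size, seed=None):
    """Return `size` distinct members of `population`, chosen uniformly at random."""
    rng = random.Random(seed)  # a dedicated generator keeps the draw reproducible
    return rng.sample(population, size)  # sampling without replacement


# Hypothetical sampling frame: 100,000 register IDs (an assumption for illustration).
register_ids = list(range(1, 100_001))
sample_ids = draw_simple_random_sample(register_ids, size=5_000, seed=42)
print(len(sample_ids), len(set(sample_ids)))
```

Fixing the seed is a design choice that makes the selection auditable: anyone with the same sampling frame and seed can reproduce exactly the same sample.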

Why It Matters

Without random sampling, the characteristics of the sample may differ systematically from those of the population, leading to biased estimates and invalid conclusions. Random sampling underpins the validity of confidence intervals, hypothesis tests, and survey inference. It is the gold standard for survey research and quality-assurance audits. Clinical trials rely on the related but distinct principle of random assignment of participants to treatment groups.
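
The effect of systematic selection can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population of ages is synthetic, and the "convenience sample" is modelled, hypothetically, as the 500 youngest people, as if they were the easiest to reach.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical population: 10,000 adults aged 18 to 90, sorted youngest first,
# as if the list were ordered by how easy each person is to reach.
population_ages = sorted(rng.randint(18, 90) for _ in range(10_000))

convenience_sample = population_ages[:500]         # the first 500 people reached
random_sample = rng.sample(population_ages, 500)   # 500 people chosen by chance

population_mean = mean(population_ages)
convenience_error = abs(mean(convenience_sample) - population_mean)
random_error = abs(mean(random_sample) - population_mean)
print(round(convenience_error, 1), round(random_error, 1))
```

The convenience sample misses the true mean age by decades because its selection is tied to age itself, while the random sample's error reflects chance variation only and shrinks as the sample grows.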

Example

A national health agency wants to estimate the prevalence of diabetes in adults. It uses a computer-generated random number sequence to select 5,000 individuals from the national population register. Because every adult has an equal chance of selection, the resulting sample is expected to mirror the age, sex, and regional distribution of the country closely. The estimated prevalence of 8.2%, with a 95% confidence interval of roughly [7.4%, 9.0%], can then be generalised to the entire adult population, provided the register is complete and non-response is negligible.
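
The arithmetic behind an interval of this kind can be sketched as follows. The snippet assumes a simple random sample and uses the normal-approximation (Wald) formula, which may differ from the method a real agency would apply (for example a design-based or Wilson interval); the function name wald_interval is illustrative. For a prevalence of 8.2% and n = 5,000 it gives a half-width of about 0.76 percentage points.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)      # z times the standard error
    return p_hat - margin, p_hat + margin


lower, upper = wald_interval(p_hat=0.082, n=5_000)
print(f"{lower:.1%} to {upper:.1%}")
```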

Software Notes

  • SPSS: Random sampling is a data-collection procedure, not an analysis step. To select a random subsample from an existing SPSS dataset, use Data > Select Cases > Random sample of cases, then specify either an approximate percentage or an exact number of cases.
  • R: sample(population, size = 5000) draws a simple random sample without replacement. For a data frame df, use dplyr::slice_sample(df, n = 5000). For stratified random sampling, use df %>% dplyr::group_by(strata) %>% dplyr::slice_sample(n = 100), which draws 100 rows per stratum. Call set.seed() first to make the draw reproducible.
  • Stata: sample 10 keeps a 10% random sample of the observations in memory and drops the rest. sample 500, count keeps exactly 500 observations. For stratified sampling: bysort strata: sample 50, count. Run set seed first to make the draw reproducible.
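
For readers working outside these packages, the stratified commands above can be approximated in plain Python. This is a minimal sketch rather than an equivalent of any package's implementation: the function stratified_sample, the (person_id, region) record layout, and the four region names are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, per_stratum, seed=None):
    """Draw up to `per_stratum` records at random from each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for record in records:
        by_stratum[stratum_of(record)].append(record)

    sample = []
    # Sorting the stratum labels (assumed comparable) keeps the result
    # reproducible for a given seed.
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        k = min(per_stratum, len(members))  # small strata are taken in full
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical records: (person_id, region), 500 people in each of four regions.
regions = ("north", "south", "east", "west")
records = [(i, regions[i % 4]) for i in range(2_000)]
sample = stratified_sample(records, stratum_of=lambda r: r[1], per_stratum=50, seed=1)
print(len(sample))
```

Sampling a fixed number per stratum guarantees every group is present, but it gives members of small strata a higher selection chance than members of large ones, so estimates for the whole population generally need weighting.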