Sample Size and Power: What Every Researcher Should Know

What Is Statistical Power?

Statistical power is the probability that your study will detect a true effect when one exists. Formally, power = 1 − β, where β is the probability of a Type II error (failing to detect a real effect).

A study with 80% power has a 20% chance of missing a genuine effect. A study with 50% power, distressingly common in published research, is essentially a coin flip. Underpowered studies produce unreliable results, overestimate effect sizes when they do reach significance, and contribute to the replication crisis.

The conventional minimum power threshold is 80% (β = 0.20), though many funding bodies and journals now expect 90% power for primary outcomes.
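
To make the definition concrete, here is a minimal Python sketch that approximates the power of a two-sided comparison between two equal-sized groups. It relies on a normal approximation rather than the exact t-distribution, so it is slightly optimistic for small samples. The function name approx_power and its arguments are illustrative assumptions, not part of any particular tool.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation).

    d           -- standardised effect size (Cohen's d)
    n_per_group -- participants in each of the two equal-sized groups
    """
    z = NormalDist()                     # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)    # critical value for a two-sided test
    shift = d * sqrt(n_per_group / 2)    # expected test statistic under the true effect
    # Probability of landing in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = approx_power(d=0.5, n_per_group=64)
beta = 1 - power                         # Type II error rate
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

With a medium effect and 64 participants per group, the approximation lands close to the conventional 80% threshold. With no true effect (d = 0) the function returns α, which is the false-positive rate.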

The Four Factors That Determine Sample Size

Sample size calculations depend on four interacting factors. Change any one and the required sample size changes:

1. Effect size — The magnitude of the effect you expect or consider meaningful. Larger effects require smaller samples. If you expect a large effect (Cohen's d = 0.8), you may need only 26 participants per group. A small effect (d = 0.2) might require 394 per group.

2. Significance level (α) — The threshold for declaring a result statistically significant. The conventional standard is α = 0.05 (two-tailed). A stricter threshold (α = 0.01) requires a larger sample.

3. Power (1 − β) — The desired probability of detecting a true effect. Higher power demands larger samples. Going from 80% to 90% power typically increases the required sample size by about a third.

4. Variability — The spread of the data. More variable data (larger standard deviations) require larger samples. This is why pilot studies are valuable — they estimate the variability in your data.
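
The way these factors interact can be sketched with the standard normal-approximation formula for two independent groups, n per group ≈ 2 × (z for α + z for power)² / d². Variability enters through d, which is the raw difference divided by the standard deviation. This is a rough sketch, not an exact calculation: it tends to come out about one participant per group lower than t-based figures such as those quoted above. The function name n_per_group is an assumption made for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # grows as alpha gets stricter
    z_beta = z.inv_cdf(power)            # grows as desired power increases
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


# Effect size: smaller effects need far larger samples
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} per group")

# Stricter alpha and higher power both push the sample size up
print("alpha = 0.01:", n_per_group(0.5, alpha=0.01))
print("power = 0.90:", n_per_group(0.5, power=0.90))
```

Halving the effect size roughly quadruples the required sample, because d is squared in the denominator.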

A Priori vs Post Hoc Power

A priori power analysis is conducted before data collection to determine the required sample size. This is the gold standard and is required by most ethics committees and funders.

Post hoc (observed) power is calculated after the study using the observed effect size. This is problematic and widely discouraged. Observed power is a direct function of the p-value, so a non-significant result always comes with low observed power (roughly 50% or less when p > 0.05), and it tells you nothing beyond what the p-value already tells you. Instead of post hoc power, report the effect size and its confidence interval.
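
The redundancy of observed power can be illustrated with a short sketch. Under a simple two-sided z-test approximation (an assumption; real analyses usually involve t-tests, where the relationship is similar but not identical), observed power can be computed from the p-value alone. The function name observed_power is hypothetical.

```python
from statistics import NormalDist


def observed_power(p_value, alpha=0.05):
    """Observed power implied by a two-sided p-value (z-test approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    z_obs = z.inv_cdf(1 - p_value / 2)   # |z| that would produce this p-value
    # Treat the observed effect as if it were the true effect
    return z.cdf(z_obs - z_crit) + z.cdf(-z_obs - z_crit)


for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.2f}")
```

No study data enter the function beyond the p-value, which is why reporting observed power adds no information.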

Practical Steps for Sample Size Calculation

Step 1: Define your primary outcome and test. Your sample size calculation should be based on the primary hypothesis, not secondary outcomes.

Step 2: Choose a meaningful effect size. Use one of three approaches:

Literature-based: Use the effect size from a similar published study. Be cautious — published effects tend to be inflated.
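Pilot study: Conduct a small pilot and use the observed effect size. Because pilot estimates are imprecise, plan conservatively: assume a somewhat smaller effect than the pilot showed, which adjusts the sample size upward.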
Minimum clinically important difference (MCID): Define the smallest effect that would matter in practice. This is the most defensible approach and the one preferred by regulators and funders.
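
Whichever approach you use, calculators that expect Cohen's d need the raw difference expressed in standard deviation units. A minimal sketch of that conversion follows. The numbers are hypothetical and chosen only for illustration, and the helper names pooled_sd and cohens_d are assumptions.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(raw_difference, sd):
    """Standardise a raw difference (for example an MCID) by the outcome's SD."""
    return raw_difference / sd


# Hypothetical example: an MCID of 5 points, with SDs of 9 and 11
# observed in two pilot groups of 15 participants each
sd = pooled_sd(sd1=9.0, n1=15, sd2=11.0, n2=15)
d = cohens_d(raw_difference=5.0, sd=sd)
print(f"pooled SD = {sd:.2f}, d = {d:.2f}")
```

The same raw difference gives a smaller d when the outcome is more variable, which is how variability feeds into the required sample size.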

Step 3: Select α and power. The standard choices are α = 0.05 and power = 0.80, but adjust if your context demands it.

Step 4: Use a sample size calculator or statistical software. Our Sample Size Calculator and Power Analysis Calculator handle the most common study designs. For more complex designs, use G*Power (free), R (pwr package), or Stata (power command).

Step 5: Adjust for practical considerations. Add participants to account for:

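Attrition: If you expect 20% dropout, recruit 25% more participants (divide the required sample size by 1 − 0.20).
Clustering: Cluster-randomised trials need larger samples. Multiply the sample size by the design effect, 1 + (m − 1) × ICC, where m is the average cluster size and ICC is the intraclass correlation coefficient.
Multiple comparisons: If you test multiple outcomes, apply a correction (for example Bonferroni, which divides α by the number of tests) and recompute the sample size at the corrected α.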
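
These adjustments are simple arithmetic and can be sketched as follows. The starting sample size, cluster size, ICC, and number of outcomes are hypothetical values chosen for illustration, and the function names are assumptions. Bonferroni is shown as one possible correction, not the only one.

```python
from math import ceil

EPS = 1e-9  # guards against floating-point results such as 80.00000000000001


def inflate_for_attrition(n, dropout):
    """Participants to recruit so that about n remain after dropout."""
    return ceil(n / (1 - dropout) - EPS)


def design_effect(cluster_size, icc):
    """Inflation factor for cluster-randomised designs: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


n = 64                                           # per group, before adjustment
print(inflate_for_attrition(n, dropout=0.20))    # 20% dropout means 25% more

# Hypothetical cluster trial: 20 participants per cluster, ICC = 0.05
n_clustered = ceil(n * design_effect(cluster_size=20, icc=0.05) - EPS)
n_recruit = inflate_for_attrition(n_clustered, dropout=0.20)
print(n_clustered, n_recruit)

# Corrected alpha to feed back into the sample size calculation
print(bonferroni_alpha(0.05, n_tests=3))
```

The clustering inflation is applied before the attrition inflation here; because both are multiplicative, the order only matters for rounding.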

Common Study Designs and Their Sample Size Requirements

For two independent groups (two-tailed t-test, α = 0.05, power = 0.80):

Effect Size (Cohen's d)	Sample Size per Group	Total N
Small (0.2)	394	788
Medium (0.5)	64	128
Large (0.8)	26	52

For one-way ANOVA with 3 groups (α = 0.05, power = 0.80):

Effect Size (f)	Sample Size per Group	Total N
Small (0.10)	322	966
Medium (0.25)	52	156
Large (0.40)	21	63

For correlation (two-tailed, α = 0.05, power = 0.80):

Effect Size (r)	Required N
Small (0.10)	783
Medium (0.30)	85
Large (0.50)	29
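
For correlations, a common approximation uses the Fisher z-transformation: N ≈ ((z for α + z for power) / atanh(r))² + 3. The sketch below is one way to write it. It is an approximation, so its results can differ by a participant or so from tabulated or exact values. The function name n_for_correlation is an assumption.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate total N to detect correlation r (two-sided, Fisher z)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    # atanh(r) is the Fisher z-transform of r; its standard error is 1/sqrt(N - 3)
    return ceil(((z_alpha + z_beta) / atanh(r)) ** 2 + 3)


for r in (0.10, 0.30, 0.50):
    print(f"r = {r:.2f}: N of about {n_for_correlation(r)}")
```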

The Ethics of Underpowered Studies

Underpowered studies are not just statistically weak — they are ethically problematic. Participants are exposed to the burdens of research (time, inconvenience, risk) without a reasonable chance of producing meaningful results. Funding bodies, ethics committees, and journals increasingly require power analyses as part of study protocols.

If your power analysis reveals that the required sample size is beyond your resources, consider: collaborating with other sites to pool participants, using a within-subjects design (which typically requires fewer participants), narrowing your research question to a comparison where a larger effect is plausible, or reframing the work as a pilot study with a relaxed significance level.
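
The saving from a within-subjects design can be sketched under a normal approximation. In a paired design the relevant variability is that of the difference scores, which shrinks as the correlation between each participant's two measurements (rho below) increases. The value of rho is an assumption you would need to justify from prior data, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def _z_sum_squared(alpha, power):
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def n_between(d, alpha=0.05, power=0.80):
    """Total N for two independent groups (normal approximation)."""
    return 2 * ceil(2 * _z_sum_squared(alpha, power) / d ** 2)


def n_within(d, rho, alpha=0.05, power=0.80):
    """Total N for a paired design; rho is the correlation between
    each participant's two measurements (an assumed value)."""
    # SD of difference scores = SD * sqrt(2 * (1 - rho))
    return ceil(2 * (1 - rho) * _z_sum_squared(alpha, power) / d ** 2)


print("between-subjects total:", n_between(0.5))
print("within-subjects total (rho = 0.5):", n_within(0.5, rho=0.5))
```

The saving depends heavily on rho: when the repeated measurements are only weakly correlated, the advantage shrinks.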

How We Can Help

Our power analysis service takes the guesswork out of sample size planning. We calculate the minimum sample size for your study design, choose defensible effect size estimates, and produce a power analysis report suitable for ethics committees and funders.

[Calculate Your Sample Size → /resources/calculators/sample-size-calculator/]

[Get a Power Analysis Consultation → /contact/]