Glossary

Regression Analysis

Definition

Regression analysis is a set of statistical methods for modelling the relationship between a dependent variable and one or more independent variables. Linear regression fits the straight line (or, with several predictors, the plane) that minimises the sum of squared residuals, while logistic regression and other generalised linear models handle outcomes that are not continuous, such as binary or count data.
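As a minimal sketch of what "best-fitting straight line" means for a single predictor, the closed-form least-squares solution can be written in a few lines of plain Python. The function name fit_simple_ols and the data values are made up for illustration; they do not come from any real study.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: advertising spend and sales, in arbitrary units.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(ad_spend, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * ad_spend")
```

The slope is the estimated change in the outcome per one-unit change in the predictor. Statistical packages additionally report standard errors and p-values, which this sketch omits.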

Why It Matters

Regression is the workhorse of quantitative research. It allows you to control for confounding variables, test theoretical predictions, and estimate the magnitude of effects. Whether you are predicting sales from advertising spend, examining the impact of education on earnings, or modelling patient recovery times, regression provides a flexible and interpretable framework for inference and prediction.

Example

An economist studies the effect of years of education on hourly wages while controlling for age and experience. A multiple linear regression yields a coefficient of 1.8 for education, meaning each additional year of schooling is associated with an estimated £1.80 increase in hourly wages, holding age and experience constant. The 95% confidence interval [1.2, 2.4] indicates the precision of this estimate; because it excludes zero, the association is statistically significant at the 5% level.
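The phrase "holding age and experience constant" can be illustrated with a small sketch that fits a multiple regression by solving the normal equations. Everything here is an assumption made for illustration: the helper names, the eight data rows, and the noise-free wage formula (chosen so that the education coefficient comes out at 1.8). Real data contain noise, and a real analysis would use a statistics package that also reports standard errors.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix (perfectly collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of ones
    k = len(X[0])
    xtx = [[sum(xr[i] * xr[j] for xr in X) for j in range(k)] for i in range(k)]
    xty = [sum(xr[i] * yi for xr, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Predicted outcome for one row of predictor values."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], row))


# Hypothetical rows: (years of education, age, years of experience).
people = [
    (10, 30, 12), (12, 25, 5), (16, 40, 15), (14, 35, 18),
    (18, 45, 20), (11, 50, 30), (16, 28, 4), (13, 55, 25),
]
# Noise-free wages generated from assumed coefficients, for illustration only.
wages = [2.0 + 1.8 * ed + 0.05 * age + 0.2 * ex for ed, age, ex in people]

coefs = fit_ols(people, wages)

# Two hypothetical people who differ only by one year of education.
gap = predict(coefs, (13, 40, 15)) - predict(coefs, (12, 40, 15))
print(f"education coefficient: {coefs[1]:.3f}, predicted wage gap: {gap:.3f}")
```

The predicted gap between two otherwise identical people equals the education coefficient, which is what the "holding constant" interpretation expresses.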

Software Notes

  • SPSS: Analyze > Regression > Linear (for continuous outcomes) or Binary Logistic (for dichotomous outcomes). Check Statistics > Confidence intervals and Collinearity diagnostics.
  • R: model <- lm(y ~ x1 + x2, data = df) for linear regression. summary(model) returns coefficients, standard errors, p-values, and R². For logistic: glm(y ~ x1 + x2, family = binomial, data = df).
  • Stata: regress y x1 x2 for OLS. logit y x1 x2 or probit y x1 x2 for binary outcomes. estat vif, run after regress, reports variance inflation factors for checking multicollinearity.
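The collinearity diagnostics mentioned above are typically based on the variance inflation factor, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the other predictors. With exactly two predictors this reduces to 1 / (1 - r²), where r is their Pearson correlation. The sketch below covers only that two-predictor case; the function names and data are hypothetical, and it is not a reproduction of what any particular package computes internally.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    if abs(r) == 1.0:
        raise ValueError("predictors are perfectly collinear; VIF is undefined")
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictor values.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

vif = vif_two_predictors(x1, x2)
print(f"r = {pearson_r(x1, x2):.2f}, VIF = {vif:.2f}")
```

A VIF of 1 means the predictors are uncorrelated; larger values indicate that a coefficient's variance is inflated by overlap with the other predictor.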