Glossary
Histogram
Definition
A histogram is a graphical representation of the distribution of a continuous variable, constructed by dividing the data range into bins and plotting the frequency of observations in each bin as adjacent rectangles.
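The construction described in the definition can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `histogram_counts`, the equal-width binning rule, and the sample values are all assumptions made for the example.

```python
def histogram_counts(values, num_bins):
    """Count observations in num_bins equal-width bins spanning the data range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; the data range has zero width")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Each bin includes its left edge; the maximum value goes in the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Made-up measurements used only for illustration (not real data).
sample = [1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
edges, counts = histogram_counts(sample, num_bins=4)

# Text rendering: one row per bin, one '#' per observation.
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {'#' * n}")
```

Plotting software draws each count as the height of a rectangle over its bin, which gives the adjacent bars of the finished histogram.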
Why It Matters
Histograms are one of the first and most informative tools in exploratory data analysis. They reveal the shape of a distribution (whether it is symmetric, skewed, unimodal, or multimodal) and help detect outliers and unusual patterns. Box plots reduce the data to a few summary statistics, whereas histograms show the full shape of the distribution, including features such as multiple modes, so the two are complementary tools. The choice of bin width can significantly affect interpretation: bins that are too wide can hide features such as multiple modes, while bins that are too narrow produce a noisy display. For this reason many practitioners also use density plots as a smoother alternative. Overlaying multiple histograms is a common way to compare groups.
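A small sketch can make the effect of bin width concrete. It assumes equal-width bins and uses made-up values with two clusters. The helper `bin_counts` is hypothetical and written only for this illustration.

```python
def bin_counts(values, num_bins):
    """Return observation counts for num_bins equal-width bins over the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin.
        counts[min(int((v - lo) / width), num_bins - 1)] += 1
    return counts


# Made-up data with two clusters, one near 2 and one near 8 (illustration only).
data = [1, 2, 2, 3, 7, 8, 8, 9]

coarse = bin_counts(data, 2)  # wide bins
fine = bin_counts(data, 5)    # narrower bins

for label, counts in (("2 bins", coarse), ("5 bins", fine)):
    print(label, counts)
```

With two bins the counts are equal and the display looks flat. With five bins an empty middle bin separates the two clusters, so the same data now appears bimodal.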
Example
A data analyst examining the distribution of household incomes in a city plots a histogram with 20 bins. The resulting display is right-skewed, with a long tail extending toward high incomes, indicating that most households earn below the mean. A normal quantile plot confirms the departure from normality, suggesting that median-based summaries and non-parametric tests are more appropriate than means and t-tests.
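The link between right skew and the mean can be checked numerically. The sketch below uses Python's standard `statistics` module on a small set of invented incomes. The figures are assumptions chosen for illustration and do not come from any real city.

```python
import statistics

# Invented household incomes in thousands (illustration only, not real data).
incomes = [22, 25, 28, 30, 32, 35, 38, 42, 60, 188]

mean_income = statistics.fmean(incomes)
median_income = statistics.median(incomes)

# Share of households whose income falls below the mean.
share_below_mean = sum(1 for x in incomes if x < mean_income) / len(incomes)

print(f"mean = {mean_income:.1f}, median = {median_income:.1f}")
print(f"share below mean = {share_below_mean:.0%}")
```

In this sample the single very high income pulls the mean well above the median, so most households fall below the mean. This is why a median-based summary describes the typical household better here.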
Software Notes
- SPSS: Graphs > Chart Builder > Histogram; drag variable to x-axis and adjust bin width under Element Properties
- R: hist(x, breaks = 20, main = "Distribution", xlab = "Value", col = "steelblue")
- Stata: histogram varname, bin(20) frequency title("Distribution")