Descriptive Statistics: Summarizing a Dataset
Descriptive statistics reduce a list of numbers to a few summary values that describe the data's center, spread, and shape. Each statistic answers a different question about the data:
Mean (Average)
Sum all values and divide by the count. Sensitive to outliers — one extreme value pulls the mean toward it.
Dataset: [2, 3, 5, 7, 100]
Mean = (2+3+5+7+100) / 5 = 23.4
The mean of 23.4 doesn't represent this dataset well — 4 of 5 values are below 10. That's the outlier problem.
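To check the arithmetic above, here is a minimal Python sketch using the standard library's statistics module. The variable names are illustrative only. It also recomputes the mean without the outlier to show how far one value can move it.

```python
from statistics import mean

data = [2, 3, 5, 7, 100]

# Sum of the values divided by the count: 117 / 5
avg = mean(data)
print(avg)  # 23.4

# Drop the outlier (100) and the mean falls to 17 / 4
avg_without_outlier = mean([x for x in data if x != 100])
print(avg_without_outlier)  # 4.25

# How many values sit below 10?
below_ten = sum(1 for x in data if x < 10)
print(below_ten)  # 4
```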
Median
The middle value when data is sorted. Largely unaffected by outliers: replacing 100 with 1,000 in the example below would leave the median unchanged. The better measure of "typical" when data is skewed.
Sorted: [2, 3, 5, 7, 100]
Median = 5 (the 3rd value)
For even counts, the median is the average of the two middle values. The median of [2, 3, 5, 7] is (3+5)/2 = 4.
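The odd and even cases can be sketched in a few lines of Python. The function name median_of is a hypothetical helper written for illustration; in practice the standard library's statistics.median does the same job.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median_of() needs at least one value")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value
        return ordered[mid]
    # Even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median_of([100, 2, 7, 3, 5])   # input does not need to be sorted
even_case = median_of([2, 3, 5, 7])
print(odd_case)   # 5
print(even_case)  # 4.0
```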
Mode
The most frequent value. Useful for categorical data and discrete values. A dataset can have no mode, one mode (unimodal), or multiple modes (bimodal/multimodal).
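The three cases (no mode, one mode, several modes) can be illustrated with a small Python sketch. The helper name modes is hypothetical, and it assumes one common convention: when every value appears exactly once, the dataset is treated as having no mode.

```python
from collections import Counter

def modes(values):
    """Return a list of the most frequent values (empty if there is no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Assumed convention: several distinct values, none repeated -> no mode
    if top == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == top]

print(modes([2, 3, 5, 7, 100]))          # []
print(modes([1, 2, 2, 3]))               # [2]
print(modes([1, 1, 2, 2, 3]))            # [1, 2]
print(modes(["red", "blue", "red"]))     # ['red']
```

The last call shows why the mode suits categorical data: it needs only counting, not arithmetic.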
Range
Maximum minus minimum. Simple and intuitive but uses only two data points and is extremely sensitive to outliers.
[2, 3, 5, 7, 100] → Range = 100 − 2 = 98
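A short Python sketch makes the outlier sensitivity concrete. The variable names are illustrative.

```python
data = [2, 3, 5, 7, 100]

# Range uses only the two extreme values
data_range = max(data) - min(data)
print(data_range)  # 98

# Without the outlier, the range collapses
trimmed = [x for x in data if x != 100]
trimmed_range = max(trimmed) - min(trimmed)
print(trimmed_range)  # 5
```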
Variance & Standard Deviation
Both measure how spread out the data is around the mean. Variance is in squared units; standard deviation is in the original units and is therefore easier to interpret. Low SD = data clustered near the mean. High SD = data widely dispersed.
For the dataset [2, 3, 5, 7, 100]:
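Population variance = 7349.2 / 5 = 1469.84
Population standard deviation = √1469.84 ≈ 38.3 (very high, driven by the outlier 100)
Sample standard deviation (dividing by n − 1 = 4 instead of n = 5) ≈ 42.9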
Standard deviation is the square root of variance. For a normal distribution, ~68% of values fall within 1 SD of the mean, ~95% within 2 SD.
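The following Python sketch computes these values with the standard library's statistics module (Python 3.8 or later). It shows both the population formulas (divide by n) and the sample formulas (divide by n − 1), since the two give different results on small datasets. It also checks the normal-distribution percentages with statistics.NormalDist. The variable names are illustrative.

```python
from statistics import NormalDist, pstdev, pvariance, stdev, variance

data = [2, 3, 5, 7, 100]

# Population versions: divide the sum of squared deviations by n
pop_var = pvariance(data)   # about 1469.84
pop_sd = pstdev(data)       # about 38.34

# Sample versions: divide by n - 1 (the usual choice when data is a sample)
sample_var = variance(data)  # about 1837.3
sample_sd = stdev(data)      # about 42.86

print(round(pop_var, 2), round(pop_sd, 2))
print(round(sample_var, 2), round(sample_sd, 2))

# Share of a normal distribution within 1 and 2 SD of the mean
standard_normal = NormalDist(mu=0, sigma=1)
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(round(within_1_sd, 4), round(within_2_sd, 4))
```

Note that the 68% and 95% figures apply to normally distributed data. A dataset dominated by one outlier, like the one above, does not follow them.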
Quartiles (Q1, Q2/Median, Q3)
Divide sorted data into four equal parts. Q1 = 25th percentile, Q2 = 50th (median), Q3 = 75th. The interquartile range (IQR = Q3 − Q1) measures the spread of the middle 50% of the data, so extreme values do not affect it, which makes it more robust than the range. Values above Q3 + 1.5×IQR or below Q1 − 1.5×IQR are flagged as potential outliers (Tukey's fences).
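As a rough illustration, the sketch below computes quartiles, the IQR, and Tukey's fences with Python's statistics.quantiles (Python 3.8 or later). One assumption matters here: several quartile conventions exist, and on small datasets they can disagree noticeably. This example uses method="inclusive", which treats the data as the full population and interpolates between the minimum and maximum. Other tools may report different Q1 and Q3 for the same numbers.

```python
from statistics import quantiles

data = [2, 3, 5, 7, 100]

# Three cut points split the sorted data into four parts
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 3.0 5.0 7.0 4.0

# Tukey's fences: anything outside is flagged as a potential outlier
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence)  # -3.0 13.0
print(outliers)                  # [100]
```

With the default method="exclusive", the same five values give a much larger Q3, because the interpolation reaches toward 100. The choice of convention is worth stating whenever quartiles are reported for small samples.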
When to Use Each Measure
- Mean: Symmetric data with no outliers. Income data is a counterexample: a few billionaires pull the mean far above what most people earn, so use the median there.
- Median: Skewed data or data with outliers. House prices, salaries, load times.
- Standard deviation: When you need to know the typical deviation from the mean. Quality control, test scores, financial volatility.
- IQR: When you want a robust measure of spread. Report median + IQR for skewed data.
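One way to turn the guidance above into a rule of thumb is sketched below. This is only an illustrative heuristic, not a standard procedure: the helper name summarize is hypothetical, and the decision rule (report median + IQR whenever Tukey's fences flag any value, otherwise mean + sample SD) is an assumption made for this example. It needs at least two values.

```python
from statistics import mean, median, quantiles, stdev

def summarize(values):
    """Hypothetical helper: choose a center/spread pair for reporting."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    has_outliers = any(x < lower or x > upper for x in values)
    if has_outliers:
        # Outliers present: use the robust pair
        return {"center": ("median", median(values)), "spread": ("IQR", iqr)}
    # No flagged values: mean and sample SD are reasonable
    return {"center": ("mean", mean(values)), "spread": ("SD", stdev(values))}

skewed = summarize([2, 3, 5, 7, 100])
clean = summarize([2, 3, 5, 7])
print(skewed["center"])  # ('median', 5)
print(clean["center"])   # ('mean', 4.25)
```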
Calculate Statistics Instantly
Paste a list of numbers into ToolsVito's Statistics Calculator and get the mean, median, mode, range, standard deviation, variance, quartiles, and sum — all computed instantly in your browser.