Measures of Central Tendency: Mean, Median, and Mode
Measures of central tendency are statistics that describe the center, or typical value, of a dataset. The three most commonly used measures are the mean, the median, and the mode. Each one summarizes a large dataset with a single representative value, and comparing them gives a first indication of how the data is distributed.
1. Mean (Arithmetic Mean)
Definition
The mean, or arithmetic mean, is the sum of all values in a dataset divided by the total number of values. It is the most commonly used measure of central tendency and represents the average value of the data.
Formula
For a dataset with n values:
$$\bar{X} = \frac{\sum X_i}{n}$$
Where:
- $\sum X_i$ = Sum of all values in the dataset
- $n$ = Number of observations (data points)
Example
Consider the following dataset representing the ages of five students:
Ages: 18, 20, 22, 24, 26
Step 1: Sum all values: $18 + 20 + 22 + 24 + 26 = 110$
Step 2: Divide by the number of values: $\text{Mean} = \frac{110}{5} = 22$
So, the mean age is 22 years.
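The two steps above can be sketched in Python as follows. This is a minimal illustration using the ages from the example; the variable names are chosen here for readability, and `statistics.fmean` from the standard library (Python 3.8+) is included only as a cross-check.

```python
from statistics import fmean

# Ages of the five students from the example above
ages = [18, 20, 22, 24, 26]

# Step 1: sum all values
total = sum(ages)

# Step 2: divide by the number of values
manual_mean = total / len(ages)

print(total)        # 110
print(manual_mean)  # 22.0

# Cross-check with the standard library
library_mean = fmean(ages)
print(library_mean == manual_mean)  # True
```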
Types of Mean
1.1 Weighted Mean
A weighted mean assigns different weights to values, useful when some values contribute more to the dataset than others.
$$\text{Weighted Mean} = \frac{\sum (X_i \times W_i)}{\sum W_i}$$
Where:
- $X_i$ = Data values
- $W_i$ = Weights assigned to each value
Example of Weighted Mean
Suppose a student’s grades in three subjects are:
Subject | Grade | Weight (Credits) |
---|---|---|
Math | 80 | 3 |
Science | 90 | 4 |
English | 85 | 2 |
$$\text{Weighted Mean} = \frac{(80 \times 3) + (90 \times 4) + (85 \times 2)}{3 + 4 + 2} = \frac{240 + 360 + 170}{9} = \frac{770}{9} \approx 85.56$$
So, the student’s weighted average score is 85.56.
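As a rough sketch, the same calculation can be written in Python. The helper name `weighted_mean` and the dictionary layout are assumptions made for this illustration, not part of any library; the grades and credits are the ones from the table above.

```python
# (grade, credits) per subject, taken from the table above
grades = {"Math": (80, 3), "Science": (90, 4), "English": (85, 2)}


def weighted_mean(pairs):
    """Return sum(x * w) / sum(w) for an iterable of (value, weight) pairs."""
    pairs = list(pairs)
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in pairs) / total_weight


result = weighted_mean(grades.values())
print(round(result, 2))  # 85.56
```

Setting every weight to the same value reduces this to the ordinary arithmetic mean.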
Advantages of Mean
✅ Easy to calculate and understand.
✅ Uses all values in the dataset, making it a good overall indicator.
✅ Widely used in finance, economics, and research.
Disadvantages of Mean
❌ Affected by outliers (extreme values).
❌ Not always a good measure if data is skewed.
❌ Not suitable for categorical data (e.g., gender, colors).
2. Median
Definition
The median is the middle value in an ordered dataset. If the dataset has an even number of observations, the median is the average of the two middle values.
The median is useful when data contains outliers or is skewed, as it is less affected by extreme values.
Steps to Find the Median
Case 1: Odd Number of Observations
- Arrange the data in ascending order.
- Find the middle value.
Example:
Dataset: 10, 15, 20, 25, 30
- The middle value is 20 (since it is the 3rd number out of 5).
- So, the median = 20.
Case 2: Even Number of Observations
- Arrange the data in ascending order.
- Take the two middle values and find their average.
Example:
Dataset: 10, 15, 20, 25, 30, 35
- Middle values: 20, 25
- Median = (20 + 25) / 2 = 22.5
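Both cases can be handled by one short function. The sketch below is one possible way to write it; `median_by_hand` is a name made up for this example, and the standard library's `statistics.median` is used only to confirm the results.

```python
from statistics import median


def median_by_hand(values):
    """Sort the data, then pick the middle value (odd n)
    or average the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [10, 15, 20, 25, 30]
even_data = [10, 15, 20, 25, 30, 35]

print(median_by_hand(odd_data))   # 20
print(median_by_hand(even_data))  # 22.5

# Cross-check against the standard library
print(median(odd_data) == median_by_hand(odd_data))    # True
print(median(even_data) == median_by_hand(even_data))  # True
```

The function sorts its input first, so the data does not need to be pre-ordered.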
Advantages of Median
✅ Robust to outliers: a few extreme values change it little or not at all.
✅ Useful when data is skewed (e.g., income levels).
✅ Works for both numerical and ordinal data (e.g., ratings, rankings).
Disadvantages of Median
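❌ Depends only on the order of the values, so it ignores how far the other values lie from the center.
❌ Harder to use in further calculations (e.g., the medians of subgroups cannot be combined into the median of the whole dataset).
❌ With an even number of observations, the median is the average of the two middle values and may not be a value that appears in the dataset.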
3. Mode
Definition
The mode is the most frequently occurring value in a dataset. Unlike the mean, which is computed arithmetically, and the median, which depends on ordering the data, the mode is found simply by counting how often each value occurs.
Types of Mode
1. Unimodal Distribution
- A dataset with one mode.
- Example: 3, 5, 7, 7, 8, 10 → Mode = 7 (appears twice).
2. Bimodal Distribution
- A dataset with two modes.
- Example: 2, 4, 6, 6, 8, 8, 10 → Modes = 6 and 8.
3. Multimodal Distribution
- A dataset with more than two modes.
- Example: 1, 2, 2, 3, 3, 4, 4 → Modes = 2, 3, and 4.
4. No Mode
- When no value repeats in the dataset.
- Example: 2, 5, 8, 11, 14 → No mode.
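The four cases above can be told apart by counting frequencies. The following is a minimal sketch, assuming the convention used in this section that a dataset in which no value repeats has no mode; the function name `describe_modes` is hypothetical. Note that the standard library's `statistics.multimode` follows a different convention and returns every value when none repeats.

```python
from collections import Counter


def describe_modes(values):
    """Return (modes, label) using the 'no repeats means no mode' convention."""
    counts = Counter(values)
    if not counts:
        raise ValueError("dataset is empty")
    top = max(counts.values())
    if top == 1:
        # No value repeats, so by this section's convention there is no mode
        return [], "no mode"
    modes = sorted(v for v, c in counts.items() if c == top)
    label = {1: "unimodal", 2: "bimodal"}.get(len(modes), "multimodal")
    return modes, label


print(describe_modes([3, 5, 7, 7, 8, 10]))     # ([7], 'unimodal')
print(describe_modes([2, 4, 6, 6, 8, 8, 10]))  # ([6, 8], 'bimodal')
print(describe_modes([1, 2, 2, 3, 3, 4, 4]))   # ([2, 3, 4], 'multimodal')
print(describe_modes([2, 5, 8, 11, 14]))       # ([], 'no mode')
```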
Example of Mode
Dataset: 5, 7, 9, 9, 10, 11, 9, 12, 7, 7
Step 1: Count the frequency of each value
- 5 appears once
- 7 appears three times
- 9 appears three times
- 10, 11, 12 appear once
Step 2: Identify the most frequent value
- Modes: 7 and 9 (bimodal dataset).
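The two steps above map directly onto Python's standard library. In this sketch, `collections.Counter` does the frequency count for Step 1 and `statistics.multimode` (Python 3.8+) returns all of the most frequent values for Step 2; the variable names are illustrative only.

```python
from collections import Counter
from statistics import multimode

data = [5, 7, 9, 9, 10, 11, 9, 12, 7, 7]

# Step 1: count the frequency of each value
counts = Counter(data)
print(counts[5], counts[7], counts[9])  # 1 3 3

# Step 2: identify the most frequent value(s)
modes = multimode(data)
print(modes)  # [7, 9]
```

`multimode` lists the modes in the order they are first encountered in the data, which is why 7 comes before 9 here.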
Advantages of Mode
✅ Can be used for categorical data (e.g., most popular brand, most chosen survey option).
✅ Not affected by extreme values.
✅ Works well for non-numeric data.
Disadvantages of Mode
❌ Sometimes, no mode exists.
❌ Cannot be used in further mathematical calculations the way the mean can.
❌ May not accurately represent the data if multiple modes exist.
Comparison: Mean vs. Median vs. Mode
Feature | Mean | Median | Mode |
---|---|---|---|
Definition | Sum of all values divided by count | Middle value of an ordered dataset | Most frequently occurring value |
Affected by Outliers? | Yes | No | No |
Best for? | Symmetrical data | Skewed data | Categorical data |
Used in? | Roughly symmetric continuous data (test scores, temperature) | Skewed data (house prices, incomes) | Surveys, categorical data |
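To make the "Affected by Outliers?" row concrete, the sketch below compares all three measures before and after adding one extreme value. The salary figures (in thousands) are hypothetical numbers invented for this illustration, and `summarize` is a helper name introduced here.

```python
from statistics import mean, median, mode


def summarize(data):
    """Return (mean, median, mode), with the mean rounded to 2 decimals."""
    return round(mean(data), 2), median(data), mode(data)


# Hypothetical salaries in thousands (illustrative values only)
salaries = [30, 30, 35, 40, 45]
with_outlier = salaries + [500]

before = summarize(salaries)
after = summarize(with_outlier)

print(before)  # (36, 35, 30)
print(after)   # (113.33, 37.5, 30)
```

A single extreme value roughly triples the mean, while the median shifts only slightly and the mode does not change.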