Statistics is a branch of mathematics that helps us understand and analyze data by summarizing large sets of information into a few meaningful values. When you collect numbers, such as marks scored in exams, daily expenses in INR, heights of students in centimeters, or ages of people, it can be difficult to interpret the entire collection directly. To make sense of such data, we use statistical values that summarize it. Among these, the most important are the mean, median, and mode. Each describes the 'center' or typical value of the dataset, so together they are called measures of central tendency.
Understanding and calculating these measures is crucial for many competitive exams as they frequently appear in questions involving data interpretation, analysis, and averages related to metric measurements, currency (INR), or other numerical contexts.
The mean, commonly known as the average, is the sum of all the data points divided by the number of data points. It gives us a single value representing the overall level of the data.
For example, if a student spends INR 100, 150, 120, 130, 110, 90, and 140 over seven days respectively, the mean expense tells us the typical daily spending.
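As a quick check of the definition, the short Python sketch below computes the mean of the seven daily expenses from the example. The variable names are chosen here purely for illustration.

```python
# Daily expenses in INR, taken from the example above.
expenses = [100, 150, 120, 130, 110, 90, 140]

# Mean = sum of all data points / number of data points.
total = sum(expenses)          # 840
count = len(expenses)          # 7
mean_expense = total / count

print(mean_expense)  # 120.0
```

The typical daily spending works out to INR 120.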
Why is mean useful? It tells us what the 'typical' or 'central' value is among all those numbers, making it easier to understand the data as a whole rather than individually.
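There are two main types of mean you should know: the simple arithmetic mean, \(\bar{x} = \frac{\sum x_i}{n}\), and the weighted mean, \(\bar{x} = \frac{\sum w_i x_i}{\sum w_i}\), which is used when data points carry different weights.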
The median is the middle value in a data set arranged in ascending order. It divides the data into two equal halves. If the total number of data points is odd, the median is the exact center element. If even, it is the average of the two middle elements.
Why median matters: The median provides a better measure of central tendency when data contains extreme values or outliers. For example, if one person earns much more than others, the mean might be pulled higher, but the median reflects the middle income more fairly.
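The odd and even cases, and the effect of an outlier, can be sketched in Python as follows. The `median` helper and the income figures are hypothetical, made up here only to illustrate the point, and the helper assumes a non-empty list.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: exact center element
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average of the two middle elements

# Hypothetical monthly incomes in INR; the last value is an outlier.
incomes = [20000, 22000, 25000, 27000, 30000, 500000]

mean_income = sum(incomes) / len(incomes)
median_income = median(incomes)

print(mean_income)    # 104000.0
print(median_income)  # 26000.0
```

The single large income pulls the mean above every other value in the list, while the median stays close to what most people earn.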
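The mode of a data set is the value that occurs most frequently. A data set can have one mode (unimodal), two modes (bimodal), more than two modes (multimodal), or no mode at all when every value occurs equally often.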
For example, if the test scores of students are 60, 70, 70, 80, 90, 90, and 90, the mode is 90 because it appears most frequently.
| Score | Frequency |
|---|---|
| 60 | 1 |
| 70 | 2 |
| 80 | 1 |
| 90 | 3 |
Why mode is useful: It helps identify the most common or popular item in a data set, which can be important in fields like marketing, quality control, and more.
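The frequency table for the test scores can be built programmatically. This is a minimal sketch using `collections.Counter` from the Python standard library; it reports only one most frequent value, so it assumes the data has a single mode.

```python
from collections import Counter

# Test scores from the example above.
scores = [60, 70, 70, 80, 90, 90, 90]

# Counter maps each score to the number of times it occurs.
frequency = Counter(scores)

# most_common(1) returns a list with the single (value, count) pair of highest count.
mode_value, mode_count = frequency.most_common(1)[0]

print(dict(frequency))         # {60: 1, 70: 2, 80: 1, 90: 3}
print(mode_value, mode_count)  # 90 3
```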
Mean is the arithmetic average, best for data without extreme values.
Median is the middle value, robust against outliers.
Mode is the most frequent value, useful to find popularity or commonness.
Step 1: Add all expense values:
120 + 150 + 100 + 130 + 110 + 90 + 140 = 840
Step 2: Count the number of days (data points), \(n = 7\).
Step 3: Calculate the mean:
\(\bar{x} = \frac{840}{7} = 120\)
Answer: The mean daily expenditure is INR 120.
Step 1: Arrange heights in ascending order:
145, 150, 152, 155, 158, 160, 165
Step 2: Since the number of terms \(n = 7\) (odd), median position is at \(\frac{n+1}{2} = \frac{8}{2} = 4\).
Step 3: The 4th value in the ordered list is 155.
Answer: The median height is 155 cm.
Step 1: Calculate frequency of each score:
| Score | Frequency |
|---|---|
| 60 | 1 |
| 75 | 3 |
| 80 | 2 |
| 85 | 1 |
| 90 | 3 |
Step 2: Identify highest frequency: Both 75 and 90 appear 3 times each.
Answer: The data is bimodal with modes 75 and 90.
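When several values share the highest frequency, every one of them should be reported. The sketch below rebuilds the raw scores from the frequency table and uses `statistics.multimode`, which is available in Python 3.8 and later. The ordering of the rebuilt list is an assumption, since only the frequencies are given.

```python
from statistics import multimode

# Scores rebuilt from the frequency table: 60 once, 75 three times,
# 80 twice, 85 once, 90 three times.
scores = [60] + [75] * 3 + [80] * 2 + [85] + [90] * 3

# multimode returns every value that shares the highest frequency.
modes = multimode(scores)

print(modes)  # [75, 90]
```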
Step 1: Identify the weights and prices:
Weight 1 \(w_1 = 30\) kg, price 1 \(x_1 = 500\) INR/kg
Weight 2 \(w_2 = 20\) kg, price 2 \(x_2 = 600\) INR/kg
Step 2: Calculate weighted mean:
\[ \bar{x} = \frac{w_1 x_1 + w_2 x_2}{w_1 + w_2} = \frac{30 \times 500 + 20 \times 600}{30 + 20} = \frac{15000 + 12000}{50} = \frac{27000}{50} = 540 \]
Answer: The average price per kg of the mixture is INR 540.
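The same calculation can be written as a small reusable function. The name `weighted_mean` is a choice made for this sketch, and it assumes the weights sum to a positive number.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    total_weight = sum(weights)
    weighted_sum = sum(w * x for w, x in zip(weights, values))
    return weighted_sum / total_weight

prices = [500, 600]    # INR per kg
weights_kg = [30, 20]  # quantity of each variety in kg

avg_price = weighted_mean(prices, weights_kg)
print(avg_price)  # 540.0
```

The result lies closer to 500 than to 600 because the cheaper variety makes up more of the mixture.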
| Age Group (years) | Frequency |
|---|---|
| 20 - 29 | 8 |
| 30 - 39 | 12 |
| 40 - 49 | 15 |
| 50 - 59 | 10 |
| 60 - 69 | 5 |
Step 1: Find total frequency \(N = 50\).
Step 2: Calculate cumulative frequencies:
| Age Group | Frequency (f) | Cumulative Frequency (CF) |
|---|---|---|
| 20 - 29 | 8 | 8 |
| 30 - 39 | 12 | 20 |
| 40 - 49 | 15 | 35 |
| 50 - 59 | 10 | 45 |
| 60 - 69 | 5 | 50 |
Step 3: Find median class; half of \(N\) is \(\frac{50}{2} = 25\).
Cumulative frequency just greater than or equal to 25 is 35, corresponding to the class 40-49 (median class).
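Set the variables: lower limit of the median class \(L = 40\), cumulative frequency before the median class \(CF = 20\), frequency of the median class \(f = 15\), and class width \(h = 10\). The formula is \(\text{Median} = L + \left( \frac{N/2 - CF}{f} \right) \times h\).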
Step 4: Apply formula:
\[ \text{Median} = 40 + \left( \frac{25 - 20}{15} \right) \times 10 = 40 + \left( \frac{5}{15} \right) \times 10 = 40 + \frac{50}{15} \approx 40 + 3.33 = 43.33 \]
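Steps 1 to 4 can be sketched as one Python function. The name `grouped_median` and its parameters are illustrative. The sketch follows the worked example in using the stated lower class limit (40, not 39.5) and a common class width of 10.

```python
def grouped_median(lower_limits, frequencies, width):
    """Estimate the median of grouped data by interpolating inside the median class."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half = total / 2
    cumulative = 0  # cumulative frequency before the current class
    for lower, freq in zip(lower_limits, frequencies):
        # The median class is the first class whose cumulative frequency reaches N/2.
        if cumulative + freq >= half:
            return lower + (half - cumulative) / freq * width
        cumulative += freq

lower_limits = [20, 30, 40, 50, 60]
frequencies = [8, 12, 15, 10, 5]

median_age = grouped_median(lower_limits, frequencies, width=10)
print(round(median_age, 2))  # 43.33
```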
Step 5: Identify modal class (class with highest frequency), which is 40-49 with frequency 15.
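Neighbor frequencies: the class before the modal class (30-39) has frequency \(f_0 = 12\), the class after it (50-59) has frequency \(f_2 = 10\), and the modal class itself has \(f_1 = 15\). With lower limit \(L = 40\) and class width \(h = 10\), the formula is \(\text{Mode} = L + \left( \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \right) \times h\).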
Step 6: Apply mode formula:
\[ \text{Mode} = 40 + \left( \frac{15 - 12}{2 \times 15 - 12 - 10} \right) \times 10 = 40 + \left( \frac{3}{30 - 22} \right) \times 10 = 40 + \left( \frac{3}{8} \right) \times 10 = 40 + 3.75 = 43.75 \]
Answer: The estimated median age is approximately 43.33 years, and the mode age is approximately 43.75 years.
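The modal estimate from Steps 5 and 6 can be sketched in the same way. The function name `grouped_mode` is illustrative. The sketch assumes a single modal class, treats a missing neighbor class as frequency 0, and assumes the denominator \(2 f_1 - f_0 - f_2\) is not zero.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Estimate the mode of grouped data from the modal class and its two neighbors."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                     # class before the modal class
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0  # class after the modal class
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width

lower_limits = [20, 30, 40, 50, 60]
frequencies = [8, 12, 15, 10, 5]

mode_age = grouped_mode(lower_limits, frequencies, width=10)
print(mode_age)  # 43.75
```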
When to use: During timed exams with odd number datasets.
When to use: When data has frequent repetitions and you need the most common value immediately.
When to use: Efficiency in solving grouped data problems.
When to use: Mixture problems or when data points have different impacts.
When to use: To avoid mistakes in data sets with repeated but equal frequencies.