Maths
Inter-quartile range, cumulative frequency, box and whisker plots - Higher
If you are studying the higher paper you will need to know the difference between discrete and continuous data, how to plot and interpret histograms, how to calculate inter-quartile ranges and cumulative frequencies, and how to draw box and whisker plots.
Raw data is the information we get when we do a survey. For example, we might have a list of heights or shoe sizes.
Data can either be discrete or continuous.
This data set shows a group of discrete data.
This is called discrete data because the units of measurement (for example, CDs) cannot be split up; there is nothing between 1 CD and 2 CDs.
| Music format | Number sold |
|---|---|
| CD albums | 140 |
| CD singles | 70 |
| Downloads | 55 |
| Vinyl | 5 |
| Total sales | 270 |

Shoe sizes are a classic example of discrete data, because sizes 39 and 40 mean something, but size 39.2, for example, does not.
This data set shows a group of continuous data.
This data is called continuous because the scale of measurement - distance - has meaning at all points between the numbers given. For example, we can travel a distance of 1.2 miles or 1.8 miles, or any distance in between.
| Distance in miles | 0.1 0.2 0.6 1.1 1.2 1.8 2.0 2.7 3.4 4.6 6.2 8.0 12.1 14.2 |
For each question, decide whether the data set is discrete or continuous.
The heights of pupils in class 3A.
Height is continuous. For example, a pupil could be 152.3cm.
The number of chocolates in various 500g boxes.
The number of chocolates is discrete. There would not be half chocolates in a box.

The times taken for athletes to run 100m.
Time is continuous. For example, an athlete may run 100m in 10.37 seconds.
It is often better to display data in a table. This section will look at different ways to organise data, and revise the following terms:
The table shows the number of people on 12 different buses. Discrete data is normally grouped in the following way:
| Number of people on a bus | 4-6 | 7-9 | 10-12 | 13-15 |
|---|---|---|---|---|
| Frequency | 2 | 7 | 2 | 1 |
From the table we can see that there were 7 buses with 7-9 people on them. But we have no way of telling exactly how many people were on each bus.
Now look at the class widths: they are all 3.

The midpoints of the classes are 5, 8, 11 and 14.
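As a quick check of the class widths and midpoints above, here is a minimal Python sketch. The variable names are illustrative only, and it assumes each class is written as its lowest and highest whole-number value.

```python
# Grouped discrete data from the bus table: (lowest value, highest value) per class.
classes = [(4, 6), (7, 9), (10, 12), (13, 15)]
frequencies = [2, 7, 2, 1]

# For whole-number data, the class 4-6 contains 4, 5 and 6, so its width is 3.
widths = [high - low + 1 for low, high in classes]

# The midpoint is halfway between the lowest and highest value in the class.
midpoints = [(low + high) / 2 for low, high in classes]

# Adding the frequencies gives the number of buses surveyed.
total_buses = sum(frequencies)

print(widths)       # [3, 3, 3, 3]
print(midpoints)    # [5.0, 8.0, 11.0, 14.0]
print(total_buses)  # 12
```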
There are many ways to represent continuous data in a table.
Example 1
This table shows the heights (h) of 25 people. The class widths are all 10.
| Height (h) | 120 ≤ h < 130 | 130 ≤ h < 140 | 140 ≤ h < 150 | 150 ≤ h < 160 |
|---|---|---|---|---|
| Frequency | 4 | 6 | 10 | 5 |

The class boundaries are 120, 130, 140, 150 and 160.
The midpoints of the classes are 125, 135, 145 and 155.
Example 2
| Length (cm) | 110-129 | 130-149 | 150-169 | 170-189 |
|---|---|---|---|---|
| Frequency | 5 | 3 | 1 | 1 |

Example 3
| Height (cm) | 2-4 | 5-7 | 8-10 | 11-14 |
|---|---|---|---|---|
| Frequency | 7 | 6 | 2 | 5 |

Example 4
| Age (years) | 21-30 | 31-40 | 41-50 | 51-60 |
|---|---|---|---|---|
| Frequency | 8 | 10 | 3 | 4 |
This table looks very similar to the table in example 2, and you might assume that the class boundaries are 20.5, 30.5 etc - but they are not! Remember that if you are 16 now, you will be 16 right up until your 17th birthday.
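The difference between the two kinds of class boundary can be sketched in Python. This is only an illustration: the function names are made up, and it assumes the lengths in example 2 were recorded to the nearest centimetre, while the ages in example 4 are counted in completed years.

```python
def boundaries_rounded(classes):
    """Boundaries for measurements recorded to the nearest whole unit (assumption)."""
    lower = [low - 0.5 for low, _ in classes]
    return lower + [classes[-1][1] + 0.5]


def boundaries_ages(classes):
    """Boundaries for ages in completed years: 21-30 runs from 21 up to the 31st birthday."""
    lower = [low for low, _ in classes]
    return lower + [classes[-1][1] + 1]


lengths = [(110, 129), (130, 149), (150, 169), (170, 189)]
ages = [(21, 30), (31, 40), (41, 50), (51, 60)]

print(boundaries_rounded(lengths))  # [109.5, 129.5, 149.5, 169.5, 189.5]
print(boundaries_ages(ages))        # [21, 31, 41, 51, 61]
```

With these boundaries, every age class in example 4 has a width of 10 years.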

The following table shows the ages of 25 children on a school bus:
| Age | Frequency |
|---|---|
| 5-10 | 6 |
| 11-15 | 15 |
| 16-17 | 4 |
| > 17 | 0 |
If we are going to draw a histogram to represent the data, we first need to find the class boundaries. In this case they are 5, 11, 16 and 18. The class widths are therefore 6, 5 and 2.
The area of a histogram represents the frequency.
The areas of our bars should therefore be 6, 15 and 4.

Remember that in a bar chart the height of the bar represents the frequency. It is therefore correct to label the vertical axis 'frequency'.
However, in a histogram, it is the area of the bar which represents the frequency.
It would therefore be incorrect to label the vertical axis 'frequency' and the label should be 'frequency density'.
Frequency density = frequency ÷ class width
Apply this formula to the following question.
The ages of children entering a theme park in a 1-hour period are recorded in the table:
| Age | Frequency |
|---|---|
| 0-3 | 12 |
| 4-10 | 14 |
| 11-18 | 48 |
| >18 | 0 |
Find the class widths and frequency densities. Then draw a histogram to represent the data.
Frequency densities:
12/4 = 3
14/7 = 2
48/8 = 6
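The same working can be written as a short Python sketch. The function name is illustrative, and the class boundaries are taken to be 0, 4, 11 and 19, because ages are counted in completed years.

```python
def frequency_densities(boundaries, frequencies):
    """Return the class widths and the frequency densities (frequency / class width)."""
    widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]
    densities = [f / w for f, w in zip(frequencies, widths)]
    return widths, densities


# Theme park example: ages 0-3, 4-10 and 11-18.
park_widths, park_densities = frequency_densities([0, 4, 11, 19], [12, 14, 48])
print(park_widths)     # [4, 7, 8]
print(park_densities)  # [3.0, 2.0, 6.0]

# School bus example: ages 5-10, 11-15 and 16-17.
bus_widths, bus_densities = frequency_densities([5, 11, 16, 18], [6, 15, 4])
print(bus_widths)      # [6, 5, 2]
print(bus_densities)   # [1.0, 3.0, 2.0]
```

Multiplying each frequency density by its class width gives back the frequency, which is why the area of each bar represents the frequency.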
The histogram should look like this:

We know that the median divides the data into two halves. We also know that for a set of n ordered numbers the median is the (n + 1) ÷ 2 th value.
Similarly, the lower quartile divides the bottom half of the data into two halves, and the upper quartile also divides the upper half of the data into two halves.
Lower quartile is the (n + 1) ÷ 4 th value.
Upper quartile is the 3 (n + 1) ÷ 4 th value.

Find the median, lower quartile and upper quartile for the following data:
11, 4, 6, 8, 3, 10, 8, 10, 4, 12 and 31.
Ordering the data, we get 3, 4, 4, 6, 8, 8, 10, 10, 11, 12 and 31.
The median is the (11 + 1) ÷ 2 = 6th value.
The lower quartile is the (11 + 1) ÷ 4 = 3rd value.
The upper quartile is the 3 (11 + 1) ÷ 4 = 9th value.
Therefore, the median is 8, the lower quartile is 4, and the upper quartile is 11.
3, 4, 4, 6, 8, 8, 10, 10, 11, 12, 31
The interquartile range is the difference between the upper quartile and lower quartile.
In this example, the interquartile range is 11 - 4 = 7.
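The position rules above can be tried out with a small Python sketch. The helper names are illustrative. When a position is not a whole number, this sketch interpolates between the two neighbouring values, which is an assumption rather than something stated in this guide.

```python
def value_at_position(ordered, position):
    """Return the value at a 1-based position, interpolating if it is fractional."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    below, above = ordered[whole - 1], ordered[whole]
    return below + fraction * (above - below)


def quartiles(data):
    """Return (lower quartile, median, upper quartile) using the (n + 1) rules."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch needs at least three values")
    lower = value_at_position(ordered, (n + 1) / 4)
    median = value_at_position(ordered, (n + 1) / 2)
    upper = value_at_position(ordered, 3 * (n + 1) / 4)
    return lower, median, upper


data = [11, 4, 6, 8, 3, 10, 8, 10, 4, 12, 31]
lower, median, upper = quartiles(data)
print(lower, median, upper)  # 4 8 11
print(upper - lower)         # 7
```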

A survey was carried out to find the number of pets owned by each child in a class.
The results are shown in the table:
| Number of pets | Frequency |
|---|---|
| 0 | 3 |
| 1 | 5 |
| 2 | 2 |
| 3 | 7 |
| 4 | 10 |
| 5 | 3 |
| 6 | 1 |
| >6 | 0 |
Find the interquartile range.
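The interquartile range is 3.
Remember that there is a total of 31 children in the class. The lower quartile is the (31 + 1) ÷ 4 = 8th value, which is 1, and the upper quartile is the 3 (31 + 1) ÷ 4 = 24th value, which is 4. The interquartile range is therefore 4 - 1 = 3.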
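One way to check this answer is to walk through the running totals of the frequency table, as in the Python sketch below. The names are illustrative, and the sketch assumes the quartile positions are whole numbers, which they are when there are 31 children.

```python
from itertools import accumulate

pets = [0, 1, 2, 3, 4, 5, 6]
frequencies = [3, 5, 2, 7, 10, 3, 1]

# Running totals: how many children own this many pets or fewer.
cumulative = list(accumulate(frequencies))  # [3, 8, 10, 17, 27, 30, 31]
n = cumulative[-1]                          # 31 children in total


def value_at(position):
    """Return the number of pets owned by the child at this 1-based position."""
    for value, running_total in zip(pets, cumulative):
        if position <= running_total:
            return value
    raise ValueError("position is beyond the end of the data")


lower_quartile = value_at((n + 1) // 4)      # 8th value
upper_quartile = value_at(3 * (n + 1) // 4)  # 24th value

print(lower_quartile, upper_quartile)   # 1 4
print(upper_quartile - lower_quartile)  # 3
```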
Note that the interquartile range ignores extreme values. The range includes extreme values.
In cases such as these, it is often preferable to use the interquartile range when comparing the data.
The cumulative frequency is obtained by adding up the frequencies as you go along, to give a 'running total'.
The table shows the lengths (in cm) of 32 cucumbers.
Before drawing the cumulative frequency diagram, we need to work out the cumulative frequencies. This is done by adding the frequencies in turn.
| Length | Frequency | Cumulative Frequency |
|---|---|---|
| 21-24 | 3 | 3 |
| 25-28 | 7 | 10 (= 3 + 7) |
| 29-32 | 12 | 22 (= 3 + 7 + 12) |
| 33-36 | 6 | 28 (= 3 + 7 + 12 + 6) |
| 37-40 | 4 | 32 (= 3 + 7 + 12 + 6 + 4) |
The points are plotted at the upper class boundary. In this example, the upper class boundaries are 24.5, 28.5, 32.5, 36.5 and 40.5. Cumulative frequency is plotted on the vertical axis.
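The running totals and the points to plot can be produced with a short Python sketch. The variable names are illustrative, and it assumes the lengths were recorded to the nearest centimetre, so each upper class boundary is 0.5 above the highest value in the class.

```python
from itertools import accumulate

classes = [(21, 24), (25, 28), (29, 32), (33, 36), (37, 40)]
frequencies = [3, 7, 12, 6, 4]

# Add the frequencies in turn to get the running total.
cumulative = list(accumulate(frequencies))

# Each point is plotted at the upper class boundary of its class.
upper_boundaries = [high + 0.5 for _, high in classes]
points = list(zip(upper_boundaries, cumulative))

print(cumulative)  # [3, 10, 22, 28, 32]
print(points)      # [(24.5, 3), (28.5, 10), (32.5, 22), (36.5, 28), (40.5, 32)]
```

The last cumulative frequency should always equal the total number of items, here 32 cucumbers.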

There are no values below 20.5cm.
Cumulative frequency graphs are always plotted using the highest value in each group of data, and the cumulative frequency is always plotted up a graph, never across.
Cumulative frequency diagrams usually have this characteristic S-shape.
When looking at a cumulative frequency curve, you will need to know how to find its median, lower and upper quartiles, and the interquartile range.
By drawing horizontal lines to represent 1/4 of the total frequency, 1/2 of the total frequency and 3/4 of the total frequency, we can read estimates of the lower quartile, median and upper quartile from the horizontal axis.

Quartiles are associated with quarters. The interquartile range is the difference between the lower and upper quartile.
From these values, we can also estimate the interquartile range: 34 - 28 = 6
Remember to use the total frequency, not the maximum value, on the vertical axis. The values are always read from the horizontal axis.
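Reading across from the vertical axis can be imitated in Python. This sketch uses the cucumber data and joins the plotted points with straight lines rather than a smooth curve, so its estimates will not exactly match values read from a hand-drawn curve. The function name is illustrative.

```python
# (upper class boundary, cumulative frequency), starting where the count is zero.
points = [(20.5, 0), (24.5, 3), (28.5, 10), (32.5, 22), (36.5, 28), (40.5, 32)]


def read_across(points, target):
    """Estimate the length at which the cumulative frequency reaches the target."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1:
            # Move along the straight line joining the two plotted points.
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target is outside the range of the cumulative frequencies")


total = points[-1][1]  # total frequency, 32
lower_quartile = read_across(points, total / 4)
median = read_across(points, total / 2)
upper_quartile = read_across(points, 3 * total / 4)

print(round(lower_quartile, 1), round(median, 1), round(upper_quartile, 1))
print(round(upper_quartile - lower_quartile, 1))
```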
A box and whisker plot is used to display information about the range, the median and the quartiles. It is usually drawn alongside a number line, as shown:

The oldest person in Mathsminster is 90. The youngest person is 15.
The median age of the residents is 44, the lower quartile is 25, and the upper quartile is 67.
Represent this information with a box-and-whisker plot.
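The five values in this question are all that a box and whisker plot needs. The Python sketch below works out the range and interquartile range, then draws a rough one-line text version of the plot. The function is purely illustrative and assumes all five values are whole numbers.

```python
minimum, lower_quartile, median, upper_quartile, maximum = 15, 25, 44, 67, 90

data_range = maximum - minimum                         # length of the whole plot
interquartile_range = upper_quartile - lower_quartile  # length of the box


def text_box_plot(minimum, lq, median, uq, maximum):
    """Draw a rough box plot using one character per whole unit."""
    line = []
    for x in range(minimum, maximum + 1):
        if x in (minimum, median, maximum):
            line.append("|")  # ends of the whiskers and the median line
        elif x in (lq, uq):
            line.append("#")  # edges of the box
        elif lq < x < uq:
            line.append("=")  # inside the box
        else:
            line.append("-")  # whiskers
    return "".join(line)


plot = text_box_plot(minimum, lower_quartile, median, upper_quartile, maximum)
print(data_range, interquartile_range)  # 75 42
print(plot)
```

The box covers the middle half of the residents' ages, and the whiskers reach out to the youngest and oldest residents.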

This table shows the number of visitors to a seaside town:
| Quarter | 2005 Q1 | 2005 Q2 | 2005 Q3 | 2005 Q4 | 2006 Q1 | 2006 Q2 | 2006 Q3 | 2006 Q4 | 2007 Q1 | 2007 Q2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Visitors (0000) | 14 | 24 | 9 | 8 | 12 | 22 | 11 | 7 | 11 | 20 |
If this information is plotted on a graph, it looks like this:

This shows that there is a wide variation in the number of visitors depending on the season. There are far fewer in the autumn and winter than in the spring and summer.
However, if we wanted to see a trend in the number of visitors, we could calculate a 4-point moving average.
We do this by finding the average number of visitors in the four quarters of 2005:
(14 + 24 + 9 + 8) ÷ 4 = 13.75
Then we find the average number of visitors in the last three quarters of 2005 and first quarter of 2006:
(24 + 9 + 8 + 12) ÷ 4 = 13.25
Then the last two quarters of 2005 and the first two quarters of 2006:
(9 + 8 + 12 + 22) ÷ 4 = 12.75
And so on…
Note that the last average we can find is for the last two quarters of 2006 and the first two quarters of 2007.
We plot the moving averages on a graph, making sure that each average is plotted at the centre of the four quarters it covers:

We can now see that there is a very slight downward trend in visitors.
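The full set of 4-point moving averages can be worked out with a short Python sketch. The function name is illustrative, and the visitor figures are assumed to run in order from the first quarter of 2005 to the second quarter of 2007.

```python
visitors = [14, 24, 9, 8, 12, 22, 11, 7, 11, 20]


def moving_average(values, window=4):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


averages = moving_average(visitors)

# Each average is plotted at the centre of its four quarters.
# Counting the first quarter as position 1, the first centre is at 2.5.
centres = [i + 2.5 for i in range(len(averages))]

print(averages)  # [13.75, 13.25, 12.75, 13.25, 13.0, 12.75, 12.25]
print(centres)   # [2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
```

Ten quarters give seven moving averages, and the values drift from 13.75 down to 12.25, which matches the slight downward trend described above.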