Why are trails and voice overs so loud?
John Heraty outlines the challenge of controlling loudness while retaining the dramatic impact of a programme
With a few exceptions, such as Formula 1 racing and The Voice, BBC One TV programmes are largely quiet by comparison to BBC trailers. Programmes also vary in their average loudness when compared to each other, which means the audience often have to manually alter their volume control whenever one programme finishes and another starts.
The need for new technology
The problem has many causes, but at the root of it is the way we measure audio. The broadcast industry in the UK has historically used the PPM (Peak Programme Meter), a small black-faced meter with green and red (and white and yellow) needles that was introduced into studios around 80 years ago. These meters were designed for monitoring the feed to AM radio transmitters and, later on, the AM sound that went with 405-line TV.
They indicate the audio’s quasi-peak level (nearly, not-quite-right-but-near-enough), and their slow decay time allows the eye to track the needle, giving the operator a clue to how close they are to overload and potential distortion.
However, the PPM does not indicate how loud we perceive sound to be, nor does it give a totally accurate reading of how close to overload a signal source is. When FM, with its greater audio frequency range, went on air 60 years ago, PPM technology would ideally have been updated or replaced, but there was simply nothing that could take its place. The job of the SM (studio manager) in a studio, the sound recordist on location, the mix engineer in the dubbing theatre and the self-op DJ is twofold. They must mix audio sources so that the loudness of each source is appropriately placed in the overall programme mix, and also ensure that at no point does this mix overload the signal chain between the studio and the listener’s receiver.
The operator in presentation or continuity has to ensure that each programme’s and trail’s loudness is the same and keep the signal path free from distortion by avoiding overload.
The digital switchover and the removal of analogue TV services have shown the limitations of using the PPM. Using HE-AAC+ on Freeview HD, MPEG-1 Layer 2 on Freeview SD and Dolby Digital on Freesat, the BBC can now send audio with enormous dynamic range to home TVs - but herein lies the problem.
From programme to programme, producers are exploiting the large dynamic range by placing the average loudness of a programme very low down the available level range, thus allowing loud dramatic events all the room they need without distorting the large signal chain.
This is a wonderful approach as it adds drama and impact. However, other programme makers are taking a different approach: they are reducing the dynamic range and moving the average loudness very close to the peak allowable signal, thus producing a very in-your-face, in-the-midst-of-the-action feel to their programmes.
Both options have their benefits, but also a major disadvantage. The graph below shows a Friday night schedule on BBC One from April last year. The black line indicates the average loudness, which should ideally be straight without breaks. Each time it breaks, the viewer perceives a change in volume and picks up the remote control to adjust the volume. It is obvious from the graph that each programme, insert, and opt-out has a different loudness.
If all programmes were made to the same target loudness, this would allow all shows to have the style of sound they want – and would also address around 40% of the BBC’s current complaints.
To remove the need for viewers to pick up the remote and adjust the volume, broadcasters can use a loudness meter for the programme mix. Loudness measurement takes an average of the signal’s energy content, rather than the peak voltage as the PPM does. The average measurements can be taken over short time periods for assessing individual contributions to a mix, but also over much longer time periods that can be used for assessing trailers, whole programmes or entire schedules.
Loudness is subtly different to the area of audibility (which accounts for the remaining 60% of our audio complaints) where the clarity of dialogue, the levels of sound mixes and background sounds need to be considered, though there may be some slight link between loudness and audibility.
For the past few years, technical standards organisations around the world (including the European Broadcasting Union) have been working on a series of documents which describe the principles of loudness measurement, as opposed to peak measurement, and how to use loudness and true peak measurement in productions. The EBU’s documents are published under the banner “Technical Position R.128”. As of October 2014, the UK’s Digital Production Partnership requires all UK TV productions to comply with the R.128 recommendations.
Governments and broadcasters across the world are addressing this problem with legislation. In the USA the law is called the CALM Act; France and Spain have also passed laws for the control of all their broadcast channels’ loudness, using the technical and operational framework of R.128. Germany, Switzerland, Austria and Norway have also voluntarily implemented R.128 recommendations across all TV outlets. Austria recently announced that they have reduced their loudness complaints to…ZERO. We too in the UK favour the self-regulatory approach.
New units are used to describe loudness, which are referenced to the familiar dBFS system (dB Full Scale) where 0dBFS equals all bits within the sample effectively set to 1.
However, because of the frequency pre-emphasis applied to the input, and the averaging carried out by the meter, the new units cannot be directly correlated to dBFS unless the signal measured is tone. So when working with loudness meters, broadcasters should not try to equate their readings of programme audio to a PPM reading.
The units are called LUFS, or Loudness Units relative to Full Scale. The loudness meter is designed to read 0LUFS on a stereo 1kHz tone that peaks at 0dBFS. It’s also the case that a change in gain of 1dB applied to the signal will give a change in loudness of 1LU (of course, this relationship between LUFS and dBFS is simple when working with tone, and don’t forget that the integration period and pre-emphasis of the meter have a big influence on the meter reading).
Below is a comparison of the scales and units used.
The R.128 specification requires a target loudness of -23LUFS. This is a bit awkward to describe, and it is going to be the new equivalent of PPM4, so there is an alternative scale and unit called the LU, or Loudness Unit. Zero LU equals the target loudness (-23LUFS in this case), and values in LU above or below zero are added to or subtracted from -23LUFS. For example, -3LU equals -26LUFS.
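The arithmetic between the two scales can be sketched in a few lines of Python. The function names are purely illustrative; the only figures taken from the text are the -23LUFS target and the 1dB-to-1LU relationship.

```python
TARGET_LUFS = -23.0  # R.128 target loudness, as quoted in the text


def lu_to_lufs(lu, target=TARGET_LUFS):
    """Convert a relative reading in LU to an absolute reading in LUFS."""
    return target + lu


def lufs_to_lu(lufs, target=TARGET_LUFS):
    """Convert an absolute reading in LUFS to LU relative to the target."""
    return lufs - target


def gain_to_reach_target(measured_lufs, target=TARGET_LUFS):
    """Gain in dB needed to hit the target (1 dB of gain moves loudness by 1 LU)."""
    return target - measured_lufs


print(lu_to_lufs(-3.0))             # -26.0
print(lufs_to_lu(-20.0))            # 3.0
print(gain_to_reach_target(-20.0))  # -3.0
```

So a programme measured at -20LUFS sits at +3LU and would need 3dB of attenuation to meet the target.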
How the new meters work
The human ear takes very little time to react to changes in amplitude, but the brain does not register a change in “loudness” unless the amplitude change persists for at least 400ms. The ear-brain combination does not register all frequencies in the audible spectrum equally. For example, low-frequency bass requires more energy at the ear drum if it is to be perceived as being equally loud as mid-range speech frequencies. Because of this phenomenon, the input signal to the loudness meter has its spectrum pre-emphasised to broadly match the spectral response of the ear-brain combination.
The filter response curve is shown above and is given the label “K” (not to be confused with the A-weighting filter used for noise measurements).
The PPM and the VU meter both convert the input signal to a DC value by rectification. This is peak detection, and in both cases the DC voltage is then smoothed, either by electronics or a combination of electronics and the physical mechanical characteristics of the meter (in particular the PPM). The ear measures energy rather than peaks, so the loudness meter uses RMS (Root Mean Square) detection to determine the signal’s amplitude.
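The difference between peak and RMS detection can be illustrated with a small Python sketch. It is not a loudness meter: the K pre-emphasis filter is left out, and the function names and test signals are assumptions made for the example. Levels are expressed relative to a sample value of 1.0, so the RMS of a full-scale sine reads about 3dB below its peak.

```python
import math


def peak_dbfs(samples):
    """Level of the largest absolute sample value, in dB relative to 1.0."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    """Root-mean-square (energy) level over the whole list, in dB relative to 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


RATE = 48000  # samples per second
# One second of full-scale 1 kHz tone.
tone = [math.sin(2 * math.pi * 1000 * n / RATE) for n in range(RATE)]
# The same tone, but only present for the first tenth of a second.
burst = [s if n < RATE // 10 else 0.0 for n, s in enumerate(tone)]

for name, signal in (("tone", tone), ("burst", burst)):
    print(name, round(peak_dbfs(signal), 2), round(rms_dbfs(signal), 2))
```

Both signals reach the same peak, but the burst carries a tenth of the energy, so its RMS level is about 10dB lower. A peak-reading meter would treat the two as equal; an energy-based meter would not.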
Next follows the integrator, which is where CALM and R.128 begin to differ in their approaches.
R.128, which will be in use in all UK productions by October 2014, specifies three integration times. The integrator can be set to one of three values: M, S and I. M (momentary) integrates the RMS signal over a 400ms period (the average human reaction time when applied to loudness changes). S (short-term) uses a 3-second integration period, which can be used for monitoring during rehearsals or, if live, is a good indicator of whether or not the audio is in the target area. Finally, I (integrated, sometimes described as infinite or gated) measures over the whole programme and uses the gate to exclude quiet background passages from the measurement.
The longer “I” integration period is produced by using the 400ms period and then starting and stopping the integration under the control of the gate. Shorter non-gated periods are used for initial checking of voices and spot effects for setting loudness in a mix. EBU-based R.128 loudness meters are gated, whereas American CALM-based loudness meters are non-gated.
The gate stops and starts the 400ms integration process under the control of the input level to the integrator. First, any 400ms blocks that are below -70LUFS are simply ignored. The average loudness of blocks that are above -70LUFS is then calculated. A second average is then calculated, ignoring blocks that are more than 10LU below the first average. The first threshold closes the gate on digital silence and low-level noise from analogue circuits to prevent them from confusing the measurement. The second threshold, the -10LU one, closes the gate on ‘background’ as opposed to ‘foreground’ sound. It’s the foreground sound, not the sound of the wind in the trees, or a faint music bed, that people use to set their volume control, and the -10LU threshold blocks the background from the measurement.
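As a rough sketch of how the M and S readings relate, the Python below slides a 400ms window and a 3-second window over a list of power values. Several things here are assumptions for illustration: the input is taken to be the mean-square power of successive 100ms slices of already pre-emphasised audio, the 100ms step is a choice made for the example, and the output is a plain level in dB rather than a calibrated LUFS reading.

```python
import math

HOP_SECONDS = 0.1  # ASSUMPTION: one power value per 100 ms slice of audio


def sliding_level(powers, window_seconds, hop_seconds=HOP_SECONDS):
    """Average the power over a sliding window and return one dB reading per step."""
    n = round(window_seconds / hop_seconds)
    readings = []
    for end in range(n, len(powers) + 1):
        mean_power = sum(powers[end - n:end]) / n
        readings.append(10 * math.log10(mean_power) if mean_power > 0 else float("-inf"))
    return readings


# Five seconds of quiet material (-40 dB) followed by one second of loud (-20 dB).
powers = [1e-4] * 50 + [1e-2] * 10

momentary = sliding_level(powers, 0.4)   # "M": 400 ms window
short_term = sliding_level(powers, 3.0)  # "S": 3 second window

print(round(momentary[-1], 1), round(short_term[-1], 1))
```

At the end of the loud second the momentary reading has already reached the new level, while the short-term reading is still being held down by the two seconds of quiet material left in its window.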
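The two-stage gate described above can be sketched in Python as follows. This is a simplified illustration, not a compliant meter: it assumes the loudness of each 400ms block has already been measured and is supplied as a list of LUFS values, and the function names are invented for the example. The averages are taken on the underlying energy values rather than on the dB figures themselves.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0  # first threshold, from the text
RELATIVE_GATE_LU = -10.0    # second threshold, relative to the first average


def mean_loudness(block_lufs):
    """Average a list of block loudness values on an energy basis."""
    mean_power = sum(10 ** (b / 10) for b in block_lufs) / len(block_lufs)
    return 10 * math.log10(mean_power)


def integrated_loudness(block_lufs):
    """Two-stage gated average of pre-measured 400 ms block loudness values."""
    # Stage 1: drop silence and very low-level noise.
    audible = [b for b in block_lufs if b > ABSOLUTE_GATE_LUFS]
    if not audible:
        return float("-inf")
    # Stage 2: drop anything more than 10 LU below the first average.
    threshold = mean_loudness(audible) + RELATIVE_GATE_LU
    foreground = [b for b in audible if b > threshold]
    return mean_loudness(foreground)


# Hypothetical programme: dialogue, a quiet background bed, and digital silence.
blocks = [-23.0] * 60 + [-45.0] * 30 + [-90.0] * 10

ungated = mean_loudness(blocks)
gated = integrated_loudness(blocks)
print(round(ungated, 1), round(gated, 1))
```

In this made-up example the ungated average is pulled down by a couple of LU, whereas the gated figure reports the level of the dialogue alone.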
The gates prevent the meter from including lots of silence or background sound in the integrated loudness measurement. Otherwise, quite a few programmes would show a misleadingly low loudness level, be normalised too high before broadcast, and then the viewer would reach for the volume control to turn it down.
If we were to measure Radio 3 with a non-gated meter and say it was specified that Radio 3 must have a loudness range no greater than 30LU, then it would sound like Kerrang FM for most of the time. Use an R.128 gated meter and again stipulate a loudness range of 30LU and it would sound much as it does now. This is because the R.128 meter is effectively ignoring all the short quiet sections (as the ear does), which would be included in the averaging process of a non-gated meter. Compression would need to be applied to the quiet sections to lift them if a 30LU loudness range were specified. This compression would have to be very fast and of a high ratio hence the Kerrang FM analogy.
Gating allows for the very short silences that may be required to build drama in a programme, or that often occur in classical music. It also takes into account short-duration noises or speech during these “silent” periods.
The difference in loudness measurement between gated R.128-based meters and non-gated CALM-based meters is typically less than 2LU on long-term averages.
Dealing with peaks
More than ever we need to accurately measure the peak amplitude of the audio signal. The PPM doesn’t do this, and can read up to 8dB low on very fast transients such as harpsichords and sibilant speech. Conventional bar graph meters do not tell the whole truth either.
Early digital peak meters merely took each binary sample value in turn and used them to light the LEDs in a column style display. A peak-hold LED may also have been included to show the highest signal passed during the last few seconds. However, these meters were only 95% accurate. They do not give the whole story when audio is converted from digits back to analogue, when audio is changed through some process such as EQ or other effects, or when data rate is reduced for transmission/storage in lower quality formats.
Finding the true peak value created by these processes requires a meter that works at four times the sample rate. So for audio sampled and distributed at 48kHz, the meter will re-sample its input at 192kHz.
It has been found that common audio processes such as conversion from PCM to a compressed format such as MPEG-1 Layer 3 (MP3) or AAC can produce new peaks of 3 to 4 dB greater than those present in the input. This occurs because overshoots are created by distortions introduced by coarse quantisation in the coding process. Fast-rising waveforms that peak at or near zero dBFS (or even clip) will then be modified during processing or conversion to produce peaks greater than zero dBFS. Greater than zero dBFS is obviously not possible to represent in the digital domain, and a D-A converter might not have been designed with enough analogue headroom to reproduce the overshoots. Instead these samples will be clipped and produce distortion.
R.128 recommends a maximum true peak of -1dBTP during production. However, the BBC and the DPP are recommending a slightly lower true peak of -3dBTP to allow headroom for coding during transmission. Hence the need for the new true peak meter and its scale using the suffix dBTP.
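The reason for oversampling can be shown with a short Python sketch. The interpolation filter used here (a Hann-windowed sinc) is a stand-in chosen for the example and is not the filter a standards-compliant meter would use; the function names and the test signal are likewise assumptions.

```python
import math


def oversample(samples, factor=4, half_width=16):
    """Interpolate between samples with a Hann-windowed sinc (illustrative filter only)."""
    n = len(samples)
    out = []
    for i in range(n * factor):
        t = i / factor              # position measured in original sample periods
        centre = math.floor(t)
        acc = 0.0
        for k in range(centre - half_width + 1, centre + half_width + 1):
            if 0 <= k < n:
                x = t - k
                sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
                window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
                acc += samples[k] * sinc * window
        out.append(acc)
    return out


def to_db(value):
    return 20 * math.log10(value) if value > 0 else float("-inf")


def sample_peak_db(samples):
    """What an early sample-by-sample peak meter would show."""
    return to_db(max(abs(s) for s in samples))


def true_peak_db(samples, factor=4):
    """Peak of the signal after re-sampling at `factor` times the original rate."""
    return to_db(max(abs(s) for s in oversample(samples, factor)))


# A tone at a quarter of the sample rate, phased so that every sample misses the
# crest of the waveform, with short fades in and out to avoid abrupt edges.
N, FADE = 256, 32
tone = [min(1.0, n / FADE, (N - 1 - n) / FADE) * math.sin(math.pi / 2 * n + math.pi / 4)
        for n in range(N)]

print(round(sample_peak_db(tone), 2), round(true_peak_db(tone), 2))
```

The sample peak reads about 3dB below full scale, while the oversampled reading should come out close to 0dB, which is the level the waveform actually reaches between samples once it is converted back to analogue.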
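Taken together, the loudness target and the true peak limit amount to a simple delivery check, sketched below in Python. The -23LUFS and -3dBTP figures come from the text; the tolerance around the target is not stated here, so the 0.5LU value is an assumption for illustration, as are the function and variable names.

```python
TARGET_LUFS = -23.0         # R.128 target loudness quoted in the text
MAX_TRUE_PEAK_DBTP = -3.0   # BBC/DPP recommendation quoted in the text
TOLERANCE_LU = 0.5          # ASSUMPTION: the text gives no tolerance figure


def check_programme(integrated_lufs, true_peak_dbtp, tolerance_lu=TOLERANCE_LU):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    offset_lu = integrated_lufs - TARGET_LUFS
    if abs(offset_lu) > tolerance_lu:
        problems.append(f"integrated loudness is {offset_lu:+.1f} LU from target")
    if true_peak_dbtp > MAX_TRUE_PEAK_DBTP:
        problems.append(
            f"true peak {true_peak_dbtp:.1f} dBTP exceeds {MAX_TRUE_PEAK_DBTP:.1f} dBTP"
        )
    return problems


print(check_programme(-23.2, -4.5))  # []
print(check_programme(-19.0, -0.8))  # two problems reported
```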
The EBU does not describe a meter display fully, although the EBU does specify ranges for scales, integration times, and that units must be displayed. It is up to manufacturers to determine what the meters look like given the customer’s requirements, so there have been various display types developed. A simple example can be found within the free Broadcast Audio Production Tools, as seen below. This runs on PCs and is currently being ported across to the Raspberry Pi.
This display looks like the PPM but has a much slower needle; it can work at faster 4-second averaging, and also at the much longer 16-second averaging, where the needle takes a very long time to react. It does not perform gating so it is not strictly speaking an R.128 meter; however, it is very close when placed on the longer integration times.
As mentioned earlier, we also need to monitor the true peak of the signal, and the BAP tool is an excellent tool if you are 100% confident that the audio contains no peaks near -3dBTP. One Danish company make what they term a RADAR display, shown below.
The radar graph in the centre shows the loudness contour or history for the last 4 minutes, while the bar graph meters on the right show the true peak of the input channels (surround sound in this example). The loudness range, bottom left on the display, is not essential but very useful. This can be considered as the difference in LU between the highest and lowest loudness readings. There is a bit more to this reading which involves upper and lower limits. Values beyond these limits are ignored, as is the number of times in a programme these limits are exceeded.
Another Danish-produced loudness meter is a small unit which shows virtually the same information as the above radar display, just in a different format.
Finally, below is a proposed BBC design which looks quite different.
The blue block that is shown above the normal PPM scale responds to an R.128 measurement driver showing short term integration. When this blue indicator is above the reading of PPM4, this equates to -23LUFS or 0LU. The PPM needles still behave in the same way as every other PPM, and should be used by less experienced operators as a ‘confidence check’ to reassure them that the audio is present. The distance between 3 and 4 on the PPM scale still equals 4 dB for the PPM needles. However, for the blue block, this same gap equates to 2LU. This is the same for the 4 to 5 gap. So, when the white indicator strip within the blue block hovers slowly above 4 on the PPM scale, then the programme’s loudness equals 0LU (-23LUFS). If it moves either side of the 4 it is not necessarily a problem, as quite a wide variation in short term loudness is fine. It is the Integrated loudness of the whole programme that must hit the target of 0LU.
The pink block is the true peak meter, which again operates over a different electrical scale compared to the PPM. Here, the PPM 6 marker will equate to -3dBTP. The pink block moves every time the previous true peak value is exceeded, so is stationary for most of the time. If the pink block goes above 6 then the programme is likely to produce clipping once transmitted and returned to analogue in the home.
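The relationship between the PPM markings and the blue block can be written as a small mapping, sketched here in Python. The text gives the 2LU step only for the 3 to 4 and 4 to 5 gaps, so applying the same step across the rest of the scale is an assumption, and the function names are invented for the example.

```python
PPM_MARK_AT_TARGET = 4     # PPM 4 lines up with 0 LU on the blue block
LU_PER_PPM_DIVISION = 2.0  # stated for the 3-4 and 4-5 gaps; ASSUMED elsewhere
TARGET_LUFS = -23.0


def blue_block_lu(ppm_mark):
    """Loudness in LU indicated when the blue block sits at a given PPM mark."""
    return (ppm_mark - PPM_MARK_AT_TARGET) * LU_PER_PPM_DIVISION


def blue_block_lufs(ppm_mark):
    """The same reading expressed in LUFS."""
    return TARGET_LUFS + blue_block_lu(ppm_mark)


print(blue_block_lu(5))    # 2.0
print(blue_block_lufs(3))  # -25.0
```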
The advantage of this display is the lack of numbers and clutter, but it does take training and some familiarisation.