Allowing young children to explore natural history content by making animal sounds
What we've done
We developed prototype software for a user-experience concept which allows young children to explore online content by making sounds with their voice.
Why it matters
Very young children are unable to navigate BBC content online without the help of adults. Speech recognition technologies are improving and can provide an accessible control interface for those who cannot use a keyboard, but these systems do not work well for young children. Roar To Explore allows a child to find information about an animal by making the sound of that animal. For example, to get pictures and videos of lions, the child would roar into the microphone.
The objective was to build a prototype that proved the technical concept and gave an idea of the user experience.
How it works
Software was written to classify a sound recording of a child making an animal noise. A library of example animal sounds made by children was created, with each recording labelled with the correct animal. The library was then analysed using VAMP audio analysis plug-ins to extract a set of features describing the range of sounds to expect for each animal. Each recording was represented by a Gaussian model of the distribution of its Mel-frequency cepstral coefficients (MFCCs). Using this data, a support vector machine (SVM) was trained to predict which animal is being impersonated. During the project we compared a model trained on data from a single user with a generic model trained on data from many users. We also considered a variety of audio signal features and classification algorithms, as well as the effect of choosing different sets of animals on the performance of the software.
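The pipeline described above can be illustrated with a minimal Python sketch. It is not the project's code and rests on several assumptions: the per-frame feature vectors are synthetic stand-ins for MFCCs (the real ones came from VAMP plug-ins), the Gaussian model is reduced to a per-coefficient mean and variance, and the SVM is a simple linear one-vs-rest classifier trained by subgradient descent, because the kernel and training method used in the project are not stated here. The animal names, cluster centres, and all function names are hypothetical.

```python
import random
from statistics import fmean, pvariance


def gaussian_summary(frames):
    """Summarise one recording (a list of per-frame feature vectors) as a
    fixed-length vector: per-coefficient means followed by variances.
    Assumption: a diagonal Gaussian stands in for the Gaussian model."""
    columns = list(zip(*frames))
    return [fmean(c) for c in columns] + [pvariance(c) for c in columns]


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Binary linear SVM (labels +1 / -1) trained by batch subgradient
    descent on the L2-regularised hinge loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [lam * wi for wi in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(X, labels):
    """Train one binary SVM per animal: that animal against all others."""
    return {
        animal: train_linear_svm(X, [1 if l == animal else -1 for l in labels])
        for animal in sorted(set(labels))
    }


def predict(models, x):
    """Return the animal whose SVM gives the highest decision score."""
    def score(animal):
        w, b = models[animal]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)


def fake_recording(rng, centre, n_frames=40):
    """Synthetic stand-in for the MFCC frames of one recording."""
    return [[rng.gauss(c, 0.3) for c in centre] for _ in range(n_frames)]


# Hypothetical animals with made-up, well-separated feature centres.
CENTRES = {
    "lion": [3.0, 0.0, 0.0],
    "cow": [0.0, 3.0, 0.0],
    "snake": [0.0, 0.0, 3.0],
}

rng = random.Random(0)
train_X, train_labels = [], []
for animal, centre in CENTRES.items():
    for _ in range(6):  # six labelled example recordings per animal
        train_X.append(gaussian_summary(fake_recording(rng, centre)))
        train_labels.append(animal)

models = train_one_vs_rest(train_X, train_labels)

# Classify a new, unlabelled recording.
guess = predict(models, gaussian_summary(fake_recording(rng, CENTRES["cow"])))
print(guess)
```

In this sketch, comparing a single-user model with a generic one would amount to building `train_X` from one child's recordings or from many children's recordings and comparing accuracy on held-out examples.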
A prototype web application and mobile application were built to demonstrate the technical concept and stimulate ideas about the user experience. Several algorithms were tested objectively and part of the work was published in a paper at the 132nd Convention of the Audio Engineering Society.
This project is part of the Immersive and Interactive Content section